
Running Linux, 5th Edition (2009)

Part I. Enjoying and Being Productive on Linux

This part of the book introduces Linux and brings you to the point where you can do all the standard activities people do on other systems: emailing, web surfing, playing games, watching videos, and so on.

Chapter 2 is worth reading even if you plan to install Linux from an easy-to-use distribution. Fundamental considerations, such as how much disk space to devote to different parts of your system, indicate that some planning lies behind every installation.

The vast majority of Linux installations go well and make the features discussed in this part of the book available to system users. If you have trouble, though, the more advanced material in other parts of the book can help you, along with online documentation and more specialized texts.

Chapter 1: Introduction to Linux

Chapter 2: Preinstallation and Installation

Chapter 3: Desktop Environments

Chapter 4: Basic Unix Commands and Concepts

Chapter 5: Web Browsers and Instant Messaging

Chapter 6: Electronic Mail Clients

Chapter 7: Games

Chapter 8: Office Suites and Personal Productivity

Chapter 9: Multimedia

Chapter 1. Introduction to Linux

Welcome to Running Linux, Version 5! When we wrote the first edition of this book, Linux had barely arrived on the scene. Our task seemed simple: help readers learn the basics of a new operating system that required a pretty fixed and predictable set of tasks. Few if any observers expected Linux would become a best-of-breed operating system, supported by the vast majority of hardware and software manufacturers on the planet. Who would have known that Linux would grow from a small user base of 30,000 people in 1995 to hundreds of millions only 10 years later? People use Linux everywhere on the planet and in some cases in outer space and under the ocean.

To the casual observer, Linux looks like a fairly simple personal computer desktop built on the same chassis as any IBM PC. People use Linux to browse the Internet, exchange email, listen to music, watch videos, and instant message their friends and coworkers. Students and office workers create documents with word processors, perform numerous tasks with spreadsheet programs, and make slide presentations.

The same Linux operating system also drives sonar arrays in nuclear submarines, indexes every document on the Internet, unifies large corporate data centers, runs nearly 70% of all web sites in the world, records your television programs, works in your cellular phone, and runs the switches that allow you to connect with your friends and family anywhere on the globe. Linux runs systems on the international space station as well as the shuttles that take astronauts there. It protects you from spam and computer viruses on numerous routers and back-end systems.

You can benefit directly from installing Linux on a system at home, at school, or in the office, and having all that power at your fingertips. Not only can you carry on everyday surfing and office work, but you can also learn how to write database queries, administer a web server, filter mail for spam and viruses, automate your environment through scripting languages, access web services, and participate in the myriad of other cutting-edge activities provided by modern computing.

How does Linux do all those things? Linux distributions harvest vast amounts of diverse technology, especially new and innovative developments in hardware. Developers have access to all the code that makes up the operating system. Although many people consider Linux the largest cooperative software development project in human history, Linux developers don't even need to know each other. Anyone who wants to write a software application need only download the Linux code or visit its documentation site. If you started counting the people who have contributed to the development of Linux and its associated projects, you would find hundreds of thousands of individuals.

Linux and open source software developers come from many walks of life. Major computer vendors such as IBM, HP, Novell, Red Hat, Sun, Dell, and others pay portions of their staffs to work on Linux. Universities around the globe sponsor projects and foundations that contribute to Linux. The U.S. Department of Defense, NASA, and the National Security Agency have paid for numerous pieces of the Linux operating system. Developing countries such as China, Brazil, Malaysia, South Africa, and Vietnam, to mention a few, have added to the Linux base. Industrialized nations such as Germany, Australia, Japan, and the United Kingdom have also made their presence felt. But in the very midst of those giants, many individuals such as you and me have also contributed to Linux.

During the 1990s, Linux generated more excitement in the computer field than any other development since the advent of microprocessor technology. Linux rejuvenated a dying technology sector following the fall of the dot-com boom in the spring of 2001. Today, Linux has surpassed the expectations of informed observers worldwide, including the authors of this book.

Early on, Linux inspired and captured the loyalty of its users. Technologists interested in the server side of the Internet needed to become familiar with the operating systems that ran web sites, domain name services, and email. Traditional software manufacturers priced their systems out of the range of those wanting to gain webmaster-type skills. Many people viewed Linux as a godsend because you could download it for free and gain the skills necessary to become a webmaster or system administrator while working on relatively low-cost hardware.

Originally, people saw Linux as simply an operating system kernel, offering the basic services of process scheduling, virtual memory, file management, and handling of hardware peripherals such as hard drives, DVDs, printers, terminals, and so forth. Other Internet operating systems belonged to the Unix family, which became available for commercial sale only after the breakup of AT&T and its Bell operating companies.

To skirt the legal issues surrounding AT&T's Unix, the Free Software Foundation (FSF) created a plethora of applications that performed many of the functions of basic Unix while using totally original FSF code instead of code produced by Bell Labs. This collection of FSF software was called GNU. To become a complete operating system, however, FSF needed a kernel. Although their own efforts in that area stalled, an operating system fitting the bill arose unexpectedly from efforts by a student at the University of Helsinki in Finland: Linus Torvalds.

People now use the term "Linux" to refer to the complete system—the kernel along with the many applications that it runs: a complete development and work environment including compilers, editors, graphical interfaces, text processors, games, and more. FSF proponents ask that this broader collection of software be known as "GNU/Linux."

About This Book

This book provides an overview and guide to Linux as a desktop and a back-office system. We present information on topics to satisfy novices and wizards alike. This book should provide sufficient material for almost anyone to choose the type of installation they want and get the most out of it. Instead of covering many of the volatile technical details—those things that tend to change with Linux's rapid development—we give you the information that helps you over the bumps as you take your first steps with popular distributions, as well as background you will need if you plan to go on to more advanced Linux topics such as web services, federated identity management, high-performance computing, and so on.

We geared this book for those people who want to understand the power that Linux can provide. Rather than provide minimal information, we help you see how the different parts of the Linux system work, so you can customize, configure, and troubleshoot the system on your own. Linux is not difficult to install and use. Many people consider it easier and faster to set up than Microsoft Windows. However, as with any commercial operating system, some black magic exists, and you will find this book useful if you plan to go beyond desktop Linux and use web services or network management services.

In this book, we cover the following topics:

§ The design and philosophy of the Linux operating system, and what it can do for you.

§ Information on what you need to run Linux, including suggestions on hardware platforms and how to configure the operating system depending on its specified role (e.g., desktop, web server, database and/or application server).

§ How to obtain and install Linux. We cover the Red Hat, SUSE, and Debian distributions in more detail than others, but the information is useful in understanding just about any distribution.

§ An introduction, for new users, to the original Linux/Unix system philosophy, including the most important commands and concepts still in use.

§ Personal productivity through slick and powerful office suites, image manipulation, and financial accounting.

§ The care and feeding of the Linux system, including system administration and maintenance, upgrading the system, and how to fix things when they don't work.

§ Expanding the basic Linux system and desktop environments with power tools for the technically inclined.

§ The Linux programming environment. The tools of the trade for programming and developing software on the Linux system.

§ Using Linux for telecommunications and networking, including the basics of TCP/IP configuration, PPP for Internet connectivity over a modem, ISDN configuration, ADSL, cable, email, news, and web access—we even show how to configure a Linux system as a web and database server.

§ Linux for fun: audio, video, and games.

Many things exist that we'd love to show you how to do with Linux. Unfortunately, to cover them all, this book would be the size of the unabridged Oxford English Dictionary and would be impossible for anyone (let alone the authors) to maintain. Instead we've included the most salient and interesting aspects of the system and show you how to find out more.

Although much of the discussion in this book is not overly technical, you'll find it easier to navigate if you have some experience with the command line and the editing of simple text files. For those who don't have such experience, we have included a short tutorial in Chapter 4. Part II of the book is an exploration of system administration that can help even seasoned technicians run Linux in a server mode.

If you are new to Linux and want more system-oriented information, you'll want to pick up an additional guide to command-line basics. We don't dwell for long on the fundamentals, preferring instead to skip to the fun parts of the system. At any rate, although this book should be enough to get you functional and even seasoned in the use of Linux, you may have requirements that will take you into specialized areas. See Appendix A for a list of sources of information.

Who's Using Linux?

Application developers, system administrators, network providers, kernel hackers, students, and multimedia authors are just a few of the categories of people who find that Linux has a particular charm.

Programmers are increasingly using Linux because of its extensibility and low cost—they can pick up a complete programming environment for free and run it on inexpensive PC hardware—and because Linux offers a great development platform for portable programs. In addition to the original FSF tools, Linux can utilize a number of development environments that have surfaced over the last three years, such as Eclipse (http://eclipse.org). Eclipse is quite a phenomenon: a tribute to both the creativity of the open source community and the fertility of a collaboration between an open source community and a major vendor (Eclipse was originally developed and released by IBM). It is an open source community focused on providing an extensible development platform and application frameworks for building software.

Eclipse's tools and frameworks span the software development life cycle, including support for modeling; language development environments for Java, C/C++, and other languages; testing and performance; business intelligence; rich client applications; and embedded development. A large, vibrant ecosystem of major technology vendors, innovative startups, universities, research institutions, and individuals extends, complements, and supports the Eclipse platform.

Networking is one of Linux's strengths. Linux has been adopted by people who run large networks because of its simplicity of management, performance, and low cost. Many Internet sites make use of Linux to drive large web servers, e-commerce applications, search engines, and more. Linux is easy to merge into a corporate or academic network because it supports common networking standards. These include both old stand-bys, such as the Network File System (NFS) and Network Information Service (NIS), and more prominent systems used in modern businesses, such as Microsoft file sharing (CIFS and related protocols) and Lightweight Directory Access Protocol (LDAP). Linux makes it easy to share files, support remote logins, and run applications on other systems. A software suite called Samba allows a Linux machine to act as a Windows server in Active Directory environments. The combination of Linux and Samba for this purpose is faster (and less expensive) than running Windows Server 2003. In fact, given the ease with which Linux supports common networking activities—DHCP, the Domain Name System, Kerberos security, routing—it's hard to imagine a corporate networking task for which it's unsuited.

One of the most popular uses of Linux is in driving large enterprise applications, including web servers, databases, business-to-business systems, and e-commerce sites. Businesses have learned that Linux provides an inexpensive, efficient, and robust system capable of driving the most mission-critical applications.

As just one example among the many publicized each month, Cendant Travel Distribution Services put its Fares application on a Linux Enterprise Server with IBM xSeries and BladeCenter servers as the hardware platforms. The move reduced expenditures by 90% while achieving 99.999% availability and handling 300 to 400 transactions per second.

Linux's ease of customization—even down to the guts of the kernel—makes the system very attractive for companies that need to exercise control over the inner workings of the system. Linux supports a range of technologies that ensure timely disk access and resistance to failure, from RAID (a set of mechanisms that allow an array of disks to be treated as a single logical storage device) to the most sophisticated storage area networks. These greatly increase reliability and reduce the costs of meeting new regulatory demands that require the warehousing of data for as long as 30 years.

The combination of Linux, the Apache web server, the MySQL database engine, and the PHP scripting language is so common that it has its own acronym—LAMP. We cover LAMP in more detail in Chapter 25.

Kernel hackers were the first to come to Linux—in fact, the developers who helped Linus Torvalds create Linux are still a formidable community. The Linux kernel mailing lists see a great deal of activity, and they are the place to be if you want to stay on the bleeding edge of operating system design. If you're into tuning page replacement algorithms, twiddling network protocols, or optimizing buffer caches, Linux is a great choice. Linux is also good for learning about the internals of operating system design, and an increasing number of universities make use of Linux systems in advanced operating system courses.

Finally, Linux is becoming an exciting forum for multimedia because it's compatible with an enormous variety of hardware, including the majority of modern sound and video cards. Several programming environments, including the MESA 3D toolkit (a free OpenGL implementation), have been ported to Linux; OpenGL is introduced in "Introduction to OpenGL Programming" in Chapter 21. The GIMP (a free Adobe Photoshop work-alike) was originally developed under Linux, and is becoming the graphics manipulation and design tool of choice for many artists. Many movie production companies regularly use Linux as the workhorse for advanced special-effects rendering—the popular movies Titanic and The Matrix used "render farms" of Linux machines to do much of the heavy lifting.

Linux systems have traveled the high seas of the North Pacific, managing telecommunications and data analysis for oceanographic research vessels. Linux systems are used at research stations in Antarctica, and large "clusters" of Linux machines are used at many research facilities for complex scientific simulations ranging from star formation to earthquakes, and in Department of Energy laboratories helping to bring new sources of energy to everyone. On a more basic level, hospitals use Linux to maintain patient records and retrieve archives. The U.S. judiciary uses Linux to manage its entire infrastructure, from case management to accounting. Financial institutions use Linux for real-time trading of stocks, bonds, and other financial instruments. Linux has taken over the role that Unix used to play as the most reliable operating system.

System Features

Linux has surpassed the features found in implementations of Unix and Windows. With the changes offered by IBM's Power Architecture, for example, Linux provides functionality for commodity hardware normally only found on the most expensive mainframes. Additionally, the latest kernels include the structure of Security Enhanced Linux (SELinux) provided by the National Security Agency (http://www.nsa.gov/selinux). SELinux provides the most trusted computing environment available today.

Now add Linux's ability to provide virtualization at the kernel level. Through Xen (http://sourceforge.net/projects/xen), Linux can securely execute multiple virtual machines, each running its own operating system, on a single physical system. This allows enterprises to stop server sprawl and increase CPU utilization.

A Bag of Features

This section provides a nickel tour of Linux features.

Linux is a complete multitasking, multiuser operating system (as are all other versions of Unix). This means that many users can be logged onto the same machine at once, running multiple programs simultaneously. Linux also supports multiprocessor systems (such as dual-Pentium motherboards), with support for up to 32 processors in a system,[*] which is great for high-performance servers and scientific applications.

The Linux system is mostly compatible with a number of Unix standards (inasmuch as Unix has standards) on the source level, including IEEE POSIX.1, System V, and BSD features. Linux was developed with source portability in mind: therefore, you will probably find features in the Linux system that are shared across multiple Unix implementations. A great deal of free Unix software available on the Internet and elsewhere compiles on Linux out of the box.

If you have some Unix background, you may be interested in some other specific internal features of Linux, including POSIX job control (used by shells such as the C shell, csh, and bash), pseudoterminals (pty devices), and support for national or customized keyboards using dynamically loadable keyboard drivers. Linux also supports virtual consoles, which allow you to switch between multiple login sessions from the system console in text mode. Users of the screen program will find the Linux virtual console implementation familiar (although nearly all users make use of a GUI desktop instead).

Linux can quite happily coexist on a system that has other operating systems installed, such as Windows 95/98, Windows NT/2000/XP, Mac OS, and Unix-like operating systems such as the variants of BSD. The Linux bootloader (LILO) and the GRand Unified Bootloader (GRUB) allow you to select which operating system to start at boot time, and Linux is compatible with other bootloaders as well (such as the one found in Windows XP).

Linux can run on a wide range of CPU architectures, including the Intel x86 (the whole Pentium line), Itanium, SPARC/UltraSPARC, AMD 64 ("Hammer"), ARM, PA-RISC, Alpha, PowerPC, MIPS, m68k, and IBM 390 and zSeries mainframes. Linux has also been ported to a number of embedded processors, and stripped-down versions have been built for various PDAs, including the PalmPilot and Compaq iPaq. In the other direction, Linux is being considered for top-of-the-line computers as well. Hewlett-Packard has a supercomputer with Linux as the operating system. A large number of scalable clusters—supercomputers built from arrays of PCs—run Linux as well.

Linux supports various filesystem types for storing data. Some filesystems, such as the Second Extended Filesystem (ext2fs), have been developed specifically for Linux. Other Unix filesystem types, such as the Minix-1 and Xenix filesystems, are also supported. The Windows NTFS, VFAT (Windows 95/98), and FAT (MS-DOS) filesystems have been implemented as well, allowing you to access Windows files directly. Support is included for Macintosh, OS/2, and Amiga filesystems as well. The ISO 9660 CD-ROM filesystem type, which reads all standard formats of CD-ROMs, is also supported. We talk more about filesystems in Chapter 2 and Chapter 10.

Networking support is one of the greatest strengths of Linux, in terms of both functionality and performance. Linux provides a complete implementation of TCP/IP networking. This includes device drivers for many popular Ethernet cards, PPP and SLIP (allowing you to access a TCP/IP network via a serial connection or modem), Parallel Line Internet Protocol (PLIP), and ADSL. Linux also supports the modern IPv6 protocol suite, and many other protocols, including DHCP, Appletalk, IRDA, DECnet, and even AX.25 for packet radio networks. The complete range of TCP/IP clients and services is supported, such as FTP, Telnet, NNTP, and Simple Mail Transfer Protocol (SMTP), the Sun RPC protocols allowing NFS and NIS, and the Microsoft protocols allowing participation in a Microsoft domain. The Linux kernel includes complete network firewall support, allowing any Linux machine to screen network packets and prevent unauthorized access to an intranet, for example.

It is widely held that networking performance under Linux is superior to that of other operating systems. We talk more about networking in Chapter 13 and Part IV.

Kernel

The kernel is the guts of the operating system itself; it's the code that controls the interface between user programs and hardware devices, the scheduling of processes to achieve multitasking, and many other aspects of the system. The kernel is not a separate process running on the system. Instead, you can think of the kernel as a set of routines, constantly in memory, to which every process has access. Kernel routines can be called in a number of ways. One direct method to utilize the kernel is for a process to execute a system call, which is a function that causes the kernel to execute some code on behalf of the process. For example, the read system call will read data from a file descriptor. To the programmer, this looks like any other C function, but in actuality the code for read is contained within the kernel.
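As a concrete illustration, here is a minimal sketch in C of a program that invokes the read system call just described; the filename example.txt is only a placeholder for this illustration.

    /* Minimal sketch: reading a file through the read() system call.
     * The filename "example.txt" is just a placeholder. */
    #include <stdio.h>
    #include <fcntl.h>
    #include <unistd.h>

    int main(void)
    {
        char buf[128];
        int fd = open("example.txt", O_RDONLY);  /* system call: obtain a file descriptor */
        if (fd < 0) {
            perror("open");
            return 1;
        }

        ssize_t n = read(fd, buf, sizeof(buf));   /* system call: the kernel copies data into buf */
        if (n < 0) {
            perror("read");
            close(fd);
            return 1;
        }

        printf("read %zd bytes\n", n);            /* ordinary C library call */
        close(fd);                                /* system call: release the descriptor */
        return 0;
    }

To the calling program, open, read, and close look like ordinary C functions, but each one traps into the kernel, which performs the work on the process's behalf.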

The Linux kernel is known as a monolithic kernel, in that all core functions and device drivers are part of the kernel proper. Some operating systems employ a microkernel architecture whereby device drivers and other components (such as filesystems and memory management code) are not part of the kernel—rather, they are treated as independent services or regular user applications. There are advantages and disadvantages to both designs: the monolithic architecture is more common among Unix implementations and is the design employed by classic kernels such as those of System V and BSD. Linux does support loadable device drivers (which can be loaded and unloaded from memory through user commands); this is covered in Chapter 18.

The Linux kernel on Intel platforms is developed to use the special protected-mode features of the Intel x86 processors (starting with the 80386 and moving on up to the current Pentium 4). In particular, Linux makes use of the protected-mode descriptor-based memory management paradigm and many of the other advanced features of these processors. Anyone familiar with x86 protected-mode programming knows that this chip was designed for a multitasking system such as Unix (the x86 was actually inspired by Multics). Linux exploits this functionality.

Like most modern operating systems, Linux is a multiprocessor operating system: it supports systems with more than one CPU on the motherboard. This feature allows different programs to run on different CPUs at the same time (or "in parallel"). Linux also supports threads, a common programming technique that allows a single program to create multiple "threads of control" that share data in memory. Linux supports several kernel-level and user-level thread packages, and Linux's kernel threads run on multiple CPUs, taking advantage of true hardware parallelism. The Linux kernel threads package is compliant with the POSIX 1003.1c standard.
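The following is a minimal sketch of the POSIX threads interface mentioned above: two threads increment a shared counter, with a mutex serializing access to the shared data. The filename and the gcc invocation in the comment are only examples.

    /* Minimal sketch: two POSIX threads sharing data in memory.
     * Compile with something like: gcc threads.c -o threads -lpthread */
    #include <pthread.h>
    #include <stdio.h>

    static int counter = 0;                        /* data shared by both threads */
    static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

    static void *worker(void *arg)
    {
        int i;
        (void) arg;
        for (i = 0; i < 100000; i++) {
            pthread_mutex_lock(&lock);             /* serialize access to the shared counter */
            counter++;
            pthread_mutex_unlock(&lock);
        }
        return NULL;
    }

    int main(void)
    {
        pthread_t t1, t2;
        pthread_create(&t1, NULL, worker, NULL);   /* the threads may run on different CPUs */
        pthread_create(&t2, NULL, worker, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        printf("counter = %d\n", counter);         /* 200000 once both threads finish */
        return 0;
    }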

The Linux kernel supports demand-paged loaded executables. That is, only those segments of a program that are actually used are read into memory from disk. Also, if multiple instances of a program are running at once, only one copy of the program code will be in memory. Executables use dynamically linked shared libraries, meaning that executables share common library code in a single library file found on disk. This allows executable files to occupy much less space on disk. This also means that a single copy of the library code is held in memory at one time, thus reducing overall memory usage. There are also statically linked libraries for those who wish to maintain "complete" executables without the need for shared libraries to be in place. Because Linux shared libraries are dynamically linked at runtime, programmers can replace modules of the libraries with their own routines.
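As a small illustration of runtime dynamic linking, the sketch below loads the C math library by hand through the dl interface and resolves the cos symbol; the library name libm.so.6 is the usual one on glibc-based Linux systems, but it may differ on yours.

    /* Minimal sketch: loading a shared library at runtime with the dl interface.
     * Compile with something like: gcc dyn.c -o dyn -ldl */
    #include <stdio.h>
    #include <dlfcn.h>

    int main(void)
    {
        void *handle = dlopen("libm.so.6", RTLD_LAZY);   /* map the shared library into memory */
        if (!handle) {
            fprintf(stderr, "dlopen: %s\n", dlerror());
            return 1;
        }

        double (*cosine)(double) = (double (*)(double)) dlsym(handle, "cos");
        if (!cosine) {
            fprintf(stderr, "dlsym: %s\n", dlerror());
            dlclose(handle);
            return 1;
        }

        printf("cos(0.0) = %f\n", cosine(0.0));          /* call the symbol we just resolved */
        dlclose(handle);
        return 0;
    }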

In order to make the best use of the system's memory, Linux implements so-called virtual memory with disk paging. That is, a certain amount of swap space[*] can be allocated on disk. When applications require more physical memory than is actually installed in the machine, the kernel swaps inactive pages of memory out to disk. (A page is simply the unit of memory allocation used by the operating system; on most architectures, it's equivalent to 4 KB.) When those pages are accessed again, they are read from disk back into main memory. This feature allows the system to run larger applications and support more users at once. Of course, swap is no substitute for physical RAM; it's much slower to read pages from disk than from memory.
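You can ask the kernel for these values yourself. The short sketch below uses the standard sysconf() interface to print the page size and an estimate of installed physical memory (_SC_PHYS_PAGES is a common glibc extension rather than a strict POSIX requirement).

    /* Minimal sketch: querying the page size and physical memory via sysconf(). */
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        long page_size = sysconf(_SC_PAGESIZE);      /* unit of memory allocation, often 4 KB */
        long phys_pages = sysconf(_SC_PHYS_PAGES);   /* pages of physical RAM installed */

        printf("page size: %ld bytes\n", page_size);
        if (phys_pages > 0)
            printf("physical memory: about %ld MB\n",
                   page_size / 1024 * phys_pages / 1024);
        return 0;
    }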

The Linux kernel keeps portions of recently accessed files in memory, to avoid accessing the (relatively slow) disk any more than necessary. The kernel uses all the free memory in the system for caching disk accesses, so when the system is lightly loaded a large number of files can be accessed rapidly from memory. When user applications require a greater amount of physical memory, the size of the disk cache is reduced. In this way physical memory is never left unused.

To facilitate debugging, the Linux kernel generates a core dump of a program that performs an illegal operation, such as accessing an invalid memory location. The core dump, which appears as a file called core in the directory in which the program was running, allows the programmer to determine the cause of the crash. We talk about the use of core dumps for debugging in the section "Examining a Core File" in Chapter 21.
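For example, a deliberately broken program such as the following sketch triggers a segmentation fault; assuming core dumps are enabled in your shell (typically with ulimit -c unlimited), running it leaves a core file behind for the debugger to examine.

    /* Minimal sketch: a deliberate crash that produces a core dump
     * (assuming core dumps are enabled, e.g. "ulimit -c unlimited"). */
    #include <stdio.h>

    int main(void)
    {
        int *p = NULL;
        printf("about to dereference a null pointer...\n");
        *p = 42;        /* illegal memory access: the kernel sends SIGSEGV */
        return 0;       /* never reached; the resulting core file can be examined with gdb */
    }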

Commands and Shells

The most important utility to many users is the shell. The shell is a program that reads and executes commands from the user. In addition, many shells provide features such as job control (allowing the user to manage several running processes at once—not as Orwellian as it sounds), input and output redirection, and a command language for writing shell scripts. A shell script is a file containing a program in the shell command language, analogous to a "batch file" under Windows.

Many types of shells are available for Linux. The most important difference between shells is the command language. For example, the C shell (csh) uses a command language somewhat like the C programming language. The classic Bourne shell uses a different command language. One's choice of a shell is often based on the command language it provides. The shell that you use defines, to some extent, your working environment under Linux.

No matter what Unix shell you're accustomed to, some version of it has probably been ported to Linux. The most popular shell is the GNU Bourne Again Shell (bash), a Bourne shell variant. bash includes many advanced features, such as job control, command history, command and filename completion, an Emacs-like (or optionally, a vi-like) interface for editing the command line, and powerful extensions to the standard Bourne shell language. Another popular shell is tcsh, a version of the C shell with advanced functionality similar to that found in bash. Recently, zsh, with very advanced completion facilities, has found a lot of followers. Other shells include the Korn shell (ksh), BSD's ash, and rc, the Plan 9 shell.

What's so important about these basic utilities? Linux gives you the unique opportunity to tailor a custom system to your needs. For example, if you're the only person who uses your system, and you prefer to use the vi editor and the bash shell exclusively, there's no reason to install other editors or shells. The "do it yourself" attitude is prevalent among Linux hackers and users.

Text Processing and Word Processing

Almost every computer user has a need for some kind of document preparation system. (In fact, one of the authors has almost entirely forgotten how to write with pen and paper.) In the PC world, word processing is the norm: it involves editing and manipulating text (often in a "what you see is what you get" [WYSIWYG] environment) and producing printed copies of the text, complete with figures, tables, and other garnishes.

As you will see in this book, Linux supports attractive and full-featured WYSIWYG tools. In Chapter 8 we'll discuss OpenOffice (a free version of a proprietary product, StarOffice, released by Sun Microsystems when it bought the suite's manufacturer) and KOffice, both of which are tightly integrated suites that support word processing, spreadsheets, and other common office tasks. These don't support all the features of Microsoft Office, but by the same token, they have some valuable features that Microsoft Office lacks. If you want to run Microsoft Office, you can do so through Wine, which we mention later.

There is a role for other ways to create documents, though. The system configuration files you need to edit on Linux from time to time, as well as programming for application development, require the use of simple text processing. The most popular tools for creating such documents are vi and Emacs, described in detail in Chapter 19.

Text processing can also be used with separate formatting tools to create very readable and attractive documents. With a text processing system, the author enters text using a "typesetting language" that describes how the text should be formatted. Once the source text (in the typesetting language) is complete, a user formats the text with a separate program, which converts the source to a format suitable for printing. This is somewhat analogous to programming in a language such as C, and "compiling" the document into a printable form.

The most famous text formatting language is HTML, the markup language used by virtually every page on the World Wide Web. Another popular text processing language is DocBook XML, a kind of industry-standard set of tags for marking up technical documentation, which is also used by the Linux Documentation Project (to be discussed later in this chapter).

We'll look at several text formatting systems in Chapter 20, Text Processing: TeX (developed by Donald Knuth of computer science fame) and its dialect LaTeX; groff, the GNU version of the classic troff text formatter originally developed at Bell Labs; Texinfo (an extension to TeX used for software documentation by the Free Software Foundation); and DocBook.

Commercial Applications

In addition to the more than fifteen hundred Linux applications maintained by Linux distributors such as Debian, a groundswell of support exists from commercial application developers for Linux. These products include office productivity suites, word processors, scientific applications, network administration utilities, ERP packages such as Oracle Financials and SAP, and large-scale database engines. Linux has become a major force in the commercial software market, so you may be surprised to find how many popular commercial applications are available for Linux. We can't possibly discuss all of them here, so we'll only touch on the most popular applications and briefly mention some of the others.

Oracle, IBM, Informix, Sybase, and Interbase have released commercial database engines for Linux. Many of the Linux database products have demonstrated better performance than their counterparts running on Windows servers.

One very popular database for Linux is MySQL, a free and easy-to-use database engine. Because MySQL is easy to install, configure, and use, it has rapidly become the database engine of choice for many applications that can forego the complexity of the various proprietary engines. Furthermore, even though it's free software, MySQL is supported professionally by the company that developed it, MySQL AB. We describe the basic use of MySQL in Chapter 25.

MySQL does not include some of the more advanced features of the proprietary databases, however. Some database users prefer the open source database PostgreSQL, and Red Hat features it in some of its products. On the other hand, MySQL is catching up really quickly; the next version will contain support for distributed databases, for example.

A wide range of enterprise applications is available for Linux in addition to databases. Linux is one of the most popular platforms for Internet service hosting, so it is appropriate that high-end platforms for scalable web sites, including JBoss, BEA WebLogic, and IBM WebSphere, have been released for Linux. Commercial, high-performance Java Virtual Machines and other software are available from Sun, IBM, and other vendors. IBM has released the popular Lotus Domino messaging and web application server, as well as the WebSphere MQ (formerly MQSeries) messaging platform.

Scientists, engineers, and mathematicians will find that a range of popular commercial products are available for Linux, such as Maple, Mathematica, MATLAB, and Simulink. Other commercial applications for Linux include high-end CAD systems, network management tools, firewalls, and software development environments.

Programming Languages and Utilities

Linux provides a complete Unix programming environment, including all the standard libraries, programming tools, compilers, and debuggers that you would expect to find on other Unix systems. The most commonly used compiler on Linux is the GNU Compiler Collection, or gcc. gcc is capable of compiling C, C++, Objective C (another object-oriented dialect of C), Chill (a programming language mainly used for telecommunications), FORTRAN, and Java. Within the Unix software development world, applications and systems programming is usually done in C or C++, and gcc is one of the best C/C++ compilers around, supporting many advanced features and optimizations.

Java is an object-oriented programming language and runtime environment that supports a diverse range of applications such as web page applets, Internet-based distributed systems, database connectivity, and more. Java is fully supported under Linux. Several vendors and independent projects have released ports of the Java Development Kit for Linux, including Sun, IBM, and the Blackdown Project (which did one of the first ports of Java for Linux). Programs written for Java can be run on any system (regardless of CPU architecture or operating system) that supports the Java Virtual Machine. A number of Java "just in time" (or JIT ) compilers are available, and the IBM and Sun Java Development Kits (JDKs) for Linux come bundled with high-performance JIT compilers that perform as well as those found on Windows or other Unix systems.

Some of the most popular and interesting tools associated with Java are open source. These include Eclipse, an integrated development environment (IDE) that is extendable to almost anything through plugins; JBoss, an implementation of Java 2 Enterprise Edition (J2EE) that has actually gone through the expense of becoming certified after a complaint by Sun Microsystems; and Gluecode, another application platform company bought by IBM in May 2005.

gcc is also capable of compiling Java programs directly to executables, and includes limited support for the standard JDK libraries.

Besides C, C++, and Java, many other compiled and interpreted programming languages have been ported to Linux, such as Smalltalk, FORTRAN, Pascal, LISP, Scheme, and Ada. In addition, various assemblers for writing machine code are available. An important open source project sponsored by Novell has developed an environment called Mono that provides support for Microsoft's .NET environment on Unix and Linux systems. Perhaps the most important class of programming languages for Linux is the many scripting languages, including Perl (the script language to end all script languages), Python (the first scripting language to be designed as object-oriented from the ground up), and Ruby (a fiercely object-oriented scripting language that has been heralded as very good for rapid application development).

Linux systems make use of the advanced gdb debugger, which allows you to step through a program to find bugs or examine the cause for a crash using a core dump. gprof, a profiling utility, will give you performance statistics for your program, letting you know where your program is spending most of its time. The Emacs and vim text editors provide interactive editing and compilation environments for various programming languages. Other tools that are available for Linux include the GNU make build utility, used to manage compilation of large applications, as well as source-code control systems such as CVS and Subversion.

Linux is an ideal system for developing Unix applications. It provides a modern programming environment with all the bells and whistles, and many professional Unix programmers claim that Linux is their favorite operating system for development and debugging. Computer science students can use Linux to learn Unix programming and to explore other aspects of the system, such as kernel architecture. With Linux, not only do you have access to the complete set of libraries and programming utilities, but you also have the complete kernel and library source code at your fingertips. Chapter 20 of this book is devoted to the programming languages and tools available for Linux.

The X Window System

The X Window System is the standard GUI for Unix systems. It was originally developed at MIT in the 1980s with the goal of allowing applications to run across a range of Unix workstations from different vendors. X is a powerful graphical environment supporting many applications. Many X-specific applications have been written, such as games, graphics utilities, programming and documentation tools, and so on.

Unlike Microsoft Windows, the X Window System has built-in support for networked applications: for example, you can run an X application on a server machine and have its windows display on your desktop, over the network. Also, X is extremely customizable: you can easily tailor just about any aspect of the system to your liking. You can adjust the fonts, colors, window decorations, and icons for your personal taste. You can go so far as to configure keyboard macros to run new applications at a keystroke. It's even possible for X to emulate the Windows and Macintosh desktop environments, if you want to keep a familiar interface.

The X Window System is freely distributable. However, many commercial vendors have distributed proprietary enhancements to the original X software. The version of X available for Linux is known as X.org, which is a port of X11R6 (X Window System Version 11, Release 6) made freely distributable for PC-based Unix systems, such as Linux.[*] X.org supports a wide range of video hardware, including standard VGA and many accelerated video adapters. X.org is a complete distribution of the X software, containing the X server itself, many applications and utilities, programming libraries, and documentation. It comes bundled with nearly every Linux distribution.

The look and feel of the X interface are controlled to a large extent by the window manager. This friendly program is in charge of the placement of windows, the user interface for resizing, iconifying, and moving windows, the appearance of window frames, and so on.

The X distribution and the major Linux distributions also contain programming libraries and include files for those wily programmers who wish to develop X applications. All the standard fonts, bitmaps, manual pages, and documentation are included.

Chapter 16 discusses how to install and use the X Window System on your Linux machine.

KDE and GNOME

Although the X Window System provides a flexible windowing system, many users want a complete desktop environment, with a customizable look and feel for all windows and widgets (such as buttons and scrollbars), a simplified user interface, and advanced features such as the ability to "drag and drop" data from one application to another. The KDE and GNOME projects are separate efforts that are striving to provide such an advanced desktop environment for Linux. By building up a powerful suite of development tools, libraries, and applications that are integrated into the desktop environment, KDE and GNOME aim to usher in the next era of Linux desktop computing. In the spirit of the open source community, these projects work together to provide complete interoperability so that applications originating in one environment will work on the other. Both systems provide a rich GUI, window manager, utilities, and applications that rival or exceed the features of systems such as the Windows XP desktop.

With KDE and GNOME, even casual users and beginners will feel right at home with Linux. Most distributions automatically configure one of these desktop environments during installation, making it unnecessary to ever touch the text-only console interface.

Both KDE and GNOME aim to make the Linux environment more user-friendly, and each has its fans and partisans. We discuss both in Chapter 3. As with X, both KDE and GNOME provide open source libraries that let you write programs conforming to their behavior and their look and feel.

Networking

Linux boasts one of the most powerful and robust networking systems in the world—more and more people are finding that Linux makes an excellent choice as a network server. Linux supports the TCP/IP networking protocol suite that drives the entire Internet, as well as many other protocols, including IPv6 (a new version of the IP protocol for the next-generation Internet), and UUCP (used for communication between Unix machines over serial lines). With Linux, you can communicate with any computer on the Internet, using Ethernet (including Fast and Gigabit Ethernet), Token Ring, dial-up connection, wireless network, packet radio, serial line, ADSL, ISDN, ATM, IRDA, AppleTalk, IPX (Novell NetWare), and many other network technologies. The full range of Internet-based applications is available, including World Wide Web browsers, web servers, FTP, email, chat, news, ssh, Telnet, and more.

Most Linux users use either a dial-up or a DSL connection through an ISP to connect to the Internet from home. Linux supports the popular PPP and SLIP protocols, used by most ISPs for dial-in access. If you have a broadband connection, such as a T1 line, cable modem, DSL, or other service, Linux supports those technologies as well. You can even configure a Linux machine to act as a router and firewall for an entire network of computers, all connecting to the Internet through a single dial-up or broadband connection.

Linux supports a wide range of web browsers, including Mozilla (the open source spin-off of the Netscape browser), Konqueror (another open source browser packaged with KDE), and the text-based Lynx browser. The Emacs text editor even includes a small text-based web browser.

Linux also hosts a range of web servers. Linux played an important role in the emergence of the popular and free Apache web server. In fact, it's estimated that Apache running on Linux systems drives more web sites than any other platform in the world. Apache is easy to set up and use; we show you how in Chapter 22.

A full range of mail and news readers is available for Linux, such as MH, Elm, Pine, and mutt, as well as the mail/news readers included with the Mozilla web browser. Many of these are compatible with standard mail and news protocols such as IMAP and POP. Whatever your preference, you can configure your Linux system to send and receive electronic mail and news from all over the world.

A variety of other network services are available for Linux. Samba is a package that allows Linux machines to act as a Windows file and print server. NFS allows your system to share files seamlessly with other machines on the network. With NFS, remote files look to you as if they were located on your own system's drives. FTP allows you to transfer files to and from other machines on the network. Other networking features include NNTP-based electronic news systems such as C News and INN; the Sendmail, Postfix, and Exim mail transfer agents; ssh, telnet, and rsh, which allow you to log in and execute commands on other machines on the network; and finger, which allows you to get information on other Internet users. There are tons of TCP/IP-based applications and protocols out there.

If you have experience with TCP/IP applications on other systems, Linux will be familiar to you. The system provides a standard socket programming interface, so virtually any program that uses TCP/IP can be ported to Linux. The Linux X server also supports TCP/IP, allowing you to display applications running on other systems on your Linux display. Administration of Linux networking will be familiar to those coming from other Unix systems, as the configuration and monitoring tools are similar to their BSD counterparts.
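As a small illustration of that socket interface, here is a sketch of a TCP client that connects to a web server and prints the first part of its reply; the address 127.0.0.1 and port 80 are placeholders that assume a web server is running locally.

    /* Minimal sketch: the standard Berkeley socket interface on Linux.
     * Connects to 127.0.0.1:80 (placeholder) and sends a bare HTTP request. */
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>
    #include <arpa/inet.h>
    #include <sys/socket.h>

    int main(void)
    {
        int fd = socket(AF_INET, SOCK_STREAM, 0);          /* create a TCP socket */
        if (fd < 0) { perror("socket"); return 1; }

        struct sockaddr_in addr = { 0 };
        addr.sin_family = AF_INET;
        addr.sin_port = htons(80);                          /* port in network byte order */
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

        if (connect(fd, (struct sockaddr *) &addr, sizeof(addr)) < 0) {
            perror("connect");
            close(fd);
            return 1;
        }

        const char *req = "HEAD / HTTP/1.0\r\n\r\n";
        if (write(fd, req, strlen(req)) < 0)                /* sockets are file descriptors too */
            perror("write");

        char buf[512];
        ssize_t n = read(fd, buf, sizeof(buf) - 1);
        if (n > 0) {
            buf[n] = '\0';
            printf("%s", buf);                              /* first part of the server's reply */
        }
        close(fd);
        return 0;
    }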

In Chapter 13, we discuss the configuration and setup of TCP/IP, including PPP, for Linux. We also discuss configuration of web browsers, web servers, and mail software.

Laptop Support

Linux includes a number of laptop-specific features, such as PCMCIA (or "PC Card") support, APM and the newer ACPI, as well as the wireless networking built into Centrino laptops. The PCMCIA Tools package for Linux includes drivers for many PCMCIA devices, including modems, Ethernet cards, and SCSI adapters. APM allows the kernel to keep track of the laptop's battery power and perform certain actions (such as an automated shutdown) when power is low; it also allows the CPU to go into "low-power" mode when not in use. This is easy to configure as a kernel option. Various tools interact with APM, such as apm (which displays information on battery status) and apmd (which logs battery status and can be used to trigger power events). These should be included with most Linux distributions. ACPI has a similar purpose, but is newer and more featureful. With ACPI, you can even use the so-called "suspend to disk" facility, where the current state of the computer is written to your hard disk and the computer is turned off. You can then turn it on later and resume your work exactly where you left off. GUI tools such as kpowersave let you control this from a friendly graphical environment.

Interfacing with Windows

Various utilities exist to interface with the world of Windows and MS-DOS. The most well-known application is a project known as Wine—a platform for Microsoft Windows applications on the X Window System under Linux. Wine allows Microsoft Windows applications to run directly under Linux and other Intel-based operating systems. Wine is in a process of continual development, and now runs a wide variety of Windows software, including many desktop applications and games. We discuss Wine in Chapter 28.

Linux provides a seamless interface for transferring files between Linux and Windows systems. You can mount a Windows partition or floppy under Linux, and directly access Windows files as you would any others. In addition, there is the mtools package, which allows direct access to MS-DOS-formatted floppies, as well as htools, which does the same for Macintosh floppy disks.

Another legacy application is the Linux MS-DOS Emulator, or DOSEMU, which allows you to run many MS-DOS applications directly from Linux. Although MS-DOS-based applications are rapidly becoming a thing of the past, there are still a number of interesting MS-DOS tools and games that you might want to run under Linux. It's even possible to run the old Microsoft Windows 3.1 under DOSEMU.

Although Linux does not have complete support for emulating Windows and MS-DOS environments, you can easily run these other operating systems on the same machine with Linux, and choose which operating system to run when you boot the machine. Many distributions know how to preserve another operating system that's already installed when you add Linux to the computer, and set up a working LILO or GRUB bootloader to let you select between Linux, Windows, and other operating systems at boot time. In this book we'll show you how to set up the LILO bootloader, in case you need to do it yourself.

Another popular option is to run a system-level virtual machine, which literally allows you to run Linux and Windows at the same time. A virtual machine is a software application that emulates many of the hardware features of your system, tricking the operating system into believing that it is running on a physical computer. Using a virtual machine, you can boot Linux and then run Windows at the same time—with both Linux and Windows applications on your desktop at once. Alternatively, you can boot Windows and run Linux under the virtual machine. Although there is some performance loss when using virtual machines, many people are very happy employing them for casual use, such as running a Windows-based word processor within a Linux desktop. The most popular virtual machines are VMware (http://www.vmware.com), which is a commercial product, and Bochs (http://bochs.sourceforge.net), which is an open source project. We describe VMware in Chapter 28.

Finally, remote logins allow you to work on another system from your Linux system. Any two computers running the X Window System (mostly Linux, BSD, and Unix systems) can share work this way, with a user on one system running a program on another, displaying the graphical output locally, and entering commands from the local keyboard and mouse. RDP, an acronym that has been expanded to both Remote Desktop Protocol and Remote Display Protocol, allows a Linux system to run programs on remote Windows systems in the same way. A Virtual Network Computing (VNC) client and server perform the same task with even greater flexibility, letting different operating systems on different computers work together. In "Remote Desktop Access to Windows Programs" we show you how to set up these services, and in "FreeNX: Linux as a Remote Desktop Server" we discuss the FreeNX remote communication system, which allows the same transparent networking as X with a tremendous speed advantage. Both of these sections are in Chapter 28.

Other Applications

A host of miscellaneous applications are available for Linux, as one would expect from an operating system with such a diverse set of users. Linux's primary focus is currently personal Unix computing, but this is rapidly changing. Business and scientific software are expanding, and commercial software vendors have contributed a growing pool of applications.

The scientific community has wholly embraced Linux as the platform of choice for inexpensive numerical computing. A large number of scientific applications have been developed for Linux, including the popular technical tools MATLAB and Mathematica. A wide range of free packages is also available, including FELT (a finite-element analysis tool), Spice (a circuit design and analysis tool), and Khoros (an image/digital signal processing and visualization system). Many popular numerical computing libraries have been ported to Linux, including the LAPACK linear algebra library. There is also a Linux-optimized version of the BLAS code upon which LAPACK depends.

Linux is one of the most popular platforms for parallel computing using clusters , which are collections of inexpensive machines usually connected with a fast (gigabit-per-second or faster) network. The NASA Beowulf project first popularized the idea of tying a large number of Linux-based PCs into a massive supercomputer for scientific and numerical computing. Today, Linux-based clusters are the rule, rather than the exception, for many scientific applications. In fact, Linux clusters are finding their way into increasingly diverse applications—for example, the Google search engine runs on a cluster of Linux machines (over 250,000 of them in December 2004, according to an MIT paper)!

As with any operating system, Linux has its share of games. A number of popular commercial games have been released for Linux, including Quake, Quake II, Quake III Arena, Doom, SimCity 3000, Descent, and more. Most of the popular games support play over the Internet or a local network, and clones of other commercial games are popping up for Linux. There are also classic text-based dungeon games such as Nethack and Moria; MUDs (multiuser dungeons, which allow many users to interact in a text-based adventure) such as DikuMUD and TinyMUD; and a slew of free graphical games, such as xtetris, netrek, and Xboard (the X11 frontend to gnuchess).

For audiophiles, Linux has support for a wide range of sound hardware and related software, such as CDplayer (a program that can control a CD-ROM drive as a conventional CD player, surprisingly enough), MIDI sequencers and editors (allowing you to compose music for playback through a synthesizer or other MIDI-controlled instrument), and sound editors for digitized sounds. You can play your MP3 and OGG/Vorbis files on Linux, and with the tools in some distributions you can handle more proprietary formats as well.

Can't find the application you're looking for? A number of web sites provide comprehensive directories of Linux applications. The best known is Freshmeat (http://www.freshmeat.net); a couple others are listed in Appendix A. Take a look at these sites just to see the enormous amount of code that has been developed for Linux.

If you absolutely can't find what you need, you can always attempt to port the application from another platform to Linux. Or, if all else fails, you can write the application yourself. That's the spirit of free software—if you want something to be done right, do it yourself! While it's sometimes daunting to start a major software project on your own, many people find that if they can release an early version of the software to the public, many helpers pop up in the free software community to carry on the project.


[*] On a 32-bit architecture; on a 64-bit architecture, up to 64 CPUs are supported, and patches are available that support up to 256 CPUs.

[*] If you are a real OS geek, you will note that swap space is inappropriately named: entire processes are not swapped, but rather individual pages of memory are paged out. Although in some cases entire processes will be swapped out, this is not generally the case. The term "swap space" originates from the early days of Unix and should technically be called "paging space."

[*] X.org actually derives from another PC-based version of the X Window System, XFree86. Political quarrels that we do not want to go into here have led to a split into XFree86 and X.org; most Linux distributions these days ship the X.org version. This is not relevant for you, though, unless you plan to help with the continued development of the X Window System.

About Linux's Copyright

Linux is covered by what is known as the GNU General Public License, or GPL. The GPL, which is sometimes referred to as a "copyleft" license, was developed for the GNU project by the Free Software Foundation. It makes a number of provisions for the distribution and modification of "free software." "Free," in this sense, refers to freedom, not just cost. The GPL has always been subject to misinterpretation, and we hope that this summary will help you to understand the extent and goals of the GPL and its effect on Linux. A complete copy of the GPL is available at http://www.gnu.org/copyleft/gpl.html.

Originally, Linus Torvalds released Linux under a license more restrictive than the GPL, which allowed the software to be freely distributed and modified, but prevented any money from changing hands for its distribution and use. The GPL, by contrast, allows people to sell and make a profit from free software, but doesn't allow them to restrict others' right to distribute the software in any way.

A Summary of Free Software Licensing

First, we should explain that "free software" covered by the GPL is not in the public domain. Public domain software is software that is not copyrighted and is literally owned by the public. Software covered by the GPL, on the other hand, is copyrighted to the author or authors. This means that the software is protected by standard international copyright laws and that the author of the software is legally defined. Just because the software may be freely distributed doesn't mean it is in the public domain.

GPL-licensed software is also not "shareware." Generally, shareware software is owned and copyrighted by the author, but the author requires users to send in money for its use after distribution. On the other hand, software covered by the GPL may be distributed and used free of charge.

The GPL also allows people to take and modify free software, and distribute their own versions of the software. However, any derived works from GPL software must also be covered by the GPL. In other words, a company could not take Linux, modify it, and sell it under a restrictive license. If any software is derived from Linux, that software must be covered by the GPL as well.

People and organizations can distribute GPL software for a fee and can even make a profit from its sale and distribution. However, in selling GPL software, the distributor can't take those rights away from the purchaser; that is, if you purchase GPL software from some source, you may distribute the software for free or sell it yourself as well.

This might sound like a contradiction at first. Why sell software for profit when the GPL allows anyone to obtain it for free? When a company bundles a large amount of free software on a CD-ROM and distributes it, it needs to charge for the overhead of producing and distributing the CD-ROM, and it may even decide to make profits from the sale of the software. This is allowed by the GPL.

Organizations that sell free software must follow certain restrictions set forth in the GPL. First, they can't restrict the rights of users who purchase the software. This means that if you buy a CD-ROM of GPL software, you can copy and distribute that CD-ROM free of charge, or you can resell it yourself. Second, distributors must make it obvious to users that the software is indeed covered by the GPL. Third, distributors must provide, free of charge, the complete source code for the software being distributed, or they must point their customers on demand to where the software can be downloaded. This will allow anyone who purchases GPL software to make modifications to that software.

Allowing a company to distribute and sell free software is a very good thing. Not everyone has access to the Internet to download software, such as Linux, for free. The GPL allows companies to sell and distribute software to those people who do not have free (cost-wise) access to the software. For example, many organizations sell Linux on floppy, tape, or CD-ROM via mail order, and make a profit from these sales. The developers of Linux may never see any of this profit; that is the understanding that is reached between the developer and the distributor when software is licensed by the GPL. In other words, Linus knew that companies might wish to sell Linux and that he might not see a penny of the profits from those sales. (If Linus isn't rich, at least he's famous!)

In the free-software world, the important issue is not money. The goal of free software is always to develop and distribute fantastic software and to allow anyone to obtain and use it. In the next section, we'll discuss how this applies to the development of Linux.

SCO and Other Challenges

In March 2003, a company called SCO—which had a tortuous history of mergers and divestitures that involved purchasing some rights to Unix—claimed that Linux contained some source code to which SCO had rights, and therefore that SCO had rights to Linux as well. The company started by suing IBM, a bold choice (to say the least) because few companies in the computer field could be more familiar with litigation or be better prepared for it. In any case, SCO made it clear that their complaints went far beyond IBM; indeed, that they were owed something by anyone using Linux. In December 2003, according to news reports, SCO even sent letters to a large number of Fortune 1000 companies advising them to send licensing fees to SCO.

Red Hat and other companies joined the fray. Novell, which by then had bought SUSE and become a solid member of the Linux community, added some zest to the already indigestible controversy by citing its own rights to Unix. Over time the whole affair became a tangle of lawsuits, countersuits, motions to dismiss, public relations grandstanding, and general mudslinging.

As of this writing, the SCO case is unresolved, but the results seem salutary. Few observers believe Linux is in trouble; rather, it is SCO that is financially threatened. The network of companies, individuals, and key organizations that support Linux has handled the challenge well. Some major vendors strengthened their support for Linux by offering their customers indemnification. The next edition of this book, we hope, will contain little more than a footnote about the whole affair.

Finally, Linus Torvalds and the Open Source Development Labs (OSDL) have recognized that the old method of accepting code with no strings attached should be tightened. Starting in May 2004, anyone submitting code to the kernel has been asked to include their contact information and to declare informally that they have a right to the code they are submitting. The new system is lightweight and simple, but allows challenges (of which none have been received yet) to be tracked back to the people responsible for the code in question.
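
For illustration only (the contributor name, email address, and commit message below are invented), here is roughly what that informal declaration looks like in practice. It takes the form of a "Signed-off-by" line in the patch description, which can be typed by hand into an emailed patch or, with more recent tools, added automatically with git's -s/--signoff option:

    # Hypothetical patch; -s appends a Signed-off-by line asserting that the
    # contributor has the right to submit this code.
    git commit -s -m "net: fix hypothetical checksum bug"

    # The resulting commit message ends with a line such as:
    #   Signed-off-by: Jane Hacker <jane@example.org>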

Further copyright challenges to Linux are unlikely; patents, however, could be used against it. But every programmer and software company has to worry about software patents; Linux and free software are no more at risk than any other software. Although the workings of free software are entirely open to inspection, and therefore might be more tempting to target with a patent lawsuit, the only purpose of such a lawsuit would be to maliciously shut down a project, because free software cannot support license fees.

Open Source and the Philosophy of Linux

When new users encounter Linux, they often have a few misconceptions and false expectations of the system. Linux is a unique operating system, and it's important to understand its philosophy and design in order to use it effectively. At the center of the Linux philosophy is a concept that we now call open source software.

Open source is a term that applies to software for which the source code—the inner workings of the program—is freely available for anyone to download, modify, and redistribute. Software covered under the GNU GPL, described in the previous section, fits into the category of open source. Not surprisingly, though, so does software that uses copyright licenses that are similar, but not identical, to the GPL. For example, software that can be freely modified but that does not have the same strict requirements for redistribution as the GPL is also considered open source. Various licenses fit this category, including the BSD License and the Apache Software License.

The so-called open source and free software development models started with the Free Software Foundation and were popularized with Linux. They represent a totally different way of producing software that opens up every aspect of development, debugging, testing, and study to anyone with enough interest in doing so. Rather than relying upon a single corporation to develop and maintain a piece of software, open source allows the code to evolve, openly, in a community of developers and users who are motivated by a desire to create good software, rather than simply to make a profit.

O'Reilly has published two books, Open Sources 1.0 and Open Sources 2.0, that serve as good introductions to the open source development model. They're collections of essays about the open source process by leading developers (including Linus Torvalds and Richard Stallman). Another popular text on this topic—so often cited that it is considered nearly canonical—is The Cathedral and the Bazaar, by Eric S. Raymond.

Open source has received a lot of media attention, and some are calling the phenomenon the next wave in software development, which will sweep the old way of doing things under the carpet. It still remains to be seen whether that will happen, but there have been some encouraging events that make this outcome seem likely. For example, Netscape Corporation has released the code for its web browser as an open source project called Mozilla, and companies such as Sun Microsystems, IBM, and Apple have released certain products as open source in the hopes that they will flourish in a community-driven software development effort.

Linux is at the center of all this attention. To understand where the Linux development mentality comes from, however, it makes sense to take a look at how commercial software has traditionally been built.

Commercial software houses tend to base development on a rigorous policy of quality assurance , source and revision control systems, documentation, and bug reporting and resolution. Developers are not allowed to add features or to change key sections of code on a whim: they must validate the change as a response to a bug report and consequently "check in" all changes to the source control system so that the changes can be backed out if necessary. Each developer is assigned one or more parts of the system code, and only that developer may alter those sections of the code while it is "checked out."

Internally, the quality assurance department runs rigorous test suites (so-called regression tests) on each new pass of the operating system and reports any bugs. It's the responsibility of the developers to fix these bugs as reported. A complicated system of statistical analysis is employed to ensure that a certain percentage of bugs are fixed before the next release, and that the system as a whole passes certain release criteria.

In all, the process used by commercial software developers to maintain and support their code is very complicated, and quite reasonably so. The company must have quantitative proof that the next revision of the software is ready to be shipped. It's a big job to develop a commercial software system, often large enough to employ hundreds (if not thousands) of programmers, testers, documenters, and administrative personnel. Of course, no two commercial software vendors are alike, but you get the general picture. Smaller software houses, such as startups, tend to employ a scaled-down version of this style of development.

On the opposite end of the spectrum sits Linux, which is, and more than likely always will be, a hacker's operating system.[*] Although many open source projects have adopted elements of commercial software development techniques, such as source control and bug tracking systems, the collaborative and distributed nature of Linux's development is a radical departure from the traditional approach.

Recently, there has been a lot of talk about so-called agile development practices such as XP (Extreme Programming). Linux and open source adepts are often a bit surprised about this, since these "lightweight" software development methods have always been a central idea of open source development.

Linux is primarily developed as a group effort by volunteers on the Internet from all over the world. No single organization is responsible for developing the system. For the most part, the Linux community communicates via various mailing lists and web sites. A number of conventions have sprung up around the development effort: for example, programmers wanting to have their code included in the "official" kernel should mail it to Linus Torvalds. He will test the code and, as long as it doesn't break things or go against the overall design of the system, more than likely include it in the kernel. As Linux has grown, this job has become too large for Linus to do himself (plus, he has kids now), so other volunteers are responsible for testing and integrating code into certain aspects of the kernel, such as the network subsystem.

The system itself is designed with a very open-ended, feature-rich approach. A new version of the Linux kernel is typically released every few weeks (sometimes even more frequently than this). Of course, this is a very rough figure; it depends on several factors, including the number of bugs to be fixed, the amount of feedback from users testing prerelease versions of the code, and the amount of sleep that Linus has had that week.

Suffice it to say that not every single bug has been fixed and not every problem ironed out between releases. (Of course, this is always true of commercial software as well!) As long as the system appears to be free of critical or oft-manifesting bugs, it's considered "stable" and new revisions are released. The thrust behind Linux development is not an effort to release perfect, bug-free code; it's to develop a free implementation of Unix. Linux is for the developers, more than anyone else.

Anyone who has a new feature or software application to add to the system generally makes it available in an "alpha" stage—that is, a stage for testing by those brave users who want to bash out problems with the initial code. Because the Linux community is largely based on the Internet, alpha software is usually uploaded to one or more of the various Linux web sites (see the Appendix), and a message is posted to one of the Linux mailing lists about how to get and test the code. Users who download and test alpha software can then mail results, bug fixes, or questions to the author.

After the initial problems in the alpha code have been fixed, the code enters a "beta" stage, in which it's usually considered stable but not complete (that is, it works, but not all the features may be present). Otherwise, it may go directly to a "final" stage in which the software is considered complete and usable. For kernel code, once it's complete, the developer may ask Linus to include it in the standard kernel, or as an optional add-on feature to the kernel.

Keep in mind these are only conventions, not rules. Some people feel so confident with their software that they don't need to release an alpha or test version. It's always up to the developer to make these decisions.

What happened to regression testing and the rigorous quality process? It's been replaced by the philosophy of "release early and often." Real users are the best testers because they try out the software in a variety of environments and in a host of demanding real-life applications that can't be easily duplicated by any software quality assurance group. One of the best features of this development and release model is that bugs (and security flaws) are often found, reported, and fixed within hours, not days or weeks.

You might be amazed that such an unstructured system of volunteers programming and debugging a complete Unix system could get anything done at all. As it turns out, it's one of the most efficient and motivated development efforts ever employed. The entire Linux kernel was written from scratch, without employing any code from proprietary sources. A great deal of work was put forth by volunteers to port all the free software under the sun to the Linux system. Libraries were written and ported, filesystems developed, and hardware drivers written for many popular devices.

The Linux software is generally released as a distribution, which is a set of prepackaged software making up an entire system. It would be quite difficult for most users to build a complete system from the ground up, starting with the kernel, then adding utilities, and installing all necessary software by hand. Instead, there are a number of software distributions including everything you need to install and run a complete system. Again, there is no standard distribution; there are many, each with its own advantages and disadvantages. In this book, we describe how to install the Red Hat, SUSE, and Debian distributions, but this book can help you with any distribution you choose.

Despite the completeness of the Linux software, you still need a bit of Unix know-how to install and run a complete system. No distribution of Linux is completely bug-free, so you may be required to fix small problems by hand after installation. Although some readers might consider this a pain, a better way to think about it is as the "joy of Linux": having fun tinkering with, learning about, and fixing up your own system. It's this very attitude that distinguishes Linux enthusiasts from mere users. Linux can be a hobby, an adventure sport, or a lifestyle. (Just like snowboarders and mountain bikers, Linux geeks have their own lingo and style of dress—if you don't believe us, hang out at any Linux trade show!) Many new Linux users report having a great time learning about this new system, and find that Linux rekindles the fascination they had when first starting to experiment with computers.

Hints for Unix Novices

Installing and using your own Linux system doesn't require a great deal of background in Unix. In fact, many Unix novices successfully install Linux on their systems. This is a worthwhile learning experience, but keep in mind that it can be very frustrating to some. If you're lucky, you will be able to install and start using your Linux system without any Unix background. However, once you are ready to delve into the more complex tasks of running Linux—installing new software, recompiling the kernel, and so forth—having background knowledge in Unix is going to be a necessity. (Note, however, that many distributions of Linux are as easy to install and configure as Windows and certainly easier than Windows 2000 or XP.)

Fortunately, by running your own Linux system, you will be able to learn the essentials of Unix necessary to perform these tasks. This book contains a good deal of information to help you get started. Chapter 4 is a tutorial covering Unix basics, and Part II contains information on Linux system administration. You may wish to read these chapters before you attempt to install Linux at all; the information contained therein will prove to be invaluable should you run into problems.

Just remember that nobody can expect to go from being a Unix novice to a Unix system administrator overnight. A powerful and flexible computer system is never maintenance-free, so you will undoubtedly encounter hang-ups along the way. Treat this as an opportunity to learn more about Linux and Unix, and try not to get discouraged when things don't always go as expected!

Hints for Unix Gurus

Even those people with years of Unix programming and system administration experience may need assistance before they are able to pick up and install Linux. There are still aspects of the system Unix wizards need to be familiar with before diving in. For one thing, Linux is not a commercial Unix system. It doesn't attempt to uphold the same standards as other Unix systems you may have come across. But in some sense, Linux is redefining the Unix world by giving all other systems a run for their money. To be more specific, while stability is an important factor in the development of Linux, it's not the only factor.

More important, perhaps, is functionality. In many cases, new code will make it into the standard kernel even though it's still buggy and not functionally complete. The assumption is that it's more important to release code that users can test and use than delay a release until it's "complete." Nearly all open source software projects have an alpha release before they are completely tested. In this way, the open source community at large has a chance to work with the code, test it, and develop it further, while those who find the alpha code "good enough" for their needs can use it. Commercial Unix vendors rarely, if ever, release software in this manner.

Even if you're a Unix ultra-wizard who can disassemble Solaris kernels in your sleep and recode an AIX superblock with one hand tied behind your back, Linux might take some getting used to. The system is very modern and dynamic, with a new kernel release approximately every few months and new utilities constantly being released. One day your system may be completely up to date with the current trend, and the next day the same system is considered to be in the Stone Age.

With all of this dynamic activity, how can you expect to keep up with the ever-changing Linux world? For the most part, it's best to upgrade incrementally; that is, upgrade only those parts of the system that need upgrading, and then only when you think an upgrade is necessary. For example, if you never use Emacs, there is little reason to install every new release of Emacs on your system. Furthermore, even if you are an avid Emacs user, there is usually no reason to upgrade it unless the next release contains a feature you actually need. There is little or no reason to always be on top of the newest version of software.
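
As a minimal sketch of what upgrading incrementally can look like in practice (assuming a Debian-style or Red Hat-style package manager; the emacs package is just an example), you might update a single package rather than the whole system:

    # Debian/Ubuntu: refresh the package lists, then pull in just the new emacs
    apt-get update
    apt-get install emacs     # upgrades emacs if an older version is installed

    # Red Hat/Fedora: upgrade only the emacs package
    yum update emacs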

Keep in mind that Linux was developed by its users. This means, for the most part, that the hardware supported by Linux is that which users and developers actually have access to. As it turns out, most of the popular hardware and peripherals for 80x86 systems are supported (in fact, Linux probably supports more hardware than any commercial implementation of Unix). However, some of the more obscure and esoteric devices, as well as those with proprietary drivers for which the manufacturers do not easily make the specifications available, aren't supported yet. As time goes on, a wider range of hardware will be supported, so if your favorite devices aren't listed here, chances are that support for them is forthcoming.

Another drawback for hardware support under Linux is that many companies have decided to keep the hardware interface proprietary. The upshot of this is that volunteer Linux developers simply can't write drivers for those devices (if they could, those drivers would be owned by the company that owned the interface, which would violate the GPL). The companies that maintain proprietary interfaces write their own drivers for operating systems, such as Microsoft Windows; the end user (that's you) never needs to know about the interface. Unfortunately, this does not allow Linux developers to write drivers for those devices.

Little can be done about the situation. In some cases, programmers have attempted to write hackish drivers based on assumptions about the interface. In other cases, developers work with the company in question and attempt to obtain information about the device interface, with varying degrees of success.


[*] Our definition of "hacker" is a feverishly dedicated programmer—a person who enjoys exploiting computers and generally doing interesting things with them. This is in contrast to the common connotation of "hacker" as a computer wrongdoer or an outlaw.

Sources of Linux Information

As you have probably guessed, many sources of information about Linux are available, apart from this book.

Online Documents

If you have access to the Internet, you can get many Linux documents via web and anonymous FTP sites all over the world. If you do not have direct Internet access, these documents may still be available to you; many Linux distributions on CD-ROM contain all the documents mentioned here and are often available off the retail shelf.

A great number of web and FTP archive sites carry Linux software and related documents. Appendix A contains a listing of some of the Linux documents available via the Internet.

Examples of available online documents are the Linux FAQ, a collection of frequently asked questions about Linux; the Linux HOWTO documents, each describing a specific aspect of the system—including the Installation HOWTO, the Printing HOWTO, and the Ethernet HOWTO; and the Linux META-FAQ, a list of other sources of Linux information on the Internet.

Additional documentation, individually hosted HOWTOs, blogs, knowledge bases, and forums provide a significant amount of material to help individuals use Linux. Distributors maintain diverse mailing lists and forums dealing with a variety of subjects, from using Linux on a laptop to configuring web servers. Such web sites and digests of mailing lists have largely taken over from Linux-related Usenet newsgroups; see "Usenet Newsgroups" later in this chapter.

The central Linux Documentation home page is available to web users at http://www.tldp.org. This page contains many HOWTOs and other documents, as well as pointers to other sites of interest to Linux users, including the Linux Documentation Project manuals (see the following section).

Books and Other Published Works

There are a number of published works specifically about Linux. In addition, a number of free books are distributed on the Internet by the Linux Documentation Project (LDP), a project carried out over the Internet to write and distribute a bona fide set of "manuals" for Linux. These manuals are analogs to the documentation sets available with commercial versions of Unix: they cover everything from installing Linux to using and running the system, programming, networking, kernel development, and more.

The LDP manuals are available via the Web, as well as via mail order from several sources. O'Reilly has published the Linux Network Administrator's Guide from the LDP.

Aside from the growing number of Linux books, books about Unix still exist (though many have gone out of print). In general, these books are equally applicable to Linux: so far as using and programming the system is concerned, simpler Linux tasks don't differ greatly from those on the original implementations of Unix. Armed with this book and some other Linux or Unix books on specialized topics, you should be able to tackle the majority of Linux tasks.

There are monthly magazines about Linux, notably Linux Journal and Linux Magazine. These are an excellent way to keep in touch with the many goings-on in the Linux community. Languages other than English have their own Linux print publications as well. (European, South American, and Asian publications have become commonplace in the last few years.)

Usenet Newsgroups

Usenet is a worldwide electronic news and discussion forum with a heavy contingent of so-called newsgroups, or discussion areas devoted to a particular topic. Much of the development of Linux has been done over the waves of the Internet and Usenet, and not surprisingly, a number of Usenet newsgroups are available for discussions about Linux.

There are far too many newsgroups devoted to Linux to list here. The ones dealing directly with Linux are under the comp.os.linux hierarchy, and you'll find others on related topics such as comp.windows.x.

Internet Mailing Lists

If you have access to Internet electronic mail, you can participate in a number of mailing lists devoted to Linux. These run the gamut from kernel hacking to basic user questions. Many of the popular Linux mailing lists have associated web sites with searchable archives, allowing you to easily find answers to common questions. We list some of these resources in the Appendix.

Getting Help

First, we should mention that Linux has a rich community of volunteers and participants who need help and offer help for free. A good example of such a community is Ubuntu (http://www.ubuntulinux.org). Supported by a commercial company, Canonical Ltd., that offers low-cost professional support, Ubuntu has a large and enthusiastic community ready to provide old-style Linux support. Ubuntu, a derivative of Debian, employs a number of paid developers who also help maintain the Debian project.

Distributions such as Red Hat, Novell's SUSE, and Mandriva have become quite adept at providing commercial support for their own distributions of Linux and for other open source projects. Following a concept originated by Bernard Golden called the Open Source Maturity Model, Linux companies have done an excellent job of demonstrating that they can compete using the open source paradigm. They have shown the ability to provide:

§ Adequate support and maintenance

§ Continued innovation

§ Product road maps and commitments to adhere to them

§ Functionality and ease of use for IT managers, particularly across enterprise-size environments

§ Stable business models to fund new development and expand into new product areas

§ Structured and scalable partner ecosystems devoted to enabling customer success

Additionally, these Linux companies have established community projects to keep them from becoming stale.

Mature Linux companies also provide extended business offerings, including training, professional sales and support (24 × 7 × 365), indemnification, and quality documentation.

In addition to the companies already mentioned, you will find a channel full of their business partners, who have considerable expertise in providing commercial Linux support. Their web sites can help you find a business partner able to assist Linux users in a variety of ways.

As you become more accustomed to running Linux, you will probably discover many facets that may pleasantly surprise you. Many people not only use Linux but consider the community their home base. Good luck in the coming days.