Chapter 5. The Ubuntu Server

What Is Ubuntu Server?

Installing Ubuntu Server

Ubuntu Package Management

Ubuntu Server Security

Advanced Topics

Summary

Ubuntu 4.10, lovingly known as Warty Warthog, was the first public version of Ubuntu. Its installation media provided no obvious way to install the bare-bones OS without a full desktop environment. The system administrator crowds, easily irritable and feisty by nature, were greatly annoyed: They proclaimed Ubuntu was just a desktop distribution and sauntered back to their caves in contempt.

The next release, Hoary Hedgehog, rectified the problem and allowed for trivial installation of a minimal Ubuntu suitable for servers. Yet the myth of Ubuntu as a purely desktop-oriented distribution stuck.

Luckily, the sentiment is just that—a myth. Ubuntu is a world-class server platform today, providing everything you’d expect from a server OS and with the human flavor that makes Ubuntu different. The dedicated hackers on the Ubuntu Server Team tend to the minutiae of hardware support and testing, mercilessly beat on the latest version of server software to make sure it’s up to snuff for inclusion in the distribution, look for ways to push innovation into the server field, and are available to users like you to field feedback, questions, and cries of anguish.

That said, setting up a server is no small task. Server administrators constantly deal with complex issues such as system security, fault tolerance, and data safety, and while Ubuntu makes these issues more pleasant to deal with, they’re not to be taken lightly. The aim of this chapter is thus not to teach you how to be a system administrator—we could easily fill a dozen books attempting to do that—but to give you a quick crash course. We’ll also highlight the specific details that set Ubuntu Server apart from other server platforms, offer tips on some of the most common server uses, and give you pointers on where to find other relevant information.

What Is Ubuntu Server?

By far the most common reaction from users first encountering Ubuntu Server is one of utter and hopeless confusion. People are foggy on whether Ubuntu Server is a whole new distribution or an Ubuntu derivative like Kubuntu (only for servers) or perhaps something else entirely.

Let’s clear things up a bit. The primary software store for Ubuntu and official derivatives is called the Ubuntu archive. The archive is merely a collection of software packages in Debian “deb” format, and it contains every single package that makes up distributions such as Ubuntu, Edubuntu, Xubuntu, Kubuntu, and Ubuntu Server. What makes Kubuntu separate from Ubuntu, then, is only the set of packages from the archive that its installer installs by default and that its CDs carry.

Ubuntu Server is no different. It depends on the very same archive as the standard Ubuntu distribution, but it installs a distinctive set of default packages. Notably, the set of packages comprising Ubuntu Server is very small. The installer will not install things such as a graphical environment or many user programs by default. But since all the packages for Ubuntu Server come from the same official Ubuntu archive, you can install any package you like later. In theory, there’s nothing stopping you from transforming an Ubuntu Server install into a regular Ubuntu desktop installation or vice versa (in practice, this is tricky, and we don’t recommend you try it). You can even go from running Kubuntu to running Ubuntu Server. The archive paradigm gives you maximum flexibility.

We’ve established that Ubuntu Server just provides a different set of default packages than Ubuntu. But what’s important about that different set? What makes Ubuntu Server a server platform?

The most significant difference is a custom server kernel. This kernel employs an internal timer frequency of 100Hz instead of the desktop default of 250Hz, uses the deadline I/O scheduler instead of the desktop’s CFQ scheduler, and contains a batch of other minor tweaks for virtualization, memory support, and routing. We’ll spare you the OS theory: The idea is to offer some extra performance and throughput for server applications. In addition, the server kernel supports basic NUMA, a memory design used in some multiprocessor systems that can dramatically increase multiprocessing performance.

So what else is different in Ubuntu Server? Other than the server kernel and a minimal set of packages, not too much. Though Ubuntu has supported a minimal installation mode for a number of releases, spinning off Ubuntu Server into a separate product that truly stands on its own is still a young effort, but one that’s moving along very quickly.

Starting with Ubuntu Server 6.06 LTS, known as Dapper Drake, Ubuntu Server offers officially supported packages for the Red Hat Cluster Suite, Red Hat’s Global File System (GFS), Oracle’s OCFS2 filesystem, and the Linux Virtual Server utilities: keepalived and ipvsadm. Combined with the specialized server kernel, these bits already let you use your Ubuntu Server for some heavy lifting. And there’s a growing lineup of compelling features, including built-in virtualization, interoperability with Windows machines on the network through Samba, automatic version control for configuration files, support for LDAP directory services, hard drive replication over the network, and even a healthy dose of the latest buzzword—cloud computing.

Installing Ubuntu Server

So you’ve downloaded your Ubuntu Server CD from http://releases.ubuntu.com/11.04/ and burned it, eagerly placed it in your CD drive, and rebooted the machine to be greeted by the friendly Ubuntu menu. The first option, Install Ubuntu Server, marks the beginning of a journey toward your very own system administrator cave.

Until recently, the process of installing Ubuntu Server was identical to installing a desktop. Both installations were performed with a textual installer, a charmingly quaint combination of red and blue screens with text all over. Since then, the desktop version’s installer has been replaced by a beautiful graphical environment that lets you play with a fully usable Ubuntu setup right off the install CD. But the Server CD retained its red and blue colors; because the textual installer doesn’t rely on automatically detecting finicky graphics cards, it’s just about certain to work on most any piece of hardware you can get your hands on. And when you’re installing a server, that’s worth more than all the eye candy in the world.

Here, we look at some of the advanced textual installer gadgetry that is particularly geared toward server users.

The neat stuff begins when you arrive at the partitioning section of the installer. With a desktop machine, you’d probably let the installer configure a basic set of partitions by itself and go on its merry way. But with servers, things get a bit more complicated.

A Couple of Installer Tricks

As we’ll explore below, in terms of partitioning and storage, server installations can be quite a bit more complex than desktop ones. There’s a small bag of useful tricks with the installer that can help when things get hairy.

The installer itself runs on virtual console 1. If you switch to console 2 by pressing Alt-F2, you’ll be able to activate the console by hitting Enter and land in a minimalistic (busybox) shell. This will let you explore the complete installer environment and take some matters into your own hands if necessary. You can switch back to the installer console by pressing Alt-F1. Console 4 contains a running, noninteractive log file of the installation, which you can inspect by pressing Alt-F4. Finally, it’s sometimes useful to be able to connect to another server during installation, perhaps to upload a log file or to gain access to your mailbox or other communication. By default, the shell on console 2 will not provide you with an ssh client, but you can install one by running anna-install openssh-client-udeb after the installer has configured the network. Now you can use the ssh and scp binaries to log in or copy data to the server of your choice.
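For example, once the installer has brought up the network, you could pull the installation log off the machine from the console 2 shell. Here's a minimal sketch, assuming you have an account on a reachable host (the username and hostname are placeholders for a machine you control):

$ anna-install openssh-client-udeb
$ scp /var/log/syslog you@rescue.example.com:

The installer keeps its running log in /var/log/syslog inside the install environment, so that's usually the first file worth rescuing when an installation goes sideways.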

Partitioning Your Ubuntu Server

Deciding how to partition the storage in your server is a tricky affair and certainly no exact science. Generally, it’s a good idea to have at least three partitions separate from the rest of the system:

• /home: where all the user files will live

• /tmp: temporary scratch space for running applications

• /var: mail spools and log files


Tip

Partition Security and Separating Logs and Spools

There are several options that you can turn on for specific system partitions that afford you extra security. We’ll explain them later in this chapter, in the section dealing with security.

As an aside, if your server will keep extensive mail and news spools, you might want to further separate /var into partitions for /var/log and /var/spool. Having them both on the same partition might cause severe I/O congestion under heavy use.


Keeping data on separate partitions gives you, the administrator, an expansive choice of filesystems to use for particular purposes. For instance, you might choose to put /tmp on ReiserFS for its superior handling of many files in a directory and excellent performance on small files, but you might keep /home and /var on ext3 for its rock-solid robustness or on the new default ext4 filesystem as a good compromise between the two.

In addition, a dedicated /home partition lets you use special options when mounting it to your system, such as imposing disk space quotas or enabling extended security on user data. The reason to keep /tmp and /var separate from the rest of your system is much more prosaic: These directories are prone to filling up. This is the case with /tmp because it’s a scratchpad, and administrators often give users very liberal quotas there (but have a policy, for example, of purging all user data in /tmp older than two days), which means /tmp can easily get clogged up. /var, on the other hand, stores log files and mail spools, both of which can take up massive amounts of disk space either as a result of malicious activity or due to a significant spike in normal system usage.

Becoming a system administrator means you have to learn how to think like one. If /tmp and /var are easy to fill up, you compartmentalize them so that they can’t eventually consume all the disk space available on your server.

The Story of RAID

If you’ve got only one hard drive in your server, feel free to skip ahead. Otherwise, let’s talk about putting those extra drives to use. The acronym RAID stands for redundant array of inexpensive disks, although if you’re a businessperson, you can substitute the word independent for inexpensive. We forgive you. And if you’re in France, RAID is short for recherche assistance intervention dissuasion, which is an elite commando unit of the National Police—but if that’s the RAID you need help with, you’re reading the wrong book. We think RAID is just a really awesome idea for data: When dealing with your information, it provides extra speed, fault tolerance, or both.

At its core, RAID is just a way to replicate the same information across multiple physical drives. The process can be set up in a number of ways, and specific kinds of drive configurations are referred to as RAID levels. These days, even low- to mid-range servers ship with integrated hardware RAID controllers, which operate without any support from the OS. If your new server doesn’t come with a RAID controller, you can use the software RAID functionality in the Ubuntu kernel to accomplish the same goal.

Setting up software RAID while installing your Linux system was difficult and unwieldy only a short while ago, but it is a breeze these days: The Ubuntu installer provides a nice, convenient interface for it and then handles all the requisite backstage magic. You can choose from three RAID levels: 0, 1, and 5.

RAID 0

A so-called striped set, RAID 0 allows you to pool the storage space of a number of separate drives into one large, virtual drive. The important thing to keep in mind is that RAID 0 does not simply concatenate the physical drives—it spreads the data across them evenly, which means that no more space will be used on each physical drive than can fit on the smallest one. In practical terms, if you had two 250GB drives and a 200GB drive, the total amount of space on your virtual drive would equal 600GB; 50GB on each of the two larger drives would go unused. Spreading data in this fashion provides amazing performance but also significantly decreases reliability. If any of the drives in your RAID 0 array fail, the entire array will come crashing down, taking your data with it.

RAID 1

This level provides very straightforward data replication. It will take the contents of one physical drive and multiplex it to as many other drives as you’d like. A RAID 1 array does not grow in size with the addition of extra drives—instead, it grows in reliability and read performance. The size of the entire array is limited by the size of its smallest constituent drive.

RAID 5

When the chief goal of your storage is fault tolerance, and you want to use more space than provided by the single physical drive in RAID 1, this is the level you want to use. RAID 5 lets you use n identically sized physical drives (if different-sized drives are present, no more space than the size of the smallest one will be used on each drive) to construct an array whose total available space is that of n–1 drives, and the array tolerates the failure of any one—but no more than one—drive without data loss.


Tip

The Mythical Parity Drive

If you toss five 200GB drives into a RAID 5 array, the array’s total usable size will be 800GB, or that of four drives. This makes it easy to mistakenly believe that a RAID 5 array “sacrifices” one of the drives for maintaining redundancy and parity, but this is not the case. Through some neat mathematics of polynomial coefficients over Galois fields, the actual parity information is striped across all drives equally, allowing any single drive to fail without compromising the data. Don’t worry, though. We won’t quiz you on the math.


Which RAID to Choose?

If you’re indecisive by nature, the past few paragraphs may have left you awkwardly hunched in your chair, mercilessly chewing a No. 2 pencil, feet tapping the floor nervously. Luckily, the initial choice of RAID level is often a no-brainer, so you’ll have to direct your indecision elsewhere. If you have one hard drive, no RAID for you. Do not pass Go, do not collect $200. Two drives? Toss them into RAID 1, and sleep better at night. Three or more? RAID 5. Unless you really know what you’re doing, avoid RAID 0 like the plague: unless you’re serving mostly read-only, easily re-created data and truly don’t care about redundancy, it isn’t what you want.


Tip

Other RAID Modes

Though the installer offers only the most common RAID modes—0, 1, and 5—many other RAID modes exist and can be configured after the installation. Take a look at http://en.wikipedia.org/wiki/RAID for a detailed explanation of all the modes.


Setting Up RAID

After carefully studying the last section, maybe reading a few books on abstract algebra and another few on finite field theory, you finally decided on a RAID level that suits you. Since books can’t yet read your mind, we’ll assume you chose RAID 1. So how do you set it up?

Back to the installer. When prompted about partitioning disks, you’ll want to bravely select the last option, Manually Edit Partition Table.

Below the top two options on the screen (Guided Partitioning and Help), you’ll find a list of the physical drives in your server that the Ubuntu installer detected.


Tip

Avoiding the “Oh, No!” Moment

We’ve said this before, and we’ll say it again: It’s very easy to mistakenly erase valuable data when partitioning your system. Since you’re installing a server, however, we’ll assume you’re comfortable deleting any data that might already exist on the drives. If this is not the case, back up all data you care about now! We mean it.


Indented below each drive, you’ll find the list of any preexisting partitions, along with their on-disk ordinal number, size, bootable status, filesystem type, and, possibly, their mount point. Using the arrow keys, highlight the line summarizing a physical drive (not any of its partitions), and hit Enter—you’ll be asked to confirm replacing any existing partition table with a new one. Select Yes, and the only entry listed below that drive will be FREE SPACE. In our fictional server, we have two 80GB drives—hda and hdb—so we’d follow this process for both drives, giving each a fresh partition table. Say we’ve decided on a 20GB /home partition. Arrow over to FREE SPACE, hit Enter, and create the partition. Once you’ve entered the size for the new partition, you’ll be brought to a dialog where you can choose the filesystem and mount options. Instead of plopping a filesystem on the raw partition, however, you’ll want to enter the Use As dialog and set the new partition to be a physical volume for RAID.

Still with us? Now rinse and repeat for the other drive—create the exact same partition, same size, and set it as a RAID volume. When you’re done, you should be back at the initial partitioning screen, and you should have an identically sized partition under each drive. At this point, choose Configure Software RAID at the top of the screen, agree to write out changes to the storage devices if need be, and then choose to create an MD (multidisk) device. After selecting RAID 1, you’ll be asked to enter the number of active devices for the array. In our fictional two-drive server, it’s two. The next question concerns the number of spare devices in the array, which you can leave at zero. Now simply use the spacebar to put a check next to both partitions that you’ve created (hda1 and hdb1), and hit Finish in the Multidisk dialog to return to the basic partitioner.

If you look below the two physical drives that you used to have there, you’ll notice a brand new drive, the Software RAID device that has one partition below it. That’s your future /home partition, sitting happily on a RAID array. If you arrow over to it and hit Enter, you can now configure it just as you would a real partition.

The process is the same for any other partitions you want to toss into RAID. Create identical-sized partitions on all participating physical drives, select to use them as RAID space, enter the multidisk configurator (software RAID), and finally, create an array that uses the real partitions. Then create a filesystem on the newly created array.


Tip

Array Failure and Spare Devices

When a physical drive fails in a RAID array that’s running in a level that provides redundancy—such as 1 or 5—the array goes into so-called degraded mode (never verbally abuse or be cruel to your RAID arrays!). Depending on the number of devices in the array, running in degraded mode might just have performance downsides, but it might also mean that another physical drive failure will bring down the whole array and cause total data loss. To recover the array from degraded mode, you need to add a working physical drive to the system (the old one can be removed) and instruct the array to use the new device to “rebuild.”

In order to minimize the amount of time an array spends in degraded mode, and to prevent having to power off the machine to insert new physical drives if the server doesn’t support hot-swapping, you can put extra physical drives into the machine and flag them as hot spares, which means the system will keep them active but unused until there’s a drive failure. Cold spares, as the name implies, are just extra drives that you keep around on a shelf until there’s a failure, at which point you manually add them to the array.
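After installation, these maneuvers are performed with the mdadm utility. As a brief, hedged sketch (the array and partition names below are illustrative, not from our fictional server):

$ cat /proc/mdstat                      # check the health of all arrays
$ sudo mdadm --detail /dev/md0          # inspect one array in depth
$ sudo mdadm /dev/md0 --add /dev/sdc1   # added partition becomes the rebuild target, or a hot spare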


That’s it! The Ubuntu installer will take care of all the pesky details of configuring the system to boot the RAID arrays at the right time and use them, even if you’ve chosen to keep your root partition on an array. Now let’s look at another great feature of the Ubuntu installer: logical volume management (LVM).

The Story of the Logical Volume Manager

Let’s take a step back from our RAID adventure and look at the bigger picture in data storage. The entire situation is unpleasant. Hard drives are slow and fail often, and though abolished for working memory ages ago, fixed-size partitions are still the predominant mode of storage space allocation. As if worrying about speed and data loss weren’t enough, you also have to worry about whether your partition size calculations were just right when you were installing a server or whether you’ll wind up in the unenviable position of having a partition run out of space, even though another partition is maybe mostly unused. And if you might have to move a partition across physical drive boundaries on a running system, well, woe is you.

RAID helps to some degree. It’ll do wonders for your worries about performance and fault tolerance, but it operates at too low a level to help with the partition size or fluidity concerns. What we’d really want is a way to push the partition concept up one level of abstraction, so it doesn’t operate directly on the underlying physical media. Then we could have partitions that are trivially resizable or that can span multiple drives, we could easily take some space from one partition and tack it on another, and we could juggle partitions around on physical drives on a live server. Sounds cool, right?

Very cool, and very doable via LVM, a system that shifts the fundamental unit of storage from physical drives to virtual or logical ones (although we harbor our suspicions that the term logical is a jab at the storage status quo, which is anything but). LVM has traditionally been a feature of expensive, enterprise UNIX operating systems or was available for purchase from third-party vendors. Through the magic of free software, a guy by the name of Heinz Mauelshagen wrote an implementation of a logical volume manager for Linux in 1998. LVM has undergone tremendous improvements since then and is widely used in production today, and just as you expect, the Ubuntu installer makes it easy for you to configure it on your server during installation.

LVM Theory and Jargon

Wrapping your head around LVM is a bit more difficult than with RAID because LVM rethinks the whole way of dealing with storage, which expectedly introduces a bit of jargon that you need to learn. Under LVM, physical volumes, or PVs, are seen just as providers of disk space without any inherent organization (such as partitions mapping to a mount point in the OS). We group PVs into volume groups, or VGs, which are virtual storage pools that look like good old cookie-cutter hard drives. We carve those up into logical volumes, or LVs, that act like the normal partitions we’re used to dealing with. We create filesystems on these LVs and mount them into our directory tree. And behind the scenes, LVM splits up physical volumes into small slabs of bytes (4MB by default), each of which is called a physical extent, or a PE.

Okay, so that was a mouthful of acronyms, but as long as you understand the progression, you’re in good shape. You take a physical hard drive and set up one or more partitions on it that will be used for LVM. These partitions are now physical volumes (PVs), which are split into physical extents (PEs) and then grouped in volume groups (VGs), on top of which you finally create logical volumes (LVs). It’s the LVs, these virtual partitions, and not the ones on the physical hard drive, that carry a filesystem and are mapped and mounted into the OS. And if you’re really confused about what possible benefit we get from adding all this complexity only to wind up with the same fixed-size partitions in the end, hang in there. It’ll make sense in a second.

The reason LVM splits physical volumes into small, equally sized physical extents is that the definition of a volume group (the space that’ll be carved into logical volumes) then becomes “a collection of physical extents” rather than “a physical area on a physical drive,” as with old-school partitions. Notice that “a collection of extents” says nothing about where the extents are coming from and certainly doesn’t impose a fixed limit on the size of a volume group. We can take PEs from a bunch of different drives and toss them into one volume group, which addresses our desire to abstract partitions away from physical drives. We can take a VG and make it bigger simply by adding a few extents to it, maybe by taking them from another VG, or maybe by tossing in a new physical volume and using extents from there. And we can take a VG and move it to different physical storage simply by telling it to relocate to a different collection of extents. Best of all, we can do all this on the fly, without any server downtime.
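If you’d like to see this progression in concrete terms before we return to the installer, here is a minimal command-line sketch on a running system with the lvm2 package installed. The partition names, volume group name, and sizes are invented for illustration:

$ sudo pvcreate /dev/sdb1 /dev/sdc1               # initialize two partitions as PVs
$ sudo vgcreate data /dev/sdb1 /dev/sdc1          # pool their extents into a VG named data
$ sudo lvcreate --name projects --size 20G data   # carve a 20GB LV out of the pool
$ sudo mkfs.ext4 /dev/data/projects               # the LV now behaves like any ordinary partition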

Do you smell that? That’s the fresh smell of the storage revolution.

Setting Up LVM

By now, you must be convinced that LVM is the best thing since sliced bread. Which it is—and, surprisingly enough, setting it up during installation is no harder than setting up RAID. Create partitions on each physical drive you want to use for LVM just as you did with RAID, but tell the installer to use them as physical space for LVM. Note that in this context, PVs are not actual physical hard drives; they are the partitions you’re creating.

You don’t have to devote your entire drive to partitions for LVM. If you’d like, you’re free to create actual filesystem-containing partitions alongside the storage partitions used for LVM, but make sure you’re satisfied with your partitioning choice before you proceed. Once you enter the LVM configurator in the installer, the partition layout on all drives that contain LVM partitions will be frozen.

Let’s look back to our fictional server, but let’s give it four drives, which are 10GB, 20GB, 80GB, and 120GB in size. Say we want to create an LVM partition, or PV, using all available space on each drive, and then combine the first two PVs into a 30GB volume group and the latter two into a 200GB one. Each VG will act as a large virtual hard drive on top of which we can create logical volumes just as we would normal partitions.

As with RAID, arrowing over to the name of each drive and hitting Enter will let us erase the partition table. Then hitting Enter on the FREE SPACE entry lets us create a physical volume—a partition that we set to be used as a physical space for LVM. Once all four LVM partitions are in place, we select Configure the Logical Volume Manager on the partitioning menu.

After a warning about the partition layout, we get to a rather spartan LVM dialog that lets us modify VGs and LVs. According to our plan, we choose the former option and create the two VGs we want, choosing the appropriate PVs. We then select Modify Logical Volumes and create the LVs corresponding to the normal partitions we want to put on the system—say, one for each of /, /var, /home, and /tmp.

You can already see some of the partition fluidity that LVM brings you. If you decide you want a 25GB logical volume for /var, you can carve it out of the first VG you created, and /var will magically span the two smaller hard drives. If you later decide you’ve given /var too much space, you can shrink the filesystem and then simply move over some of the storage space from the first VG to the second. The possibilities are endless.
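To make that fluidity tangible, here’s a hedged sketch of growing a volume on a live server; the volume and group names are made up, and note that while growing a mounted ext4 filesystem works online, shrinking one requires unmounting it first:

$ sudo lvextend --size +5G /dev/data/var   # grow the logical volume by 5GB
$ sudo resize2fs /dev/data/var             # grow the ext4 filesystem to fill it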

Last but not least, recent Ubuntu versions support encrypting your LVM volumes right from the installer, which is music to paranoid ears: It means you can now have full-disk encryption from the moment you install your machine. Encrypted LVM is offered as one of the “guided” options in the partitioning menu, but you can also accomplish the same result by hand.


Tip

LVM Doesn’t Provide Redundancy

The point of LVM is storage fluidity, not fault tolerance. In our example, the logical volume containing the /var filesystem is sitting on a volume group that spans two hard drives. Unfortunately, this means that either drive failing will corrupt the entire filesystem, and LVM intentionally doesn’t contain functionality to prevent this problem.

Instead, when you need fault tolerance, build your volume groups from physical volumes that are sitting on RAID! In our example, we could have made a partition spanning the entire size of the 10GB hard drive and allocated it to physical space for a RAID volume. Then, we could have made two 10GB partitions on the 20GB hard drive and made the first one also a physical space for RAID. Entering the RAID configurator, we would create a RAID 1 array from the 10GB RAID partitions on both drives, but instead of placing a regular filesystem on the RAID array as before, we’d actually designate the RAID array to be used as a physical space for LVM. When we get to LVM configuration, the RAID array would show up as any other physical volume, but we’d know that the physical volume is redundant. If a physical drive fails beneath it, LVM won’t ever know, and no data loss will occur. Of course, standard RAID array caveats apply, so if enough drives fail and shut down the array, LVM will still come down kicking and screaming.


Encrypted Home and Software Selection

After you have partitioned the disk, the installer will install the base system and ask you for user information, much like with the desktop install. You’ll then be asked a question you might not have seen before: Do you wish to encrypt your home directory?

If you answer in the affirmative, your account password will take on a second purpose. Rather than just allowing you to log in, it will also be used to transparently encrypt every file in your home directory, turning it into gibberish for anyone without the password. This means that if your computer gets stolen, your data remains safe from prying eyes as long as your password isn’t too easy to guess. If this sounds familiar, it’s because this functionality exists as FileVault on Apple’s Mac OS X and is also a subset of the BitLocker system that debuted in Windows Vista. (The directory encryption system used in Ubuntu is called ecryptfs, which is a decidedly less punchy name. We’re working on it.)


Tip

Encrypted Swap and Remote Login

If you use a swap partition, protecting your home directory isn’t enough; sensitive data can get swapped out to disk in the clear. The solution is to use encrypted swap, which you can manually enable with the ecryptfs-setup-swap command, but this will presently take away your computer’s ability to enter the hibernate power-saving mode. Suspend mode is unaffected.

Note also that encrypting your home directory makes all the data in it, including special directories such as .ssh, unavailable until after you log in. If you’re logging into a machine where your home directory is encrypted and hasn’t yet been unlocked, and the machine only allows SSH public key authentication, there is no way for the system to consult your authorized_keys file, and you’re locked out. You can fix this by physically logging in, unmounting your encrypted home directory with ecryptfs-umount-private, then creating a .ssh directory in your “underlying” home directory left behind by ecryptfs. Stick your public keys into an authorized_keys file under that .ssh directory as normal, and you’ll be all set to log in remotely, at which point you can use ecryptfs-mount-private to enter your password and unlock your actual home directory.
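Condensed into commands, that rescue procedure looks roughly like this (run it while physically logged in; the public key filename is a placeholder):

$ ecryptfs-umount-private                   # unmount, exposing the underlying home directory
$ mkdir -m 700 ~/.ssh                       # create .ssh in the underlying home
$ cat mykey.pub >> ~/.ssh/authorized_keys   # install your public key as usual
$ ecryptfs-mount-private                    # unlock your real home directory again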


After the installer downloads some updated package information, you will see a new menu that lists a number of common server types, including DNS, LAMP, Mail, OpenSSH, PostgreSQL, Print, and Samba servers. Select one or more of these options and the installer will automatically download the standard set of packages you will need for that server as well as perform some basic configuration of the services for you. For instance, if you wanted to install a LAMP environment, but you also wanted to make sure you could ssh into the machine from another computer, you could select both LAMP and OpenSSH server from the menu.


Tip

Software Installer Prompts

Depending on which servers you select, you may be asked a number of questions as the packages install. For instance, when you select the LAMP environment, the installer will prompt you to choose a password for the MySQL root user.


You’re Done—Now Watch Out for Root!

Whew. With the storage and software stuff out of the way, the rest of your server installation should go no differently than installing a regular Ubuntu workstation. And now that your server is installed, we can move on to the fun stuff. From this point on, everything we do will happen in a shell.

When your Ubuntu server first boots, you’ll have to log in with the user you created during installation. Here’s an important point that bites a number of newcomers to Ubuntu: Unlike most distributions, Ubuntu does not enable the root account during installation! Instead, the installer adds that user to the admin group, which lets you use a mechanism called sudo to perform administrative tasks. We’ll show you how to use sudo in a bit. In the meantime, if you’re interested in the rationale for the decision to disable direct use of the root account, simply run man sudo_root after logging in.


Tip

Care and Feeding of RAID and LVM Arrays

If you’ve set up some of these during installation, you’ll want to learn how to manage the arrays after the server is installed. We recommend the respective how-to documents from The Linux Documentation Project at www.tldp.org/HOWTO/Software-RAID-HOWTO.html and www.tldp.org/HOWTO/LVM-HOWTO.

The how-tos sometimes get technical, but most of the details should sound familiar if you’ve understood the introduction to the subject matter that we gave in this chapter.


Ubuntu Package Management

Once your server is installed, it contains only the few packages it requires to boot and run properly plus whatever software you selected at the software selection screen. In the comfort of the GNOME graphical environment on an Ubuntu desktop, we could launch Synaptic and point and click our way through application discovery and installation. But on a server, we must be shell samurai.

The Ubuntu Archive

Before we delve into the nitty-gritty of package management, let’s briefly outline the structure of the master Ubuntu package archive, which we mentioned in the introduction to this chapter. Each new release has five repositories in the archive, called main, restricted, backports, universe, and multiverse. A newly installed system comes with only the first two enabled plus the security update repository. Here’s the repository breakdown.

Main: This includes all packages installed by default; these packages have official support.

Restricted: These are packages with restricted copyright, often hardware drivers.

Backports: These are newer versions of packages in the archive, provided by the community.

Universe: The universe includes packages maintained by the Ubuntu community.

Multiverse: The multiverse includes packages that are not free (in the sense of freedom).

The term official support is a bit of a misnomer, as it doesn’t refer to technical support that one would purchase or obtain but speaks instead to the availability of security updates after a version of Ubuntu is released. Standard Ubuntu releases are supported for 18 months, which means that Ubuntu’s parent company, Canonical, Ltd., guarantees that security updates will be provided, free of charge, for any vulnerabilities discovered in software in the main repository for 18 months after a release. No such guarantee is made for software in the other repositories.

Of particular note is that certain Ubuntu releases have longer support cycles. These releases are denoted by the acronym LTS (Long Term Support) in their version number. The latest Ubuntu LTS, version 10.04 (Lucid), will be supported for five years on servers.

APT Sources and Repositories

You’re now aware of the structure of the Ubuntu archive, but we didn’t explain how to actually modify the list of repositories you want to use on your system. In Debian package management parlance, the list of repositories is part of the list of Advanced Package Tool (APT) sources. (Keep your eyes peeled: Many of the package tools we’ll discuss below begin with the prefix apt.) These sources tell APT where to find available packages: in the Ubuntu archive on the Internet, on your CD-ROM, or in a third-party archive.

The APT sources are specified in the file /etc/apt/sources.list. Let’s open this file in an editor. (If you’re not used to vim, substitute nano for it, which is an easier-to-use, beginner-friendly editor.)

$ sudo vim /etc/apt/sources.list

The lines beginning with a hash, or #, denote comment lines and are skipped over by APT. At the top, you’ll see the CD-ROM source that the installer added, and following it these two lines (or something very similar):

deb http://us.archive.ubuntu.com/ubuntu/ natty main restricted
deb-src http://us.archive.ubuntu.com/ubuntu/ natty main restricted

We can infer the general format of the APT sources list by looking at these lines. The file is composed of individual sources, one per line, with each line consisting of several space-separated fields. The first field tells us what kind of a source the line is describing, such as a source for binary packages (deb) or source code packages (deb-src). The second field is the actual URI of the package source, the third names the distribution whose packages we want (natty), and the remaining fields tell APT which components to use from the source we’re describing—by default, main and restricted.

If you look through the rest of the file, you’ll find it’s nicely commented to let you easily enable two extra repositories: the very useful universe and the bleeding-edge backports. In general, now that you understand the format of each source line, you have complete control over the repositories you use, and while we strongly recommend against using the backports repository on a server, enabling universe is usually a good idea.
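Enabling universe, then, comes down to uncommenting (or adding) a pair of lines like the following, shown here with the same mirror as our earlier excerpt:

deb http://us.archive.ubuntu.com/ubuntu/ natty universe
deb-src http://us.archive.ubuntu.com/ubuntu/ natty universe

Remember to run sudo apt-get update afterward (covered shortly) so the package tools learn about the newly available software.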

With that in mind, let’s get you acquainted with some of the basic command-line package management tools on an Ubuntu system. Ubuntu inherits its package management from Debian, so if you’re familiar with Debian, the utilities we’ll discuss are old friends.

dpkg

Our first stop is the Debian package manager, dpkg, which sits around the lowest levels of the package management stack. Through a utility called dpkg-deb, dpkg deals with individual Debian package files, referred to as debs for their .deb filename extension.

dpkg is extensively documented in the system manual pages, so you can read about the various options it supports by entering man dpkg in the shell. We’ll point out the most common dpkg operations: listing and installing packages. Of course, dpkg can also remove packages, but we’ll show you how to do that with the higher-level tool called apt-get instead.

Listing Packages

Running dpkg -l | less in the shell will list all the packages on your system that dpkg is tracking, in a six-column format. The first three columns are one letter wide each, signifying the desired package state, current package status, and error status, respectively. Most of the time, the error status column will be empty.

The top three lines of dpkg output serve as a legend to explain the letters you can find in the first three columns. This lets you use the grep tool to search through the package list, perhaps to look only at removed packages or those that failed configuration.
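For instance, packages that have been removed but whose configuration files linger show up with the letters rc in the first two columns, so listing them is a one-liner:

$ dpkg -l | grep '^rc'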

Installing a Package Manually

There are more than 17,000 packages in the Ubuntu archive for each release. Only a small percentage of those are officially supported, but all the other packages are still held to reasonably rigorous inclusion requirements. Packages in the Ubuntu archive are thus almost universally of high quality and are known to work well on your Ubuntu system.

Because of this, the archive should be the very first place you look when you choose to install new software. On rare occasions, however, the software you want to install won’t be available in the archive because it’s new or because redistribution restrictions prevent it from being included. In those cases, you might have to either build the software from source code, run binaries that the vendor provides, or find third-party Ubuntu or Debian packages to install.


Tip

Practice Safe Hex!

That’s a terrible pun. We apologize. But it probably got your attention, so follow closely: Be very, very cautious when dealing with third-party packages. Packages in the Ubuntu archive undergo extensive quality assurance and are practically certain to be free from viruses, worms, Trojan horses, or other computer pests. If you install software only from the archive, you’ll never have to worry about viruses again.

With third-party packages, you just don’t know what you could be installing. If you install a malicious package, you’ve given the package creator full control of your system. So ideally, don’t install third-party packages at all. And if you must, make absolutely sure you trust the source of the packages!


Impatience is a hallmark virtue of programmers and system administrators alike, so if you were too impatient to read the warning note, do it now. This is serious business. Let’s continue: Say you’ve downloaded a package called myspecial-server.deb. You can install it simply by typing:

$ sudo dpkg -i myspecial-server.deb

dpkg will unpack the deb, make sure its dependencies are satisfied, and proceed to install the package. Remember what we said about the root account being unusable by default? Installing a package requires administrator privileges, which we obtained by prefixing the command we wanted to execute with sudo and entering our user password at sudo’s prompt.
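One caveat worth knowing: unlike the higher-level APT tools, dpkg won’t download missing dependencies for you. If they aren’t already installed, it will stop and leave the package unconfigured, and the conventional remedy is to let APT repair things:

$ sudo apt-get -f install   # fetch the missing dependencies, then finish configuring the package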


Tip

A Quick Note on Shell Examples

In the dpkg example, the dollar sign is the standard UNIX shell symbol, so you don’t need to actually type it. We’ll use it in the rest of the chapter to indicate things that need to be entered in a shell. On your Ubuntu system, the shell prompt won’t be just a dollar sign but will look like this:

user@server:~$

user and server will be replaced by your username and the hostname you gave the server during installation, respectively, and the part between the colon and dollar sign will show your working directory. A tilde is UNIX shorthand for your home directory.


apt-get and apt-cache

Now let’s jump higher up in the stack. Whereas dpkg deals mostly with package files, apt-get knows how to download packages from the Ubuntu archive or fetch them from your Ubuntu CD. It provides a convenient, succinct interface, so it’s no surprise it’s the tool that most system administrators use for package management on Ubuntu servers.

While apt-get deals with high-level package operations, it won’t tell you which packages are actually in the archive and available for installation. It knows how to get this information behind the scenes from the package cache, which you can manipulate by using a simple tool called apt-cache. Let’s see how these two commands come together with an example. Say we’re trying to find and then install software that lets us work with extended filesystem attributes.

Searching the Package Cache and Showing Package Information

We begin by telling apt-cache to search for the phrase “extended attributes.”

$ apt-cache search "extended attributes"
attr - Utilities for manipulating filesystem extended attributes
libattr1 - Extended attribute shared library
libattr1-dev - Extended attribute static libraries and headers
python-pyxattr - module for manipulating filesystem extended attributes
python2.4-pyxattr - module for manipulating filesystem extended attributes
rdiff-backup - remote incremental backup
xfsdump - Administrative utilities for the XFS filesystem
xfsprogs - Utilities for managing the XFS filesystem

The parameter to apt-cache search can be either a package name or a phrase describing the package, as in our example. The lines following our invocation are the output we received, composed of the package name on the left and a one-line description on the right. It looks like the attr package is what we’re after, so let’s see some details about it.

$ apt-cache show attr
Package: attr
Priority: optional
Section: utils
Installed-Size: 240
Maintainer: Ubuntu Core Developers <ubuntu-devel-discuss@lists.ubuntu.com>
Original-Maintainer: Nathan Scott <nathans@debian.org>
Architecture: i386
Version: 1:2.4.39-1
Depends: libattr1 (>= 2.4.4-1), libc6 (>= 2.6.1-1)
Conflicts: xfsdump (<< 2.0.0)
Filename: pool/main/a/attr/attr_2.4.39-1_i386.deb
Size: 31098
MD5sum: 84457d6edd44983bba3dcb50495359fd
SHA1: 8ae3562e0a8e8a314c4c6997ca9aced0fb3bea46
SHA256: f566a9a57135754f0a79c2efd8fcec626cde10d2533c10c1660bf7064a336c82
Description: Utilities for manipulating filesystem extended
attributes
A set of tools for manipulating extended attributes on filesystem
objects, in particular getfattr(1) and setfattr(1).
An attr(1) command is also provided which is largely compatible
with the SGI IRIX tool of the same name.
.
Homepage: http://oss.sgi.com/projects/xfs/
Bugs: mailto:ubuntu-users@lists.ubuntu.com
Origin: Ubuntu

Don’t be daunted by the verbose output. Extracting the useful bits turns out to be pretty simple. We can already see from the description field that this is, in fact, the package we’re after. We can also see the exact version of the packaged software, any dependencies and conflicting packages it has, and an e-mail address to which we can send bug reports. And looking at the filename field, the pool/main snippet tells us this is a package in the main repository.

Installing a Package

So far, so good. Let’s perform the actual installation:

$ sudo apt-get install attr

apt-get will track down a source for the package, such as an Ubuntu CD or the Ubuntu archive on the Internet, fetch the deb, verify its integrity, do the same for any dependencies the package has, and, finally, install the package.

Removing a Package

For didactic purposes, we’re going to keep assuming that you’re very indecisive and that right after you installed the attr package, you realized it wasn’t going to work out between the two of you. To the bit bucket with attr!

$ sudo apt-get remove attr

One confirmation later and attr is blissfully gone from your system, except for any configuration files it may have installed. If you want those gone, too, you’d have to instead run the following:

$ sudo apt-get --purge remove attr

Performing System Updates

Installing and removing packages is a common system administration task, but not as common as keeping the system up to date. This doesn’t mean upgrading to newer and newer versions of the software (well, it does, but not in the conventional sense), because once a given Ubuntu version is released, no new software versions enter the repositories except for the backports repository. On a server, however, you’re strongly discouraged from using backports because they receive a very limited amount of quality assurance and testing and because there’s usually no reason for a server to be chasing new software features. New features bring new bugs, and as a system administrator, you should value stability and reliability miles over features. Ubuntu’s brief, six-month development cycle means that you’ll be able to get all the new features in half a year anyway. But by then they will be in the main repositories and will have received substantial testing. Keeping a system up to date thus means making sure it’s running the latest security patches, to prevent any vulnerabilities discovered after the release from endangering your system.

Luckily, apt-get makes this process amazingly easy. You begin by obtaining an updated list of packages from the Ubuntu archive:

$ sudo apt-get update

and then you simply run the upgrade:

$ sudo apt-get upgrade

After this, apt-get will tell you either that your system is up to date or what it’s planning to upgrade, and it will handle the upgrade for you automatically. How’s that for cool?

Running a Distribution Upgrade

When a new Ubuntu release comes out and you want to upgrade your server to it, you’ll use a new tool, do-release-upgrade. The upgrade tool will switch over your sources.list to the new distribution and will figure out what packages are needed and whether they have any known issues. After it has done this, it will ask you to confirm the update by pressing y or to view the updated packages by pressing d. If you choose to view the updates, merely type y to continue the update, as the tool will not prompt you again.
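If the command isn’t present on your system, it ships in the update-manager-core package. Kicking off the upgrade is then simple:

$ sudo apt-get install update-manager-core   # only if do-release-upgrade is missing
$ sudo do-release-upgrade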


Note

The update process may take a couple of hours and should not be interrupted during that time.


Building Packages from Source

The Ubuntu archive, unlike Debian’s, doesn’t permit direct binary uploads. When Ubuntu developers want to add a piece of software to the archive, they prepare its source code in a certain way and put it in a build queue. From there it’s compiled, built automatically, and—if those steps succeed—pushed into the archive.

Why go through all the trouble? Why not just have the developers build the software on their machines? They could upload binaries to the archive, bypassing the build queue, which can take hours to build software. Here’s the catch: Ubuntu builds its packages for several hardware architectures (Intel x86, AMD64/EM64T, and ARM among them). Without the build queue, developers would have to build separate binaries of their software for each platform, which entails owning a computer running on each platform (expensive!) or creating complicated cross-compilation toolchains. And even then, sitting through all those software builds is an enormous waste of precious developer time.

The build queue approach solves this problem because the automatic build system takes a single source package and builds it for all the necessary platforms. And it turns out that the approach provides you, the system administrator, with a really nifty benefit: It lets you leverage the dependency-solving power and ease of use of apt-get and apply it to building packages from source!

Now that you’re excited, let’s backtrack a bit. Building packages from source is primarily of interest to developers, not system administrators. In fact, as a sysadmin, you should avoid hand-built packages whenever possible and instead benefit from the quality assurance that packages in the Ubuntu archive received. Sometimes, though, you might just have to apply a custom patch to a piece of software before installing it. We’ll use the attr package example, as before. What follows is what a session of building attr from source and installing the new package might look like—if you want to try it, make sure you install the dpkg-dev, devscripts, and fakeroot packages.


$ mkdir attr-build
$ cd attr-build
$ apt-get source attr
$ sudo apt-get build-dep attr
$ cd attr-2.4.39
<apply a patch or edit the source code>
$ dch -i
$ dpkg-buildpackage -rfakeroot
$ cd ..
$ sudo dpkg -i *.deb

All of the commands we invoked are well documented in the system man pages, and covering them in detail is out of the scope of this chapter. To briefly orient you as to what we did, though, here’s a quick description.

1. We made a scratch directory called attr-build and changed into it.

2. apt-get source attr fetched the source of the attr package and unpacked it into the current directory.

3. apt-get build-dep attr installed all the packages required to build the attr package from source.

4. We changed into the unpacked attr-2.4.39 directory, applied a patch, and edited the package changelog to describe our changes to the source.

5. dpkg-buildpackage -rfakeroot built one or more installable debs from our package.

6. We ascended one directory in the filesystem and installed all the debs we just built.

This is a super-compressed cheat sheet for a topic that takes a long time to master. We left a lot of things out, so if you need to patch packages for production use, first go and read the man pages of the tools we mentioned and get a better understanding of what’s going on!

aptitude

Around the highest levels of the package management stack hangs aptitude, a neat, colorful textual front end that can be used interchangeably with apt-get. We won’t go into detail about aptitude use here; plenty of information is available from the system manual pages and the online aptitude help system (if you launch it as aptitude from the shell). It’s worth mentioning, though, that one of the chief reasons some system administrators prefer aptitude over apt-get is its better handling of so-called orphan packages. Orphan packages are packages that were installed as a dependency of another package that has since been removed, leaving the orphan installed for no good reason. apt-get provides no automatic way to deal with orphans, instead relegating the task to the deborphan tool, which you can install from the archive. By contrast, aptitude will remove orphan packages automatically.
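If you stick with apt-get, a quick orphan hunt might look like the following sketch; note that by default deborphan reports only orphaned library packages:

$ sudo apt-get install deborphan
$ deborphan                                  # list orphaned library packages
$ sudo apt-get --purge remove $(deborphan)   # remove them; review the list first!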

Tips and Tricks

Congratulations. If you’ve gotten this far, you’re familiar with most aspects of effectively dealing with packages on your Ubuntu server. Before you move on to other topics, though, we want to present a few odds and ends that will probably come in handy to you at one point or another.

Listing Files Owned by a Package

Sometimes it’s really useful to see which files on your system belong to a specific package, say, cron. dpkg to the rescue:

$ dpkg -L cron

Be careful, though, as dpkg -L output might contain directories that aren’t exclusively owned by this package but are shared with others.

Finding Which Package Owns a File

The reverse of the previous operation is just as simple:

$ dpkg -S /etc/crontab
cron: /etc/crontab

The one-line output tells us the name of the owner package on the left.

Finding Which Package Provides a File

Both dpkg -S and dpkg -L operate on the database of installed packages. Sometimes, you might need to figure out which—potentially uninstalled—package provides a certain file. We might be looking for a package that would install the bzr binary, or /usr/bin/bzr. To do this, first install the package apt-file (requires the universe repository), then execute:

$ apt-file update
$ apt-file search /usr/bin/bzr

Voila! apt-file will tell you that the package you want is bzr, with output in the same format as dpkg -S.

That’s it for our package management tricks—now it’s time to talk about security.

Ubuntu Server Security

As a system administrator, one of your chief tasks is dealing with server security. If your server is connected to the Internet, for security purposes it’s in a war zone. If it’s only an internal server, you still need to deal with (accidentally) malicious users, disgruntled employees, and the guy in accounting who really wants to read the boss’s secretary’s e-mail.

In general, Ubuntu Server is a very secure platform. The Ubuntu Security Team, the team that produces all official security updates, has one of the best turnaround times in the industry. Ubuntu ships with a no open ports policy, meaning that after you install Ubuntu on your machine—be it an Ubuntu desktop or a server installation—no applications will be accepting connections from the Internet by default. Like Ubuntu desktops, Ubuntu Server uses the sudo mechanism for system administration, eschewing the root account. And finally, security updates are guaranteed for at least 18 months after each release (five years for some releases, like Dapper), and are free.

In this section, we want to take a look at user account administration, filesystem security, system resource limits, logs, and finally some network security. But Linux security is a difficult and expansive topic; remember that we’re giving you a crash course here and leaving out a lot of things—to be a good administrator, you’ll want to learn more.

User Account Administration

Many aspects of user administration on Linux systems are consistent across distributions. Debian provides some convenience tools, such as the adduser command, to make things easier for you. But since Ubuntu fully inherits Debian’s user administration model, we won’t go into detail about it here. Instead, let us refer you to www.oreilly.com/catalog/debian/chapter/book/ch07_01.html for the basics. After reading that page, you’ll have full knowledge of the standard model, and we can briefly talk about the Ubuntu difference: sudo.

As we mentioned at the end of the installation section (You’re Done—Now Watch Out for Root!), Ubuntu doesn’t enable the root, or administrator, account by default. There is a great deal of security benefit to this approach and incredibly few downsides, all of which are documented in the sudo_root man page.

The user that you added during installation is the one who, by default, is placed into the admin group and may use sudo to perform system administration tasks. After adding new users to the system, you may add them to the admin group like this:

$ sudo adduser username admin

Simply use deluser in place of adduser in the above command to remove a user from the group. (When creating a brand-new user, passing the --encrypt-home option to adduser will automatically set up home directory encryption for that user.)

One thing to keep in mind is that sudo isn’t just a workaround for giving people root access. sudo can also handle fine-grain permissions, such as saying, “Allow this user to execute only these three commands with super-user privileges.”
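
For instance, a sudoers entry along these lines (a hypothetical sketch; the username and the particular commands are invented for illustration) limits one user to exactly three privileged commands:

alice   ALL = (root) /usr/sbin/service, /usr/bin/apt-get, /sbin/reboot

With that entry in place, alice can run those three commands, and only those, with super-user privileges via sudo.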

Documentation about specifying these permissions is available in the sudoers man page, which can be a bit daunting—feel free to skip close to the end of it, until you reach the EXAMPLES section. It should take you maybe 10 or 15 minutes to grok it, and it covers the vast majority of the situations for which you’ll want sudo. When you’re ready to put your new knowledge to use, simply run:

$ visudo

Be careful here—the sudoers database, which lives in /etc/sudoers, is not meant to be opened directly in an editor, because a plain editor won’t check the syntax for you; visudo will. If you mess up the sudoers database, you might find yourself with no way to become an administrator on the machine.

Filesystem Security

The security model for files is standardized across most UNIX-like operating systems and is called the POSIX model. The model calls for three broad types of access permissions for every file and directory: owner, group, and other. It works in exactly the same way on any Linux distribution, which is why we won’t focus on it here. For a refresher, consult the man pages for chmod and chown, or browse around the Internet.
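
As the quickest of refreshers, here’s the model in action (the directory, user, and group names are invented for the example):

$ sudo chown alice:developers /srv/project
$ sudo chmod 750 /srv/project

The first command makes alice the owner and developers the group of /srv/project; the second grants the owner full access (7), the group read and execute access (5), and everyone else no access at all (0).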

We want to actually look at securing partitions through mount options, an oft-neglected aspect of dealing with system security that’s rather powerful when used appropriately. When explaining how to partition your system, we extolled the virtues of giving, at the very least, the /home, /tmp, and /var directories their own partitions, mentioning how it’s possible to use special options when mounting these to the filesystem.

Many of the special mount options are filesystem-dependent, but the ones we want to consider are not. Here are the ones that interest us.

nodev

A filesystem mounted with the nodev option will not allow the use or creation of special “device” files. There’s usually no good reason for most filesystems to allow the interpretation of block or character special devices, and permitting them poses potential security risks.

nosuid

If you’ve read up on UNIX file permissions, you know that certain files can be flagged in a way that lets anyone execute them with the permissions of another user or group, often that of the system administrator. These flags are called the setuid (suid) and setgid bits, respectively, and allowing this behavior outside of the directories that hold the system binaries is often unnecessary and decreases security. If a user can, by any means, create or obtain a setuid binary of his or her own choosing, the user has effectively compromised the system.
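
If you’re curious which setuid binaries already exist on your system, a standard find invocation will list them (it may take a while on a large filesystem):

$ find / -perm -4000 -type f 2>/dev/null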

noexec

If a filesystem is flagged as noexec, users will not be able to run any executables located on it.

noatime

This flag tells the filesystem not to keep a record of when files were last accessed. If used indiscriminately, it lessens security by limiting the amount of information available in the event of a security incident, particularly when computer forensics is to be performed. However, the flag does provide performance benefits for certain usage patterns, so it’s a good candidate for partitions where trading a little security for speed is acceptable.

Deciding which mount options to use on which partition is another fuzzy science, and you’ll often develop preferences as you become more accustomed to administering machines. Here’s a basic proposal, though, that should be a good starting point (a sample /etc/fstab rendering follows the list):

• /home: nosuid, nodev

• /tmp: noatime, noexec, nodev, nosuid

• /var: noexec, nodev, nosuid
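
Expressed in /etc/fstab terms, that proposal might look like the following sketch (the device names and the ext4 filesystem type are assumptions, so adapt them to your own partition layout):

/dev/sda2  /home  ext4  defaults,nosuid,nodev                 0  2
/dev/sda3  /tmp   ext4  defaults,noatime,noexec,nodev,nosuid  0  2
/dev/sda4  /var   ext4  defaults,noexec,nodev,nosuid          0  2

After editing /etc/fstab, you can apply the new options without rebooting, one partition at a time, with a command like sudo mount -o remount /tmp.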

System Resource Limits

By default, Linux will not impose any resource limits on user processes. This means any user is free to fill up all of the working memory on the machine, or spawn processes in an endless loop, rendering the system unusable in seconds. The solution is to set up some of your own resource limits by editing the /etc/security/limits.conf file:

$ sudoedit /etc/security/limits.conf

The possible settings are all explained in the comments within the file, and there are no silver-bullet values to recommend, though we do suggest that you set up at least the nproc limit and possibly also the as, data, memlock, and rss settings.


Tip

A Real-Life Resource Limit Example

Just to give you an idea of what these limits look like on production servers, here is the configuration from the general login server of the Harvard Computer Society at Harvard University:

*      -      as        2097152
*      -      data      131072
*      -      memlock   131072
*      -      rss       1013352
*      hard   nproc     128

This limits regular users to 128 processes, with a maximum address space of 2GB, maximum data size and locked-in-memory address space of 128MB, and maximum resident set size of 1GB.


If you need to set up disk quotas for your users, install the quota package, and take a look at its man page.
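
Installation is a one-liner:

$ sudo apt-get install quota

Among other tools, the package provides edquota for editing per-user quotas and repquota for reporting on them.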

System Log Files

As a system administrator, the system log files are some of your best friends. If you watch them carefully, you’ll often know in advance when something is wrong with the system, and you’ll be able to resolve most problems before they escalate.

Unfortunately, your ability to pay close attention to the log files dwindles with every additional server you’re tasked with administering, so administrators often use log-processing software that can be configured to alert them to certain events, or they write their own tools in languages such as Perl and Python.

Logs usually live in /var/log, and after your server runs for a while, you’ll notice a number of progressively older versions of the log files accumulating in that directory, many of them compressed with gzip (ending with the .gz filename extension).

Here are some log files of note:

• /var/log/syslog: general system log

• /var/log/auth.log: system authentication logs

• /var/log/mail.log: system mail logs

• /var/log/messages: general log messages

• /var/log/dmesg: kernel ring buffer messages, usually since system bootup

Your Log Toolbox

When it comes to reviewing logs, you should become familiar with a few tools of choice. The tail utility prints, by default, the last ten lines of a file, which makes it a neat tool for getting an idea of what has happened most recently in a given log file:

$ tail /var/log/syslog

With the -f parameter, tail launches into follow mode, which means it’ll open the file and keep showing you changes on the screen as they’re happening. If you want to impress your friends with your new system administrator prowess, you can now easily recreate the Hollywood hacker movie staple: text furiously blazing across the screen.
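
For example, to watch the general system log as events arrive (press Ctrl-C to stop):

$ tail -f /var/log/syslog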

Also invaluable are zgrep, zcat, and zless, which operate like their analogues that don’t begin with a z, but on gzip-compressed files. For instance, to get a list of lines in all your compressed logs that contain the word “warthog” regardless of case, you would issue the following command:

$ zgrep -i warthog /var/log/*.gz

Your toolbox for dealing with logs will grow with experience and based on your preferences, but to get an idea of what’s already out there, do an apt-cache search for “log files.”
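
That is (apt-cache matches each term against package names and descriptions):

$ apt-cache search log files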

A Sprinkling of Network Security

Network security administration is another area handled largely by the OS itself, so it’s no different on Ubuntu than on any other modern Linux distribution. That means we won’t cover it here but will leave you with a pointer.

The iptables command is the front end to the very powerful Linux firewall tables. Unfortunately, dealing with iptables can be rather difficult, particularly if you’re trying to set up complex firewall policies. To whet your appetite, here’s iptables in action, dropping all packets coming from a notorious time-sink domain:

$ sudo iptables -A INPUT -s www.slashdot.org -j DROP
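
Two companion invocations worth knowing right away are the ones that inspect and undo your handiwork:

$ sudo iptables -L INPUT -n
$ sudo iptables -D INPUT -s www.slashdot.org -j DROP

The first lists the rules in the INPUT chain (-n skips reverse DNS lookups), and the second deletes the rule matching the given specification.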

Tutorials, how-tos, and articles about iptables are available on the Internet in large numbers, and the system man pages provide detailed information about all the possible options. Spending some time to learn iptables is well worth it because it’ll let you set up network security on any Linux machine and will make it pretty easy for you to learn other operating systems’ firewall systems if need be.

If you just want to manage a basic firewall on Ubuntu Server, you don’t necessarily need to venture into iptables at all. Ubuntu provides an excellent front end called ufw that makes it very easy to add new firewall rules. For more information on ufw, check out its man page, or if you want a more complete reference, look at the security section of The Official Ubuntu Server Book.
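
To give you a taste, here’s a minimal ufw session (the ssh rule is only an example; open just the ports you actually use):

$ sudo ufw allow 22/tcp
$ sudo ufw enable
$ sudo ufw status

This permits incoming ssh connections, switches the firewall on, and prints the active rules. Allowing ssh before enabling the firewall matters if you administer the server remotely.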

Final Words on Security

We’ve barely even scratched the surface of system security in this subsection, though we’ve tried to give you good pointers on where to start and where to get the information you need to learn more. But let us give you some sage advice on security in general, since it’s a painful truth to learn: There is no such thing as a fully secure system. Securing systems isn’t about making it impossible for a breach to occur. It’s about making the breach so difficult that it’s not worth it to the attacker. This definition is pretty fluid because if your attacker is a bored 14-year-old sitting in a basement somewhere chewing on cold pizza, you can bet that kid will leave your system alone if it’s even marginally secure. But if you’re keeping around top-secret information, it’s a lot more difficult to have the system be secure enough that breaking into it isn’t worth it, from a cost/benefit point of view, to the attackers.

Security is also neat because, as a concept, it permeates the entire idea space of computer science. Getting really good at security requires an incredibly deep understanding of the inner workings of computer systems, which has the nonobvious advantage that if you’re trying to get a deep understanding of computer systems but don’t know where to start, you can start with security and simply follow the trail. Use this to your advantage! Good luck.


Tip

Getting In Touch

If you want to tell us why you like Ubuntu Server, or why you hate it, or send us cookies, or just stalk us from a distance, come on in! Go to

https://lists.ubuntu.com/mailman/listinfo/ubuntu-server

to join the ubuntu-server mailing list, visit our page on Launchpad at

https://launchpad.net/people/ubuntu-server,

or jump on IRC. We’re on the #ubuntu-server channel on FreeNode. Hope to see you there!


Advanced Topics

A single book chapter isn’t the right place to go into great detail on all the features packed into Ubuntu Server. There isn’t enough space, and many of the features are quite specialized. But that doesn’t stop us from taking you on a whirlwind tour. Our goal here is to give just enough information to let you know what’s there and interest you in finding out more about those features that may be relevant to how you use Ubuntu.

Virtualization

If there’s been one buzzword filling the server space for the past couple of years, it’s virtualization. In August 2007, a virtualization company called VMware raised about a billion U.S. dollars in its initial public offering, and the term virtualization finally went supernova, spilling from the technology realm into the financial mainstream, and soon to CIOs and technology managers everywhere.

Fundamentally, virtualization is a way to turn one computer into many. (Erudite readers will note this is precisely the opposite of the Latin motto on the Seal of the United States, “E Pluribus Unum,” which means “out of many, one.” Some technologies match that description, too, like Single System Image, or SSI, grids. But if we talked about virtualization in Latin, it would be “Ex Uno Plura.”) Why is it useful to turn one computer into many?

Back in the 1960s, servers were huge and extremely expensive, and no one wanted to buy more of them than they absolutely needed. It soon became clear that a single server, capable of running different operating systems at once, would allow the same hardware to be used by different people with different needs, which meant fewer hardware purchases, which meant happier customers with less devastated budgets. IBM was the first to offer this as a selling point, introducing virtualization in its IBM 7044 and IBM 704 models, and later in the hardware of its Model 67 mainframe. Since then, the industry largely moved away from mainframes and toward small and cheap rack servers, which meant the need to virtualize mostly went away: If you needed to run separate operating systems in parallel, you just bought two servers. But eventually Moore’s law caught up with us, and even small rack machines became so powerful that organizations found many of them underutilized, while buying more servers (though cheap in itself) meant sizable auxiliary costs for cooling and electricity. This set the stage for virtualization to once again become vogue. Maybe you want to run different Linux distributions on the same machine. Maybe you need a Linux server side by side with Windows. Virtualization delivers.

There are four key types of virtualization. From the lowest level to highest, they are hardware emulation, full virtualization, paravirtualization, and OS virtualization. Hardware emulation means running different operating systems by emulating, for each, all of a computer’s hardware in software. The approach is very powerful and painfully slow. Full virtualization instead uses a privileged piece of software called a hypervisor as a broker between operating systems and the underlying hardware, and it offers good performance but requires special processor support on instruction sets like the ubiquitous x86. Paravirtualization also uses a hypervisor but supports only executing operating systems that have been modified in a special way, offering high performance in return. Finally, OS virtualization is more accurately termed “containerization” or “zoning” and refers to operating systems that support multiple user spaces utilizing a single running kernel. Containerization provides near-native performance but isn’t really comparable to the other virtualization approaches because its focus isn’t running multiple operating systems in parallel but carving one up into isolated pieces.

The most widely used hardware emulators on Linux are QEMU and Bochs, available in Ubuntu as the qemu and bochs packages, respectively. The big players in full virtualization on Linux are the commercial offerings from VMware, IBM’s z/VM, and most recently, a technology called KVM that’s become part of the Linux kernel. In paravirtualization, the key contender is Xen; the Linux OS virtualization space is dominated by the OpenVZ and Linux-VServer projects, though many of the needed interfaces for OS virtualization have gradually made their way into the Linux kernel proper.

Now that we’ve laid the groundwork, let’s point you in the right direction depending on what you’re looking for. If you’re a desktop Ubuntu user and want a way to safely run one or more other Linux distributions (including different versions of Ubuntu!) or operating systems (BSD, Windows, Solaris, and so forth) for testing or development, all packaged in a nice interface, the top recommendation is an open source project out of Sun Microsystems called VirtualBox. It’s available in Ubuntu as the package virtualbox-ose, and its home page is www.virtualbox.org.

If you want to virtualize your server, the preferred solution in Ubuntu is KVM, a fast full virtualizer that turns the running kernel into a hypervisor. Due to peculiarities of the x86 instruction set, however, full virtualizers can work only with a little help from the processor, and KVM is no exception. To test whether your processor has the right support, try:

$ egrep '(vmx|svm)' /proc/cpuinfo

If that command produces any output, you’re golden. Head on over to https://help.ubuntu.com/community/KVM for instructions on installing and configuring KVM and its guest operating systems.
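
To give you a head start, the installation itself is typically just a couple of commands (the package and group names below are our assumption based on recent Ubuntu releases; defer to the wiki page above if they’ve changed):

$ sudo apt-get install qemu-kvm libvirt-bin
$ sudo adduser $USER libvirtd

The second command adds your user to the libvirtd group so you can manage virtual machines without becoming root; log out and back in for the change to take effect.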

If you lack the processor support for KVM, you don’t have great options. Ubuntu releases after Hardy (8.04) no longer offer kernels capable of hosting Xen guests (dom0 kernels aren’t provided, in Xen parlance), which means if you’re desperate to get going with Xen, you’ll have to downgrade to Hardy or get your hands quite dirty in rolling the right kind of kernel yourself, which is usually no small task.


Tip

Point-and-Click Xen

One Xen-related project to point out is MIT’s open source XVM (not to be confused with Sun Microsystems’ xVM), which is a set of tools built on top of Debian that allow users to create and bring up Xen guests through a Web browser, complete with serial console redirection, ssh access, and a variety of other goodies. MIT uses the system to offer point-and-click virtual machines to any MIT affiliate; the project home page is http://xvm.mit.edu.


Disk Replication

We’ve discussed the role of RAID in protecting data integrity in the case of disk failures, but we didn’t answer the follow-up question: What happens when a whole machine fails? The answer depends entirely on your use case, and giving a general prescription doesn’t make sense. If you’re Google, for instance, you have automated cluster management tools that notice a machine going down and don’t distribute work to it until a technician has been dispatched to fix the machine. But that’s because Google’s infrastructure makes sure that (except in pathological cases) no machine holds data that isn’t replicated elsewhere, so the failure of any one machine is ultimately irrelevant.

If you don’t have Google’s untold thousands of servers on a deeply redundant infrastructure, you may consider a simpler approach: Replicate an entire hard drive to another computer, propagating changes in real time, just like RAID1 but over the network.

This functionality is provided by DRBD, the Distributed Replicated Block Device, and it isn’t limited to hard drives: It can replicate any block device you like. Ubuntu 9.04 and newer ship with DRBD, and the user-space utilities you need are in the drbd8-utils package. For the full documentation, see the DRBD Web site at www.drbd.org.
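
To give you a feel for it, here’s a hypothetical two-node resource definition (the host names, disks, and addresses are invented for the example; the real configuration lives in /etc/drbd.conf):

resource r0 {
  protocol C;
  on alpha {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.0.10:7788;
    meta-disk internal;
  }
  on bravo {
    device    /dev/drbd0;
    disk      /dev/sdb1;
    address   192.168.0.11:7788;
    meta-disk internal;
  }
}

With protocol C, a write isn’t acknowledged until it has reached both nodes, which is exactly what you want when the replica exists for safety.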

Cloud Computing

Slowly but surely overtaking virtualization as the uncontested hottest topic in IT, cloud computing is just a new term for an old idea: on-demand or “pay-as-you-go” computing. Building and managing IT infrastructure aren’t the core competencies of most organizations, the theory goes, and it’s hard to predict how much computing capacity you’ll need at any given time: If your company store transitions from wallowing in relative obscurity to becoming an overnight Internet sensation, what do you do? Buy up a truckload of new servers, ship them overnight, and work your IT staff to a pulp to bring all this new infrastructure up in as little time as possible? In the interim, your customers are overwhelming your existing capacity and getting frustrated by the slow response times. In the worst case scenario, by the time you have the new hardware running, customer interest has ebbed away, and you’re now stuck having paid for a ton of extra hardware doing nothing at all. Cloud computing is the promise of a better way. Instead of dealing with IT infrastructure yourself, why not rent only the amount of it you need at any given moment from people whose job it is to deal with IT infrastructure, like Amazon or Google?

Cloud services like Amazon’s S3 and EC2 and Google’s App Engine offer exactly that. And Ubuntu is getting in on the action in two ways. First, Ubuntu offers official images that can be deployed on Amazon EC2, allowing you to run Ubuntu servers on Amazon’s infrastructure. More interestingly, Ubuntu bundles a set of software called Eucalyptus (http://eucalyptus.cs.ucsb.edu) that allows you to create an EC2-style cloud on your own hardware while remaining interface-compatible with Amazon’s. Such a setup offers savvy larger organizations the ability to manage their own infrastructure much more efficiently and makes it possible for even small infrastructure shops to become cloud service providers and compete for business with the big boys.

Summary

If you’ve never administered a system before, the transition from being a regular user will be difficult, regardless of which OS you choose to learn to administer. The difficulty stems from the wider shift in thinking that’s required. Instead of just making sure your room is clean, now you have to run and protect the whole apartment building. But the difficulty is also educational and rewarding. (We realize they also told you this for your theoretical physics class in college, but we are not lying.) Learning to maintain Ubuntu servers is a great choice for you because you’ll benefit from a vibrant and helpful user community, and you’ll be working with a top-notch OS every step of the way.

If you’re a seasoned administrator who came to see what all the Ubuntu Server fuss is about, stay tuned. The project, though rock solid as far as stability goes, is still in its feature infancy, and the Server Team is working very hard at making it the best server platform out there. We’re emphasizing advanced features and we’re being very fussy about getting all the little details just right.

In both cases, if you’re installing a new server, give Ubuntu Server a try. It’s a state-of-the-art system, and we’re sure you’ll enjoy using it. Get in touch, tell us what to do to make it better, and lend a hand. Help us make Ubuntu rock even harder on big iron and heavy metal!