
Ubuntu Unleashed 2017 Edition (2017)

Part III: System Administration

Chapter 17. Backing Up


In This Chapter

- Choosing a Backup Strategy

- Choosing Backup Hardware and Media

- Using Backup Software

- Copying Files

- Version Control for Configuration Files

- System Rescue

- References


This chapter examines the practice of safeguarding data by creating backup copies, restoring that same data if necessary, and recovering data in case of a catastrophic hardware or software failure. The chapter gives you a full understanding of the reasons for sound backup practices. You can use the information here to make intelligent choices about which strategies are best for you. The chapter also shows you how to perform some types of data recovery and system restoration on your own and provides advice about when to seek professional assistance.

Choosing a Backup Strategy

Backups are always trade-offs. Any backup consumes time, money, and effort on an ongoing basis; backups must be monitored, validated, indexed, and stored, and you must continuously purchase new media. Sound expensive? The cost of not having backups is the loss of your critical data. Re-creating the data from scratch costs time and money, and if the cost of doing it all again is greater than the cost associated with backing up, you should be performing backups. At their most basic, backups are nothing more than insurance against financial loss for you or your business.

Your first step in formulating and learning to use an effective backup strategy is to choose the strategy that is right for you. First, you must understand some of the most common (and not so common) causes of data loss so that you are better able to understand the threats your system faces. Then, you need to assess your own system, how it is used and by whom, your available hardware and software resources, and your budget constraints. The following sections look at each of these issues in detail, provide some backup system examples, and discuss their use.

Why Data Loss Occurs

Files disappear for any number of reasons: Hardware can fail and take data with it, or your attention might wander and you accidentally delete or overwrite a file. Some data loss occurs as a result of natural disasters and other circumstances beyond your control. A tornado, flood, or earthquake could strike, the water pipes could burst, or the building could catch on fire. Your data, as well as the hardware, would likely be destroyed in such a disaster. A disgruntled employee might destroy files or hardware in an attempt at retribution. Equipment can be stolen. And any equipment might fail; all equipment fails at some time—most likely when it is extremely important for it not to fail.


A Case in Point

A recent Harris poll of Fortune 500 executives found that roughly two-thirds of them had problems with their backups and disaster recovery plans. How about you?


Data can also be lost because of malfunctions that corrupt the data as it is written to the disk. Applications, utilities, and drivers might be poorly written, buggy (the phrase most often heard today is “still beta quality”), or might suffer some corruption and fail to correctly write that all-important data you have just created. If that happens, the contents of your data file become indecipherable garbage of no use to anyone.

All these accidents and disasters offer important reasons for having a good backup strategy; however, the most frequent cause of data loss is human error. Who among us has not overwritten a new file with an older version or unintentionally deleted a needed file? This applies not only to data files, but also to configuration files and binaries. Browse the mailing lists, Usenet newsgroup postings, or online forums, and stories about deleting entire directories such as /home, /usr, or /lib seem all too common. On a stable server that is not frequently modified or updated, you can choose to mount /usr read-only to prevent writing over or deleting anything in it. Incorrectly changing a configuration file without saving the original in case it has to be restored (and restored it usually must be, because the reconfiguration turns out to be wrong more often than not) is another common error.


Tip

To make a backup of a configuration file you are about to edit, use the cp command:

matthew@seymour:~$ cp filename filename.original

To restore it, use the following:

matthew@seymour:~$ cp filename.original filename

Never edit or move the *.original file, or the original copy will be lost. You can change the file’s mode to make it unwritable; then if you try to delete or overwrite it, you receive a warning first.
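For example, a quick way to write-protect the copy (a minimal sketch, using the generic filename from this Tip) is:

matthew@seymour:~$ chmod a-w filename.original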


Proper backups can help you recover from these problems with a minimum of hassle, but you have to put in the effort to keep backups current, verify they are intact, and practice restoring the data in different disaster scenarios.

Assessing Your Backup Needs and Resources

By now you have realized that some kind of plan is needed to safeguard your data, and, like most people, you are overwhelmed by the prospect. Entire books, as well as countless articles and white papers, have been written on the subject of backing up and restoring data. What makes the topic so complex is that each solution is truly individual.

Yet, the proper approach to making the decision is very straightforward. You start the process by asking the following:

- What data must be safeguarded?

- How often does the data change?

The answers to these two questions determine how important the data is, determine the volume of the data, and determine the frequency of the backups. This in turn determines the backup medium. Only then can the software be selected to accommodate all these considerations. (You learn about choosing backup software, hardware, and media later in this chapter.)

Available resources are another important consideration when selecting a backup strategy. Backups require time, money, and personnel. Begin your planning activities by determining what limitations you face for all these resources. Then, construct your plan to fit those limitations, or be prepared to justify the need for more resources with a careful assessment of both backup needs and costs.


Tip

If you are not willing or capable of assessing your backup needs and choosing a backup solution, legions of consultants, hardware vendors, and software vendors would love to assist you. The best way to choose one in your area is to ask other UNIX and Linux system administrators (located through user groups, discussion groups, or mail lists) who are willing to share their experiences and make recommendations. If you cannot get a referral, ask the consultant for references and check them out.


Many people also fail to consider the element of time when formulating their plan. Some backup devices are faster than others, and some recovery methods are faster than others. You need to consider that when making choices.

To formulate your backup plan, you need to determine the frequency of backups. The necessary frequency of backups should be determined by how quickly the important data on your system changes. On a home system, most files never change, a few change daily, and some change weekly. No elaborate strategy needs to be created to deal with that. A good strategy for home use is to back up (to any kind of removable media) critical data frequently and back up configuration and other files weekly.

At the enterprise level on a larger system with multiple users, a different approach is called for. Some critical data is changing constantly, and it could be expensive to re-create; this typically involves elaborate and expensive solutions. Most of us exist somewhere in between these extremes. Assess your system and its use to determine where you fall in this spectrum.

Backup schemes and hardware can be elaborate or simple, but they all require a workable plan and faithful follow-through. Even the best backup plan is useless if the process is not carried out, data is not verified, and data restoration is not practiced on a regular basis. Whatever backup scheme you choose, be sure to incorporate in it these three principles:

- Have a plan—Design a plan that is right for your needs and have equipment appropriate to the task. This involves assessing all the factors that affect the data you are backing up. We delve into more detail later in the chapter.

- Follow the plan—Faithfully complete each part of your backup strategy and then verify the data stored in the backups. Backups with corrupt data are of no use to anyone. Even backup operations can go wrong.

- Practice your skills—Practice restoring data from your backup systems from time to time, so when disaster strikes, you are ready (and able) to benefit from the strength of your backup plan. (For restoring data, see the section “Using Backup Software.”) Keep in mind that it is entirely possible that the flaws in your backup plan will become apparent only when you try restoring.


Sound Practices

You have to create your own best backup plan, but here are some building blocks that go into the foundation of any sound backup program:

- Maintain more than one copy of critical data.

- Label the backups.

- Store the backups in a climate-controlled and secure area.

- Use secure, offsite storage of critical data. Many companies choose bank vaults for their offsite storage, and this is highly recommended.

- Establish a backup policy that makes sense and can be followed religiously. Try to back up your data when the system is consistent (that is, no data is being written), which is usually overnight.

- Keep track of who has access to your backup media and keep the total number of people as low as possible. If you can, allow only trusted personnel near your backups.

- Routinely verify backups and practice restoring data from them.

- Routinely inspect backup media for defects and regularly replace them (after destroying the data on them if it is sensitive).


Evaluating Backup Strategies

Now that you are convinced you need backups, you need a strategy. It is difficult to be specific about an ideal strategy because each user or administrator’s strategy will be highly individualized, but here are a few general examples:

- Home user—At home, the user has the Ubuntu installation DVD that takes less than an hour to reinstall, so the time issue is not a major concern. The home user will want to back up any configuration files that have been altered, keep an archive of any files that have been downloaded, and keep an archive of any data files created while using any applications. Unless the home user has a special project in which constant backups are useful, a weekly backup is probably adequate. The home user will likely use a consumer-focused online cloud service such as Dropbox, an external hard drive, or other removable media for backups.

- Small office—Many small offices tend to use the same strategy as the home user but are more likely to back up critical data daily and use manually changed tape drives. If they have a tape drive with adequate storage, they will likely make a full system backup as well because restoring from the tape is quicker than reinstalling from the CDs. They also might use CD-RW or DVD writers for backups. Although they use scripts to automate backups, most of the process is probably done by hand. This category is also moving to online cloud services for backup as the technology becomes more mature and less expensive.

- Small enterprise—Here is where backups begin to require higher-end equipment such as autoloading tape drives with fully automated backups. Commercial backup software usually makes an introduction at this level, but a skillful system administrator on a budget can use one of the basic applications discussed in this chapter. Backups are highly structured and supervised by a dedicated system administrator. You might have guessed that small enterprises are also moving their backups to online cloud services.

- Large enterprise—These are the most likely setting for the use of expensive, proprietary, and highly automated backup solutions. At this level, data means money, lost data means lost money, and delays in restoring data mean money lost as well. These system administrators know that backups are necessary insurance and plan accordingly. Often, these organizations run their own online, distributed cloud systems, with multiple redundant data centers in geographically diverse locations.

Does all this mean that enterprise-level backups are better than those done by a home user? Not at all. The “little guy” with Ubuntu can do just as well as the enterprise operation, at the expense of investing more time in the process. By examining the higher-end strategies, we can apply useful concepts across the board.

This chapter focuses on local-level activities rather than on cloud-based backups, which build on the same techniques but add networking and provider-specific details. It also discusses some technologies that are a bit outdated for the enterprise but might be useful to the hobbyist with cheap and easy access to older equipment. If you want to use an online cloud service, take what you learn here, read everything made available by your cloud service provider, and then do your homework to design a suitable backup solution for your unique needs. That solution could be as simple as putting all your important files in a Dropbox-style cloud folder that automatically syncs to another computer you own; if you are a casual user backing up simple documents and a few legally owned media files, remember that such services, especially the free tiers, generally do not guarantee that your data is permanently backed up, so although we have not had problems with them, they are not enterprise backup solutions. Or the solution could be as detailed as studying up on Amazon Web Services, OpenStack, or other cloud providers and learning the fine details of their services to suit your needs.


Note

If you are a new system administrator, you might inherit an existing backup strategy. Take some time to examine it and see if it meets the current needs of the organization. Think about what backup protection your organization really needs and determine if the current strategy meets that need. If it does not, change the strategy. Consider whether the current policy is being practiced by the users, and, if not, why it is not.



Backup Levels

UNIX uses the concept of backup levels as a shorthand way of referring to how much data is backed up in relation to a previous backup. It works this way:

A level 0 backup is a full backup; the numbered levels above it are partial backups.

Backups at the higher numbered levels save everything that has changed since the most recent backup at a numerically lower level. (The dump command, for example, offers 10 different backup levels, 0 through 9.) For example, a level 3 followed by a level 4 produces an incremental backup containing only what changed since the level 3, whereas a level 4 followed by a level 3 produces a differential backup containing everything changed since the last backup below level 3 (typically the level 0 full backup).
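As a concrete illustration, a simple weekly rotation using dump might look like the following (a sketch only; it assumes the dump package is installed, an ext2/3/4 file system, and a tape drive at /dev/st0):

matthew@seymour:~$ sudo dump -0u -f /dev/st0 /    # Sunday: level 0, full backup
matthew@seymour:~$ sudo dump -1u -f /dev/st0 /    # Monday: changes since the level 0
matthew@seymour:~$ sudo dump -2u -f /dev/st0 /    # Tuesday: changes since the level 1

The -u option records each run in /etc/dumpdates so that later runs know which lower-level backup to measure changes against.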


The following sections examine a few of the many strategies in use today. Many strategies are based on these sample schemes; one of them can serve as a foundation for the strategy you construct for your own system.

Simple Strategy

If you need to back up just a few configuration files and some small data files, copy them to a USB stick, label it, and keep it someplace safe. You can copy the important files to a CD-RW disk (up to 700MB in size), or a DVD-RW disk (up to 8GB for data) if you still have that hardware. Most users have switched to using an external hard drive for backups because they are becoming less and less expensive and hold a great amount of data, or they have moved backups online.

In addition to configuration and data files, you should archive each user’s home directory and the entire /etc directory. Between the two, that backup would contain most of the important files for a small system. Then you can easily restore this data from the backup media device you have chosen after a complete reinstall of Ubuntu, if necessary.
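A minimal sketch of such a backup (assuming an external drive or USB stick mounted at /media/usb, a path used here only as an example) could be as simple as:

matthew@seymour:~$ sudo tar czvf /media/usb/simple-backup-$(date +%Y-%m-%d).tar.gz /etc /home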

Experts used to say that if you have more data than can fit on a floppy disk, you really need a formal backup strategy. Because a floppy disk only held a little over 1MB (and is now incredibly obsolete), perhaps we should change that to “if you have more data than can fit on a cheap USB stick.” In any case, some formal backup strategies are discussed in the following sections. We use a tape media backup as an example for convenience because they have been incredibly popular and standard in the past and are still quite common, even as people move to new options such as portable hard drives and cloud storage.

Full Backup on a Periodic Basis

This backup strategy involves a backup of the complete file system on a weekly, bi-weekly, or other periodic basis. The frequency of the backup depends on the amount of data being backed up, the frequency of changes to the data, and the cost of losing those changes.

This backup strategy is not complicated to perform, and it can be accomplished with the swappable disk drives discussed later in the chapter. If you are connected to a network, it is possible to mirror the data on another machine (preferably offsite); the rsync tool is particularly well suited to this task. Recognize that this does not address the need for archives of the recent state of files; it only presents a snapshot of the system at the time the update is done.
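As a rough sketch of this approach, mirroring /home to another machine over SSH might look like the following (backupserver and the destination path are placeholders, and SSH access to that host is assumed):

matthew@seymour:~$ rsync -av --delete /home/ backupserver:/srv/mirror/home/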

Full Backups with Incremental Backups

This scheme involves performing a full backup of the entire system once a week, along with a daily incremental backup of only those files that have changed in the previous day, and it begins to resemble what a system administrator of a medium-to-large system traditionally uses.

This backup scheme can be advanced in two ways. In one way, each incremental backup can be made with reference to the original full backup. In other words, a level 0 backup is followed by a series of level 1 backups. The benefit of this backup scheme is that a restoration requires only two tapes (the full backup and the most recent incremental backup). But because it references the full backup, each incremental backup might be large (and grow ever larger) on a heavily used system.

Alternatively, each incremental backup could reference the previous incremental backup. This is a level 0 backup followed by a level 1, followed by a level 2, and so on. Incremental backups are quicker (less data each time) but require every tape to restore a full system. Again, it is a classic trade-off decision.

Modern backup applications such as Amanda or BRU assist in organizing the process of managing complex backup schedules and tracking backup media. Doing it yourself using the classic dump or employing shell scripts to run tar requires that system administrators handle all the organization themselves. For this reason, complex backup situations are typically handled with commercial software and specialized hardware that are packaged, sold, and supported by vendors.

Mirroring Data or RAID Arrays

Given adequate (and often expensive) hardware resources, you can always mirror the data somewhere else, essentially maintaining a real-time copy of your data on hand. This is often a cheap, workable solution if no large amounts of data are involved. The use of redundant array of independent disks (RAID) arrays (in some of their incarnations) provides for a recovery if a disk fails.

Note that RAID arrays and mirroring systems just as happily write corrupt data as valid data. Moreover, if a file is deleted, a RAID array will not save it. RAID arrays are best suited for protecting the current state of a running system, not for backup needs.

Making the Choice

Only you can decide what is best for your situation. After reading about the backup options in this book, put together some sample backup plans; run through a few likely scenarios and assess the effectiveness of your choice.

In addition to all the other information you have learned about backup strategies, here are a couple good rules of thumb to remember when making your choice:

- If the backup strategy and policy are too complicated (and this holds true for most security issues), they will eventually be disregarded and fall into disuse.

- The best scheme is often a combination of strategies; use what works.

Choosing Backup Hardware and Media

Any device that can store data can be used to back it up, but that is like saying that anything with wheels can take you on a cross-country trip. Trying to fit 10GB worth of data on a big stack of CD-RWs is an exercise in frustration, and using an expensive automated tape device to save a single copy of an email is a waste of resources.

Many people use what hardware they already have for their backup operations. Many consumer-grade workstations have a DVD-RW drive, but they usually do not have the abundant free disk space necessary for performing and storing multiple full backups.

In this section, you find out about some of the most common backup hardware available and how to evaluate its appropriateness for your backup needs. With large storage devices becoming increasingly affordable (you can now get 2TB hard drives for around $100), decisions about backup hardware for the small business and home users have become more interesting.

Removable Storage Media

Choosing the right media isn’t as easy as it used to be when floppy drives were the only choice. Today, most machines have DVD-ROM drives that can read but not write DVDs, which rules them out for backup purposes. Instead, USB hard drives and solid-state “pen” drives have taken over the niche previously held by floppy drives. Both USB hard drives and solid-state drives are highly portable, support for them under Ubuntu is very good, and their capacities keep rising while prices continue to fall. A 1TB USB external hard drive is also within most budgets and is preferred for anything more than will fit on one pen drive. The biggest benefits of USB drives are data transfer speed and portability.

CD-RW and DVD+RW/-RW Drives

Compared to floppy drives and some removable drives, CD-RW drives and their cousins, DVD+RW/-RW drives, can store large amounts of data and are useful for a home or small business. CD writers and media that were once very common are cheap if you can find them but are gradually disappearing, and automated CD-changing machines, necessary for automatically backing up large amounts of data, are still quite costly, if you can find them at all. A benefit of CD and DVD storage over tape devices is that the archived uncompressed file system can be mounted and its files accessed randomly just like a hard drive (you do this when you create a data CD; see Chapter 6, “Multimedia Applications”), making the recovery of individual files easier.

Each CD-RW disk can hold 650MB to 700MB of data (the media comes in both capacities at roughly the same cost); larger chunks of data can be split to fit on multiple disks. Some backup programs support this method of storage. After they are burned and verified, the shelf life for the media is at least a decade or longer.

DVD+RW/-RW is similar to CD-RW, but it is more expensive and can store up to 8GB of uncompressed data per disk.

Honestly, though, these are an old technology, and although they haven’t completely died off, the use of either CDs or DVDs for backup has dropped off considerably. It won’t be long before they become almost as rare as floppy disks and drives.

Network Storage

For network backup storage, remote arrays of hard drives provide one solution to data storage. With the declining cost of mass storage devices and the increasing need for larger storage space, network storage (NAS or network-attached storage) is available and supported in Linux. These are cabinets full of hard drives and their associated controlling circuitry, as well as special software to manage all of it. These NAS systems are connected to the network and act as a huge (and expensive) mass storage device.

More modest and simple network storage can be done on a remote desktop-style machine that has adequate storage space (up to eight 1TB drives is a lot of storage space, easily accomplished with off-the-shelf parts), but then that machine (and the local system administrator) has to deal with all the problems of backing up, preserving, and restoring its own data, doesn’t it? Several hardware vendors offer such products in varying sizes.

Tape Drive Backup

Tape drives have been used in the computer industry from the beginning. Tape drive storage has been so prevalent in the industry that the tar command (the most commonly used command for archiving) is derived from the words tape archive. Capacities and durability of tapes vary from type to type and range from a few gigabytes to hundreds of gigabytes, with commensurate increases in cost for the equipment and media. Autoloading tape-drive systems can accommodate backup sets that exceed the capacity of a single tape.
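If you do work with a tape drive, the mt command (from the mt-st package) is used to control the device itself. A couple of typical invocations, assuming a SCSI tape drive at /dev/st0, look like this:

matthew@seymour:~$ mt -f /dev/st0 status
matthew@seymour:~$ mt -f /dev/st0 rewind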


Tip

Older tape equipment is often available in the used equipment market and might be useful for smaller operations that have outgrown more limited backup device options.


Tape equipment is well supported in Linux and, when properly maintained, is extremely reliable. The tapes themselves are inexpensive, given their storage capacity and the ability to reuse them. Be aware, however, that tapes do deteriorate over time and, being mechanical, tape drives can and will fail.


Caution

Neglecting to clean, align, and maintain tape drives puts your data at risk. The tapes themselves are also susceptible to mechanical wear and degradation. Hardware maintenance is part of a good backup policy. Do not ever forget that it is a question of when—not if—hardware will fail.


Cloud Storage

Services such as Amazon’s AWS and S3 or Dropbox offer a way to create and store backups offsite. Larger companies may create their own offsite, online storage options as well. In each of these and similar cases, data is copied and stored remotely on a file server set aside specifically for that purpose. The data backups may be scheduled with great flexibility and according to the plans and desires of the customer.

Cloud storage is a backup solution that is recent and growing in popularity, but it is also a technology that is changing rapidly. To learn more about the options mentioned here, take a look at www.dropbox.com/ and http://aws.amazon.com/s3/. Although these are not the only services of the kind available, they offer a good introduction to the concept. If you like to roll your own, you definitely want to take a look at Ubuntu Enterprise Cloud at www.ubuntu.com/cloud.

Using Backup Software

Because there are thousands of unique situations requiring as many unique backup solutions, it comes as no surprise that Linux offers many backup tools. Along with command-line tools such as tar and dd, Ubuntu also provides a graphical archiving tool for desktop installations called Déjà Dup that is quite powerful. Another excellent but more complicated alternative is Amanda—a sophisticated backup application that works well over network connections and can be configured to automatically back up all the computers on your network. Amanda works with drives as well as tapes.


Note

The software in a backup system must support the hardware, and this relationship can determine which hardware or software choices you make. Many system administrators choose particular backup software not because they prefer it to other options, but because it supports the hardware they own.

The price seems right for free backup tools, but consider the software’s ease of use and automation when assessing costs. If you must spend several hours implementing, debugging, documenting, and otherwise dealing with overly elaborate automation scripts, the real costs go up.


tar: The Most Basic Backup Tool

The tar tool, the bewhiskered old man of archiving utilities, is installed by default. It is an excellent tool for saving entire directories full of files. For example, here is the command used to back up the /etc directory:

matthew@seymour:~$ sudo tar cvf etc.tar /etc

Here, the options use tar to create an archive, be verbose in the message output, and use the filename etc.tar as the archive name for the contents of the directory /etc.

Alternatively, if the output of tar is sent to the standard output and redirected to a file, the command appears as follows:

matthew@seymour:~$ sudo tar cvf - /etc > etc.tar

The result is the same.

All files in the /etc directory will be saved to a file named etc.tar. With an impressive array of options (see the man page), tar is quite flexible and powerful in combination with shell scripts. With the -z option, it can even create and restore gzip compressed archives, and the -j option works with bzipped files.

Creating Full and Incremental Backups with tar

If you want to create a full backup, the following creates a bzip2 compressed tarball (the j option) of the entire system:

matthew@seymour:~$ sudo tar cjvf fullbackup.tar.bz2 /

To perform an incremental backup, you must locate all the files that have been changed since the last backup. For simplicity, assume that you do incremental backups on a daily basis. To locate the files, use the find command:

matthew@seymour:~$ sudo find / -newer name_of_last_backup_file ! -type d -print

When run alone, find generates a list of files system-wide and prints it to the screen. The ! -type d test eliminates directories from the list; otherwise, entire directories would be handed to tar, and their full contents would be archived even if not all of the files had changed.

Pipe the output of our find command to tar as follows:

matthew@seymour:~$ sudo find / -newer name_of_last_backup_file ! -type d -print | \
tar czTf - backup_file_name_or_device_name

Here, the T option tells tar to read the list of filenames to archive from a file, and the - argument means standard input (the output of the find command); the f option names the archive file or device to write.
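GNU tar can also track incremental backups on its own with the --listed-incremental option, which records state in a snapshot file between runs. A rough sketch (the snapshot and archive paths here are only examples) looks like this:

matthew@seymour:~$ sudo tar --create --gzip --listed-incremental=/var/backups/home.snar \
--file=/media/externaldrive/home-incremental.tar.gz /home

Running the same command again later archives only the files that have changed since the run recorded in the snapshot file.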


Note

The tar command can back up to a raw device (one with no file system) and to a formatted partition. For example

matthew@seymour:~$ sudo tar cvzf /dev/hdd /boot /etc /home

backs up those directories to device /dev/hdd (not to a partition such as /dev/hdd1, but to the unformatted device itself).

The tar command can also back up over multiple floppy disks:

matthew@seymour:~$ sudo tar cvMf /dev/fd0 /home

This backs up the contents of /home and spreads the file out over multiple floppies, prompting you with this message:

Prepare volume #2 for '/dev/fd0' and hit return:


Restoring Files from an Archive with tar

The xp option in tar restores the files from a backup and preserves the file attributes, as well, and tar creates any subdirectories it needs. Be careful when using this option because the backups might have been created with either relative or absolute paths. You should use the tvf option with tar to list the files in the archive before extracting them so that you know where they will be placed.

For example, to restore a tar archive compressed with bzip2, use the following:

matthew@seymour:~$ sudo tar xjvf ubuntutest.tar.bz2

To list the contents of a tar archive compressed with bzip2, use this:

matthew@seymour:~$ sudo tar tjvf ubuntutest.tar.bz2
tar: Record size = 8 blocks

drwxr-xr-x matthew/matthew 0 2013-07-08 14:58 other/

-rwxr-xr-x matthew/matthew 1856 2013-04-29 14:37 other/matthew helmke public.asc

-rwxr-xr-x matthew/matthew 170 2013-05-28 18:11 backup.sh

-rwxr-xr-x matthew/matthew 1593 2013-10-11 10:38 backup method

Note that because the pathnames do not start with a slash (/), they are relative pathnames and will be extracted into your current working directory. If they were absolute pathnames, they would be extracted exactly where the paths state.
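If you want the files extracted somewhere other than the current directory, the -C option tells tar to change to the named directory first. A short sketch, reusing the archive from the previous example and a scratch directory chosen only for illustration:

matthew@seymour:~$ mkdir /tmp/restore
matthew@seymour:~$ sudo tar xjvf ubuntutest.tar.bz2 -C /tmp/restore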

The GNOME File Roller

The GNOME desktop file archiving graphical application File Roller (file-roller) views, extracts, and creates archive files using tar, gzip, bzip, compress, zip, rar, lha, and several other compression formats. Note that File Roller is only a front end to the command-line utilities that actually provide these compression formats; if a format is not installed, File Roller cannot use that format.


Caution

File Roller does not complain if you select a compression format that is not supported by installed software until after you attempt to create the archive. So, install any needed compression utilities first.


File Roller is well-integrated with the GNOME desktop environment to provide convenient drag-and-drop functionality with the Nautilus file manager. To create a new archive, select Archive, New to open the New Archive dialog box and navigate to the directory where you want the archive to be kept. Type your archive’s name in the Selection: /root text box at the bottom of the New Archive dialog box. Use the Archive type drop-down menu to select a compression method. Now, drag the files that you want to be included from Nautilus into the empty space of the File Roller window, and the animated icons will show that files are being included in the new archive. When you have finished, a list of files appears in the previously blank File Roller window. To save the archive, select Archive, Close. Opening an archive is as easy as using the Archive, Open dialog to select the appropriate archive file. You can learn more at https://help.ubuntu.com/community/File%20Roller.

Ubuntu also offers the KDE ark and kdat GUI tools for backups; they are installed only if you select the KDE desktop during installation, but you can search through Synaptic to find them. Archiving has traditionally been a function of the system administrator and not seen as a task for the individual user, so no elaborate GUI was believed necessary. Backing up has also been seen as a script-driven, automated task in which a GUI is not as useful. Although that’s true for system administrators, home users usually want something a little more attractive and easier to use, and that’s the gap ark was created to fill.

The KDE ark Archiving Tool

You launch ark from the command line. It is integrated with the KDE desktop (just as File Roller is with GNOME), so it might be a better choice if you use KDE. This application provides a graphical interface for viewing, creating, adding to, and extracting from archived files. Several configuration options are available with ark to ensure its compatibility with Microsoft Windows. You can drag and drop from the KDE desktop or Konqueror file browser to add or extract files, or you can use the ark menus.

As long as the associated command-line programs are installed, ark can work with tar, gzip, bzip2, zip, and lha files (the last four being compression methods used to save space by compaction of the archived files).

Existing archives are opened after launching the application itself. You can add files and directories to the archive or delete them from the archive. After opening the archive, you can extract all of its contents or individual files. You can also perform searches using patterns (all *.jpg files, for example) to select files.

Choosing New from the File menu creates new archives. You then type the name of the archive, providing the appropriate extension (.tar, .gz, and so on), and then proceed to add files and directories as you desire.

Déjà Dup

Déjà Dup is a simple backup tool with a useful GUI. It supports local, remote, or cloud backups. It can encrypt and compress your data for secure and fast transfers and more. Search the Dash to find Déjà Dup (Figure 17.1).


FIGURE 17.1 The Déjà Dup icon is easy to find in the Dash.

Start with Déjà Dup Backup Preferences. This menu item brings up a configuration wizard that lets you set where the backup will be stored, what will be backed up, a schedule for automatic backups, and more. The following screenshots tell the story (see Figures 17.2-17.4).


FIGURE 17.2 Storage options include cloud storage, local folders, FTP, SSH, and more.


FIGURE 17.3 Set the files and folders you want to back up or exclude.


FIGURE 17.4 Set a schedule for automatic backups.

Click Close when you are done with configuration. When you are ready to run Déjà Dup at a later time, click the same icon in the Dash. This brings up the same interface showing the Overview tab. Click Restore to restore from a previous backup. Click Back Up Now to back up using the settings enabled earlier (see Figure 17.5).


FIGURE 17.5 Backing up and restoring is easy from the Déjà Dup Overview.

Back In Time

Back In Time is a viable alternative to Déjà Dup for many users. It is easily available from the Ubuntu Software Center, is stable, has a clear and easy-to-understand interface, and is actually little more than a GUI front end for well-established tools.

Back In Time uses rsync, diff, and cp to monitor, create, and manipulate files, and it uses cron to schedule when it will run. Using these command-line tools is described later in this chapter. Back In Time is essentially a well-designed GUI front end written for GNOME that also works well with Ubuntu’s Unity interface. Back In Time also offers a separate package in the Ubuntu Software Center with a front end for KDE, if that is your preference. If you use the standard Ubuntu interface, install a package from the Ubuntu Software Center called nautilus-actions to get context menu access to some of the backup features.

The first time you run Back In Time, it takes a snapshot of your drive. This takes a long time, depending on the amount of data you have. You designate which files and directories to back up and where to back them up. Then set when to schedule the backup. The program takes care of the rest.

To restore, select the most recent snapshot from the list at the left of the screen (see Figure 17.6). Then browse through the list of directories and files at the right until you find the file that interests you. You may right-click the file to view a pop-up menu, from which you may open a file, copy a file to a desired location, or view the various snapshots of a file and compare them to determine which you might want to restore.


FIGURE 17.6 Back In Time makes finding and restoring files easy.

Back In Time keeps multiple logs of actions and activities, file changes and versions, and is a useful tool. It does have one main weakness: For the moment, it remains unable to schedule backups to be made over a network. If you are backing up to a second hard drive or if you always have a network drive mounted on your system, this is not an issue. However, if you want to back up to cloud storage or via ssh, this tool might not suit your needs.

You can find the official documentation for Back In Time at http://backintime.le-web.org.

Unison

Unison is a file-synchronization tool that works on multiple platforms, including Linux, other flavors of UNIX such as Solaris and Mac OS X, and Windows. After Unison is set up, it synchronizes files in both directions and across platforms. If changes are made on both ends, files are updated in both directions. When file conflicts arise, such as when the same file was modified on each system, the user is prompted to decide what to do. Unison can connect across a network using many protocols, including ssh. It can connect with and synchronize many systems at the same time and even to the cloud.
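Once Unison is installed on both machines, a minimal invocation can be as simple as the following (a sketch; remote_host and the paths are placeholders, and working SSH access is assumed):

matthew@seymour:~$ unison ~/Documents ssh://remote_host//home/matthew/Documents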

Unison was developed at the University of Pennsylvania as an academic research project. It is no longer under active development as a research project, but it does appear to continue to be maintained with bug fixes and very occasional feature additions. The original developers claim to still be using it daily, so it is not completely abandoned.

Unison is powerful and configurable. The foundation is based on rsync, but with some additions that enable functionality that is generally only available from a version control system. See Chapter 39, “Opportunistic Development,” for an introduction to these, as well as a brief mention later in this chapter in the context of backing up configuration files.

Even though the project is no longer the primary focus of any of the developers, many people still use Unison, and it still gets press, as in a recent Linux Journal article. For that reason, it gets a mention in this chapter and might be worthy of your time and effort if you are interested. Unison is released under the free GPL license, so you might decide you want to dig in to the code. The developers do not have time to maintain it regularly but welcome patches and contributions. If this sounds like a project that interests you, see www.cis.upenn.edu/~bcpierce/unison/.

Using the Amanda Backup Application

Amanda is a powerful network backup application created by the University of Maryland at College Park. Amanda is a robust backup and restore application best suited to unattended backups with an autoloading tape drive of adequate capacity. It benefits from good user support and documentation.

Amanda’s features include compression and encryption. It is intended for use with high-capacity tape drives, floptical, CD-R, and CD-RW devices.

Amanda uses GNU tar and dump; it is intended for unattended, automated tape backups and is not well suited for interactive or ad hoc backups. The support for tape devices in Amanda is robust, and file restoration is relatively simple. Although Amanda does not support older Macintosh clients, it uses Samba to back up Microsoft Windows clients, as well as any UNIX client that can use GNU tools (which includes Mac OS X). Because Amanda runs on top of standard GNU tools, file restoration can be made using those tools on a recovery disk even if the Amanda server is not available. File compression can be done on either the client or server, thus lightening the computational load on less-powerful machines that need to be backed up.


Caution

Amanda does not support dump images larger than a single tape and requires a new tape for each run. If you forget to change a tape, Amanda continues to attempt backups until you insert a new tape, but those backups will not capture the data as you intended them to. Do not use too small a tape or forget to change a tape; otherwise, you will not be happy with the results.


There is no GUI for Amanda. Configuration is done in the time-honored UNIX tradition of editing text configuration files located in /etc/amanda. The default installation in Ubuntu includes a sample cron file because it is expected that you will be using cron to run Amanda regularly. The client utilities are installed with the package amanda-client; the Amanda server is called amanda-server. Install both. As far as backup schemes are concerned, Amanda calculates an optimal scheme on-the-fly and schedules it accordingly. It can be forced to adhere to a traditional scheme, but other tools are possibly better suited for that job.
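For example, a weeknight run from cron might use an entry like the following (a sketch only; DailySet1 is a placeholder configuration name, and Amanda is normally run as its own backup user rather than as root):

0 2 * * 1-5 /usr/sbin/amdump DailySet1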

The man page for Amanda (the client is amdump) is well written and useful, explaining both the configuration of Amanda and detailing the several programs that actually make up Amanda. The configuration files found in /etc/amanda are well commented; they provide a number of examples to assist you in configuration.

The program’s home page is www.amanda.org. There you can find information about subscribing to the mail list and links to Amanda-related projects and a FAQ.

Alternative Backup Software

Commercial and other freeware backup products do exist; BRU and Veritas are good examples of effective commercial backup products. Here are some useful free software backup tools that are not installed with Ubuntu:

- flexbackup—This backup tool is a Perl-based wrapper that makes dump and restore easier to use. flexbackup’s command syntax can be found by running the command with the -help argument. It also can use afio, cpio, and tar to create and restore archives locally or over a network using rsh or ssh if security is a concern. Its home page is www.edwinh.org/flexbackup/. Note that it has not received any updates or changes in a very long time.

- afio—This tool creates cpio-formatted archives but handles input data corruption better than cpio (which does not handle data input corruption very well at all). It supports multi-volume archives during interactive operation and can make compressed archives. If you feel the need to use cpio, you might want to check out afio at http://freshmeat.net/projects/afio/.

Many other alternative backup tools exist, but covering all of them is beyond the scope of this book. Two good places to look for free backup software are Freshmeat (www.freshmeat.net) and Google (www.google.com/linux).

Copying Files

Often, when you have only a few files that you need to protect from loss or corruption, it might make better sense to simply copy the individual files to another storage medium rather than to create an archive of them. You can use the tar, cp, rsync, or even the cpio commands to do this; you can also use a handy file management tool known as mc. Using tar is the traditional choice because older versions of cp did not handle symbolic links and permissions well at times, causing those attributes (characteristics of the file) to be lost; tar handled those file attributes in a better manner. cp has been improved to fix those problems, but tar is still more widely used. rsync has recently been added to Ubuntu and is an excellent choice for mirroring sets of files, especially when done over a network.

To illustrate how to use file copying as a backup technique, the examples here show how to copy (not archive) a directory tree. This tree includes symbolic links and files that have special file permissions we need to keep intact.

Copying Files Using tar

One choice for copying files into another location is to use the tar command; you create a tar archive that is piped to a second tar command, which unpacks it in the new location. To accomplish this, first change to the source directory. Then the entire command resembles this:

matthew@seymour:~$ tar -cvf - files | (cd target_directory ; tar -xpf -)

In this command, files are the filenames you want to include; use * to include the entire current directory.

Here is how this command works: You have already changed to the source directory and executed tar with the cvf arguments that tell tar to

- c—Create an archive.

- v—Verbose; lists the files processed so you can see that it is working.

- f—The filename of the archive will be what follows. (In this case, it is -.)

The following tar options can be useful for creating file copies for backup purposes:

- l—Stay in the local file system (so that you do not include remote volumes); in newer versions of GNU tar, the long option --one-file-system does this.

- --atime-preserve—Do not change access times on files, even though you are accessing them now (to preserve the old access information for archival purposes).

The contents of the tar archive (written to standard output, which the - represents) are then piped to the second expression, which extracts the files to the target directory. In shell programming (refer to Chapter 14, “Automating Tasks and Shell Scripting”), enclosing an expression in parentheses causes it to run in a subshell.

First you change to the target directory, and then

- x—Extract files from a tar archive.

- p—Preserve permissions.

- f—Read the archive from the file that follows; here that is -, meaning standard input (the data arriving through the pipe).

Compressing, Encrypting, and Sending tar Streams

The file copy techniques using the tar command in the previous section can also be used to quickly and securely copy a directory structure across a LAN or the Internet (using the ssh command). One way to make use of these techniques is to use the following command line to first compress the contents of a designated directory and then decompress the compressed and encrypted archive stream into a designated directory on a remote host:

matthew@seymour:~$ tar -cvzf - data_folder | ssh remote_host '( cd ~/mybackup_dir; tar -xvzf - )'

The tar command is used to create and compress an archive of the files in the directory named data_folder. The output is piped through the ssh (Secure Shell) command and sent to the remote computer named remote_host. On the remote computer, the stream is then extracted and saved in the directory named ~/mybackup_dir. You are prompted for a password to send the stream.

Copying Files Using cp

To copy files, we could use the cp command. The general format of the command when used for simple copying is as follows:

matthew@seymour:~$ cp -a source_directory target_directory

The -a argument is the same as giving -dpR, which would be

- -d—Preserves symbolic links (by not dereferencing them), so the links themselves are copied rather than the files they point to.

- -p—Preserves all file attributes if possible. (File ownership might interfere.)

- -R—Copies directories recursively.

The cp command can also be used to quickly replicate directories and retain permissions by using the -avR command-line options. Using these options preserves file and directory permissions, gives verbose output, and recursively copies and re-creates subdirectories. You can also create a log of the backup during the backup by redirecting the standard output like this:

matthew@seymour:~$ sudo cp -avR directory_to_backup destination_vol_or_dir 1> /root/backup_log.txt

or

matthew@seymour:~$ sudo cp -avR ubuntu /test2 1> /root/backup_log.txt

This example makes an exact copy of the directory named ubuntu in the volume (directory) named /test2 and saves a backup report named backup_log.txt under /root.

Copying Files Using mc

The Midnight Commander (available in the Universe repository, under the package mc; see Chapter 9, “Managing Software,” for how to enable the Universe and Multiverse repositories) is a command-line file manager that is useful for copying, moving, and archiving files and directories. The Midnight Commander has a look and feel similar to the Norton Commander of DOS fame. Executing mc at a shell prompt displays a dual-pane view of the files. It contains drop-down menu choices and function keys to manipulate files. It also uses its own virtual file system, enabling it to mount FTP directories and display the contents of tar files, gzipped tar files (.tar.gz or .tgz), bzip2 files, DEB files, and RPM files, as well as extract individual files from them. As if that is not enough, mc contains a File Undelete virtual file system for ext2/3 partitions. By using cd to “change directories” to an FTP server’s URL, you can transfer files using FTP. The default font chosen for Ubuntu makes the display of mc ugly when used in a tty console (as opposed to an xterm), but this does not affect its performance.

In the interface, pressing the F9 key drops down the menu, and pressing F1 displays the Help file. A “feature” in the default GNOME terminal intercepts the F10 key used to exit mc, so use F9 instead to access the menu item to quit, or just click the menu bar at the bottom with your mouse. The configuration files are well documented, and it would appear easy to extend the functionality of mc for your system if you understand shell scripting and regular expressions. It is an excellent choice for file management on servers not running X.

Using rsync

An old favorite for backing up is rsync. One big reason for this is that rsync copies only those files that have changed since the last backup. So although the initial backup might take a long time, subsequent backups are much faster. It is also highly configurable and can be used with removable media such as USB hard drives or over a network. Here is one way to use rsync.

First, create an empty file and call it backup.sh:

matthew@seymour:~$ sudo touch backup.sh

Then, using your favorite text editor, enter the following command into the file and save it:

rsync --force --ignore-errors --delete --delete-excluded \
--exclude-from=/home/matthew-exclude.txt --backup \
--backup-dir=$(date +%Y-%m-%d) -av / /media/externaldrive/backup/seymour

Make the file executable:

matthew@seymour:~$ sudo chmod +x backup.sh

This command uses several options with rsync and puts them in a script that is quick and easy to remember and run. You can run the script at the command line using sudo sh ./backup.sh or as an automated cron job.

Here is a rundown of what is going on in the command. Basically, rsync is told to copy all new and changed files (what to back up) and delete from any existing backup any files that have been deleted on the source (and back them up in a special directory, just to be safe). It is told where to place the backup copy and is given details on how to deal with specific issues in the process. Read the rsync man page for more options and to customize to your needs.

Following are the options used here:

- --force—Forces deletion of directories in the target location that are deleted in the source, even if the directories in the destination are not empty.

- --ignore-errors—Tells --delete to go ahead and delete files even when there are I/O errors.

- --delete—Deletes extraneous files from destination directories.

- --delete-excluded—Also deletes excluded files from destination directories.

- --exclude-from=/home/matthew-exclude.txt—Prevents backing up files or directories listed in this file. (It is a simple list with each excluded directory on its own line.)

- --backup—Creates backups of files before deleting them from a currently existing backup.

- --backup-dir=$(date +%Y-%m-%d)—Creates a backup directory for the previously mentioned files that looks like this: 2013-07-08. Why this format for the date? Because it is standard, as outlined in ISO 8601 (see www.iso.org/iso/home/standards/iso8601.htm). It is clear, works with scripts, and sorts beautifully, making your files easy to find.

- -av—Tells rsync to use archive mode and verbose mode.

- /—Denotes the directory to back up. In this case, it is the root directory of the source, so everything in the file system is being backed up. You could put /home here to back up all user directories or make a nice list of directories to exclude in the file system.

- /media/externaldrive/backup/seymour—Sets the destination for the backup as the backup/seymour directory on an external hard drive mounted at /media/externaldrive.

To restore from this backup to the same original location, you reverse some of the details and may omit others. Something like this works nicely:

matthew@seymour:~$ rsync -av --force --ignore-errors --delete --delete-excluded \
/media/externaldrive/backup/seymour/ /

This becomes even more useful when you think of ways to script its use. You could create an entry in crontab, as described in Chapter 14, “Automating Tasks and Shell Scripting.” Even better, you could set two computers to allow for remote SSH connections using private keys created with ssh-keygen, as described in Chapter 19, “Remote Access with SSH, Telnet, and VNC,” so that one could back up the files from one computer to the other computer without needing to login manually. Then you could place that in an automated script.
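For instance, to run the script every night at 1:00 a.m., an entry like the following could be added to root's crontab with sudo crontab -e (the schedule and path are only examples):

0 1 * * * /home/matthew/backup.sh

Inside the script, pointing rsync at a remote destination over SSH is then just a matter of replacing the local target with something like matthew@backuphost:/srv/backups/seymour/, where backuphost is a hypothetical machine that accepts your key-based SSH login.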

Version Control for Configuration Files

For safety and ease of recovery when configuration files are corrupted or incorrectly edited, the use of a version control system is recommended. In fact, this is considered an industry best practice. Many top-quality version control systems are available, such as Git, Subversion, Mercurial, and Bazaar. If you already have a favorite, perhaps one that you use for code projects, you can do what we describe in this section using that version control system. The suggestions here are to get people thinking about the idea of using version control for configuration files and to introduce a few well-used and documented options for those who are unfamiliar with version control. First, some background.

Version control systems are designed to make it easy to revert changes made to a file, even after the file has been saved. Each system does this a little differently, but the basic idea is that not only is the current version of the file saved, but every version that existed previously is saved as well. Some version control systems do this by saving the entire file every time; others store just the differences between each version. In either case, it is possible to roll back to a previous version of the file, restoring it to the state it was in before changes were made. Developers who write software are well aware of the power and benefit of being able to do this quickly and easily; the person editing the file no longer has to remember where, what, or even how the file was changed. When a problem occurs, the file is simply restored to its previous state. The version control system can also show the user where and how each file changed at each save.

Using a version control system for configuration files means that every time a configuration is changed, those changes are recorded and tracked. This makes it easy to detect intruders (for example, an unauthorized person changing the settings for Apache so that a rogue web service or site can run on your server), easy to recover from errors and glitches, and easy to discover new features or settings that a software upgrade has enabled or added to the configuration.

Many older and well-known tools do this task; changetrack is a good example. All of them aim to make tracking changes to configuration files easier and faster, but with the advances in version control systems, most now provide very little extra benefit. Instead of any of these tools, you are probably better off learning a good, modern version control system. One exception is worth a bit of discussion because of its ability to work with your software package manager, which saves you the task of remembering to commit changes to your version control system each time the package manager runs. This exception is etckeeper.

etckeeper takes all of your /etc directory and stores the configuration files from it in a version control system repository. You can configure the program by editing the etckeeper.conf file to store data in a Git, Mercurial, Bazaar, or Subversion repository. In addition, etckeeper connects automatically to the APT package management tool used by Ubuntu and automatically commits changes made to /etc and the files in it during normal software package upgrades. Other package managers, such as Yum, can also be tracked when using other Linux distributions such as Fedora. It even tracks file metadata that is often not easily tracked by version control systems, like the permissions in /etc/shadow.


Caution

Using any version control system to track files that contain sensitive data such as passwords can be a security risk. Tracked files and the version control system itself should be treated with the same level of care as the sensitive data itself.


By default, etckeeper uses Git. On Ubuntu, this is changed to Bazaar (bzr) because it is the version control system used by Ubuntu developers. Because this is configurable, we mention just the steps here and leave it to you to adapt them for your particular favorite version control system.

First, install etckeeper from the Ubuntu repositories. Then edit /etc/etckeeper/etckeeper.conf to select your desired settings, such as the version control system to use, the system package manager in use, and whether to commit changes automatically each day. Finally, initialize the repository from the command line:

matthew@seymour:~$ etckeeper init
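
For reference, the configuration choices mentioned earlier correspond to variables in etckeeper.conf. The lines below are a sketch based on a typical copy of that file; check the comments in your installed version, because the exact names and defaults can vary between releases:

VCS="bzr"
HIGHLEVEL_PACKAGE_MANAGER=apt
AVOID_DAILY_AUTOCOMMITS=1

Here, VCS selects the version control system, HIGHLEVEL_PACKAGE_MANAGER names the package manager whose actions trigger automatic commits, and setting AVOID_DAILY_AUTOCOMMITS to 1 turns off the daily automatic commit.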

If you are only going to use etckeeper to track changes made to /etc when software updates are made using APT, you do not need to do anything else. If you edit files by hand, make sure you use your version control system’s commands to commit those changes or use the following:

matthew@seymour:~$ etckeeper commit "Changed prompt style"

The message in quotes should reflect the change just made. This makes reading logs and finding exact changes much easier later.

Recovering or reverting file changes is then done using your version control system directly. Suppose, for example, that you have made a change in /etc/bash.bashrc, the file that sets the defaults for your bash shell. You read somewhere how to change the prompt and did not like the result. However, because the changes are being tracked, you can roll it back to the previous version. Because bzr is the default for etckeeper in Ubuntu, here is how you do that with bzr. First, check the log to find the commit number for the previous change:

matthew@seymour:~$ bzr log /etc/bash.bashrc
------------------------------------------------------------
revno: 2
committer: matthew <matthew@seymour>
branch nick: seymour etc repository
timestamp: Tue 2013-07-16 11:08:22 -0700
message:
Changed /etc/bash.bashrc
------------------------------------------------------------
revno: 1
committer: matthew <matthew@seymour>
branch nick: seymour etc repository
timestamp: Tue 2013-07-16 11:00:16 -0700
message:
Changed /etc/bash.bashrc
------------------------------------------------------------

I know the unwanted change was made in the most recent revision, denoted revno 2 (for revision number two), so I now revert the file to revision 1, its state before that change:

matthew@seymour:~$ bzr revert --revision 1 /etc/bash.bashrc
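
After the revert, the working copy of the file differs from the last committed version, so a reasonable next step is to check the status and commit the reversion so that the repository matches what is on disk. This is a sketch rather than part of the original example, and depending on the permissions of /etc on your system, these commands may need sudo:

matthew@seymour:~$ bzr status /etc
matthew@seymour:~$ etckeeper commit "Reverted bash.bashrc prompt change"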

Today it is common for programmers, system administrators, and other developer types to back up their dotfiles using version control. Dotfiles are the configuration files and directories in a user’s home directory, all of which begin with a dot, like .bashrc. They are not always included by backup software, and because they are often customized by highly technical people to suit their needs, backing them up is a good idea. Version control systems are commonly used for this. A new program for Ubuntu called dotdee performs a similar task for a different type of configuration file or directory, one that ends with .d and is stored in /etc. You can find more information about dotdee in Chapter 9, “Managing Software.”
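
As a minimal sketch of the idea, assuming Git is installed and that the file names here are just examples, a dotfiles repository can be started directly in the home directory like this:

matthew@seymour:~$ cd ~
matthew@seymour:~$ git init
matthew@seymour:~$ git add .bashrc .profile .vimrc
matthew@seymour:~$ git commit -m "Initial import of my dotfiles"

From then on, each edit to a tracked dotfile can be committed the same way, and the repository can be pushed to another machine or a hosting service for safekeeping.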

System Rescue

There will come a time when you need to engage in system rescue efforts. This need arises when the system will not even start Linux so that you can recover any files. This problem is most frequently associated with the boot loader program or partition table, but it could be that critical system files have been inadvertently deleted or corrupted. If you have been making backups properly, these kinds of system failures are easily, though not quickly, recoverable through a full restore. Still, valuable data created since the last scheduled backup might not be in the archives, or the backup archives might turn out to be corrupt, incomplete, or missing. A full restore also takes time you might not have. If the problem causing the system failure is simply a damaged boot loader, a damaged partition table, a missing library, or a misconfiguration, a quick fix can get the system up and running, and the data can then be easily retrieved.

In this section, you learn a couple of quick things to try to restore a broken boot loader or recover your data when your system fails to boot.

The Ubuntu Rescue Disc

The Ubuntu installation DVD works quite well as a live DVD. To use it, insert the disc and reboot the computer, booting from the DVD just as you did when you installed Ubuntu originally and ran it from the DVD.

Restoring the GRUB2 Boot Loader

The easiest way to restore a broken system’s GRUB2 files is simply to replace them. This works only with Ubuntu 9.10 or later, and only if the system uses GRUB2 (which it does if it was originally installed with 9.10 or later, but might not if Ubuntu was originally installed from an older release and then upgraded, because GRUB2 is not automatically installed during release upgrades). In any case, your best bet is to use a DVD from the same release as the one installed on the hard drive.

To get started, boot using the live DVD and open a terminal from the Dash. Then determine which of the hard drive’s partitions holds the Ubuntu installation, which you can discover using the following:

matthew@seymour:~$ sudo fdisk -l

You may find this block ID command useful, as it tends to return a bit more information:

matthew@seymour:~$ sudo blkid

Unless you customized your installation (in which case you probably already know your partitioning scheme and the location of your Ubuntu installation), the installation will probably be on the first partition of a drive called sda, which you can mount now using this:

matthew@seymour:~$ sudo mount /dev/sda1 /mnt

This mounts the partition into the current file system (running from the live DVD) at /mnt, where it is accessible to you for reading and modification as needed. Next, you reinstall GRUB2 on this device:

matthew@seymour:~$ sudo grub-install --boot-directory=/mnt/boot /dev/sda

At this point, reboot (using your hard drive and not the live DVD), and all should be well. After the reboot is complete, enter the following:

matthew@seymour:~$ sudo update-grub

This refreshes the GRUB2 menu and completes the restoration. You can find a lot of great information about GRUB2 at https://help.ubuntu.com/community/Grub2.

Saving Files from a Nonbooting Hard Drive

If restoring the GRUB2 boot loader fails and you still cannot boot from the hard drive, try to use the live DVD to recover your data. Boot and mount the hard drive as shown previously and then attach an external storage device such as a USB thumb drive or an external hard drive. Then copy the files you want to save from the mounted drive to the external drive.
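
As a minimal sketch of that copy step, assuming the broken installation is already mounted at /mnt and that the live session has mounted your USB drive at /media/usbdrive (the actual mount point will vary), you could save a home directory like this:

matthew@seymour:~$ sudo mkdir -p /media/usbdrive/rescued
matthew@seymour:~$ sudo rsync -av /mnt/home/matthew/ /media/usbdrive/rescued/

Using rsync rather than cp means that an interrupted copy can be restarted without recopying files that have already transferred.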

If you cannot mount the drive at all, your options become more limited and possibly more expensive. It is likely that either the hardware has failed or the file system has become badly corrupted. In either case, recovery is either impossible or more difficult and best left to experts if the data is important to you. But, the good news is that you have been making regular backups, right? So, you probably only lost a day or maybe a week of work and can buy a new drive, install it, and start from scratch, putting the data from your backup on your new Ubuntu installation on the new hardware.

Every experienced system administrator has had this happen because no hardware is infallible. We expect occasional hardware failures, and that’s why we have good backup and recovery schemes in place for data. There are two types of system administrators: those who lose data when this happens and those who have good schemes in place. Be forewarned and be wise.

If you did not have a backup, which happens to most system administrators only once in their lives (because they learn from the mistake), immediately stop messing with the hard drive. Your best bet for recovering the data will be expensive: look for a company that specializes in data recovery and pay it to do the work. If the data is not worth that expense and you want to try to recover it yourself, you can, but this is not a task for the faint of heart, and more often than not the data is simply lost. Again, the moral of the story is to back up regularly, check your backups to be sure they are valid, and repeat. Practice restoring from backups before you need to do it, perhaps on a test system that is not vital and will not hurt anything if you make a mistake.

References

Image https://help.ubuntu.com/community/BackupYourSystem—An excellent place to start for learning and examining backup methods in Ubuntu.

Image www.tldp.org/—The Linux Documentation Project offers several useful HOWTO documents that discuss backups and disk recovery.