Ubuntu Unleashed 2017 Edition (2017)
Part III: System Administration
Chapter 21. Performance Tuning
In This Chapter
Hard Disk
Kernel
Apache
MySQL
References
Squeezing extra performance out of your hardware might sound like a pointless task given how cheap commodity upgrades are today. To a certain degree that is true; for most of us, it is cheaper to buy a new computer than to spend hours fighting to get a 5 percent speed boost. But what if the speed boost were 20 percent? How about if it were 50 percent?
The amount of benefit you can get by optimizing your system varies depending on what kinds of tasks you are running, but there is something for everyone. Over the next few pages we look at quick ways to optimize the hard disk and file system, the kernel, the Apache web server, the MySQL database server, and more.
Before we start, you need to understand that optimization is not an absolute term: If we optimize a system, we have improved its performance, but it might still be possible to improve it further. We are not interested in getting 99.999 percent performance out of a system because optimization suffers from the law of diminishing returns: The basic changes make the biggest differences, but after that it takes increasing amounts of work to obtain decreasing speed improvements.
Hard Disk
Many Linux users love to tinker under the hood to increase the performance of their computers, and Linux gives you some great tools to do that. Whereas stability-loving nurturers generally tell us, “Don’t fix what’s not broken,” experiment-loving hot-rodders often say, “Fix it until it breaks.” In this section, you learn about many of the commands used to tune, or “tweak,” your file system.
Before you undertake any under-the-hood work with Linux, however, keep a few points in mind. First, perform a benchmark on your system before you begin. Linux does not offer a well-developed benchmarking application, and availability of what exists changes rapidly. You can search online for the most up-to-date information for benchmarking applications for Linux. If you are a system administrator, you might choose to create your own benchmarking tests. Second, tweak only one thing at a time so you can tell what works, what does not work, and what breaks things. Some of these tweaks might not work or might lock up your machine, but if you are only implementing them one at a time, you will find it much easier to reverse a change that caused a problem.
Always have a working boot disc handy, such as the live Ubuntu CD or DVD. Remember that you are personally assuming all risks for attempting any of these tweaks. If you don’t understand what you are doing or are not confident in your ability to revert any changes discussed here, do not attempt any of the suggestions in this chapter. The default settings in Ubuntu work very well for most people and really don’t need adjusting; just as most people can use and enjoy their car just as it is. However, some people love taking their cars apart and building hot rods; they enjoy tweaking and breaking and fixing them. This chapter is for that sort of person. If you don’t think you can fix it, don’t risk breaking it.
Using the BIOS and Kernel to Tune the Disk Drives
One method of tuning involves adjusting the settings in your BIOS. Because the BIOS is not Linux and every BIOS seems different, always read your motherboard manual for better possible settings and make certain that all the drives are detected correctly by the BIOS. Change only one setting at a time.
Linux provides a limited means to interact with BIOS settings during the boot process (mostly overriding them). In this section, you learn about those commands.
The options in the following list are more fully outlined in the BOOTPROMPT HOWTO and the kernel documentation. They can be used to force the IDE controllers and drives to be optimally configured. However, YMMV (your mileage may vary) because these do not work for everyone.
idex=dma—This forces DMA support to be turned on for the primary IDE bus, where x=0, or the secondary bus, where x=1.
idex=autotune—This command attempts to tune the interface for optimal performance.
hdx=ide-scsi—This command enables SCSI emulation of an IDE drive. This is required for some CD-RW drives to work properly in write mode, and it might provide some performance improvements for regular CD-R drives, as well.
idebus=xx—This can be any number from 20 to 66; autodetection is attempted, but this can set it manually if dmesg says that it isn’t autodetected correctly or if you have it set in the BIOS to a different value (overclocked). Most PCI controllers are happy with 33.
pci=biosirq—Some motherboards might cause Linux to generate an error message saying that you should use this. Look in dmesg for it; if you do not see it, you don’t need to use it.
These options can be entered into /etc/lilo.conf or /boot/grub/grub.conf or GRUB2’s /boot/grub/grub.cfg in the same way as other options are appended.
The hdparm Command
The hdparm utility can be used by root to set and tune the settings for IDE hard drives. You would do this to tune the drives for optimal performance. After previously requiring a kernel patch and installation of associated support programs, the hdparm program is now included with Ubuntu. You should experiment only with the file systems mounted read-only because some settings can damage some file systems when used improperly. The hdparm command also works with CD-ROM drives and some SCSI drives.
The general format of the command is this:
matthew@seymour:~$ hdparm command device
The following command runs a hard disk test:
matthew@seymour:~$ hdparm -tT /dev/hda
You must replace /dev/hda with the location of your hard disk. hdparm then runs two tests: cached reads and buffered disk reads. A good IDE hard disk should be getting 400MBps to 500MBps for the first test, and 20MBps to 30MBps for the second. Note your scores and then try this command:
matthew@seymour:~$ hdparm -m16 -d1 -u1 -c1 /dev/hda
That enables various performance-enhancing settings. Now try executing the original command again. If your scores increase from the previous measurement, you should run this command:
matthew@seymour:~$ hdparm -m16 -d1 -u1 -c1 -k1 /dev/hda
The extra parameter, -k1, tells the drive to keep these settings over a soft reset, such as the one performed during error recovery. It does not store them across reboots; to have them applied each time you boot Ubuntu, add them to /etc/hdparm.conf.
The man entry for hdparm is extensive and contains useful detailed information, but because the kernel configuration selected by Ubuntu already attempts to optimize the drives, it might be that little can be gained through tweaking. Because not all hardware combinations can be anticipated by Ubuntu or by Linux, and performance gains are always useful, you’re encouraged to try.
Tip
You can use the hdparm command to produce a disk transfer speed result with
matthew@seymour:~$ hdparm -tT device
Be aware, however, that although the resulting numbers appear quantitative, they are subject to several technical qualifications beyond the scope of what is discussed and explained in this chapter. Simply put, do not accept values generated by hdparm as absolute numbers, but only as a relative measure of performance.
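Because hdparm figures are best treated as relative measures, it can help to express a before-and-after pair as a percentage change rather than comparing raw numbers. The following is a minimal sketch of that arithmetic; the function name and the sample readings are hypothetical and are not hdparm output.

```python
def relative_change(before_mb_per_s, after_mb_per_s):
    """Return the percentage change from the first reading to the second."""
    if before_mb_per_s <= 0:
        raise ValueError("baseline throughput must be positive")
    return (after_mb_per_s - before_mb_per_s) / before_mb_per_s * 100.0


if __name__ == "__main__":
    # Hypothetical buffered-read scores noted before and after a tweak.
    before, after = 25.0, 30.0
    print(f"Change: {relative_change(before, after):+.1f} percent")
```

A positive result suggests the tweak helped on this machine; a negative one suggests it should be reverted. Repeating each measurement a few times before comparing makes the result more trustworthy.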
File System Tuning
Never content to leave things alone, Linux provides several tools to adjust and customize the file system settings. The belief is that hardware manufacturers and distribution creators tend to select conservative settings that will work well all the time, leaving some of the potential of your system leashed—that’s why you have chosen Ubuntu Unleashed to help you.
The Linux file system designers have done an excellent job of selecting default values used for file system creation, and any version of the Linux kernel after 2.6.x contains code for the IDE subsystem that significantly improves I/O (input/output) transfer speeds over older versions, obviating much of the need for special tweaking of the file system and drive parameters if you use IDE disks. Although these values work well for most users, some server applications of Linux benefit from file system tuning. As always, observe and benchmark your changes.
Synchronizing the File System with sync
Because Linux uses buffers when writing to devices, a write does not occur until the buffer is full, the kernel decides to flush it, or you request it by using the sync command. Traditionally, the command is given twice, as in the following:
matthew@seymour:~$ sync ; sync
To do it twice is overkill. Still, it can be helpful before the unmounting of certain types of media with slow write speeds (such as some USB hard drives or PCMCIA storage media), but only because it delays the user from attempting to remove the media too soon, not because two syncs are better than one.
The tune2fs Command
With tune2fs, you can adjust the tunable file system parameters on an ext2, ext3, or ext4 file system. A few performance-related items of note are as follows:
To disable file system checking, the -c 0 option sets the maximal mount count to zero.
The interval between forced checks can be adjusted with the -i option.
The -m option sets the reserved blocks percentage with a lower value, freeing more space at the expense of fsck having less space to write any recovered files.
Decrease the number of superblocks to save space with the -O sparse_super option. (Modern file systems use this by default.) Always run e2fsck after you change this value.
More space can be freed with the -r option that sets the number of reserved (for root) blocks.
Note that most of these uses of tune2fs free up space on the drive at the expense of the capability of fsck to recover data. Unless you really need the space and can deal with the consequences, just accept the defaults; large drives are now relatively inexpensive.
The e2fsck Command
This utility checks an ext2, ext3, or ext4 file system. Some useful arguments taken from man e2fsck are as follows:
-c—Checks for bad blocks and then marks them as bad
-f—Forces checking on a clean file system
-v—Verbose mode
The badblocks Command
Although not a performance-tuning program per se, the utility badblocks checks a (preferably) unmounted partition for bad blocks. It is not recommended that you run this command by itself, but rather allow it to be called by fsck. It should be used directly only if you specify the block size accurately; don’t guess or assume anything.
The options available for badblocks are detailed in the man page. They allow for very low-level manipulation of the file system that is useful for data recovery by file system experts or for file system hacking, but they are beyond the scope of this chapter and the average user.
Disabling File Access Time
Whenever Linux reads a file, it changes the last access time (known as the atime). This is also true for your web server: If you are getting hit by 50 requests a second, your hard disk will be updating the atime 50 times a second. Do you really need to know the last time a file was accessed? If not, you can disable the atime setting for a directory by typing this:
matthew@seymour:~$ chattr -R +A /path/to/directory
The chattr command changes file system attributes, of which “don’t update atime” is one. To set that attribute, use +A and specify -R so that it is recursively set. /path/to/directory gets changed, and so do all the files and subdirectories it contains.
Kernel
As the Linux kernel developed over time, developers sought a way to fine-tune some of the kernel parameters. Before sysctl, those parameters had to be changed in the kernel configuration, and then the kernel had to be recompiled.
The sysctl command can change some parameters of a running kernel. It does this through the /proc file system, which is a “virtual window” into the running kernel. Although it might appear that a group of directories and files exist under /proc, that is only a representation of parts of the kernel. When you’re the root user (or using the sudo command), you can read values from and write values to those “files,” referred to as variables. You can display a list of the variables as shown in the following. (An abbreviated list is presented because roughly 250 items, or more, exist in the full list.)
matthew@seymour:~$ sysctl -A
net.ipv4.tcp_max_syn_backlog = 1024
net.ipv4.tcp_rfc1337 = 0
net.ipv4.tcp_stdurg = 0
net.ipv4.tcp_abort_on_overflow = 0
net.ipv4.tcp_tw_recycle = 0
net.ipv4.tcp_syncookies = 0
net.ipv4.tcp_fin_timeout = 60
net.ipv4.tcp_retries2 = 15
net.ipv4.tcp_retries1 = 3
net.ipv4.tcp_keepalive_intvl = 75
net.ipv4.tcp_keepalive_probes = 9
net.ipv4.tcp_keepalive_time = 7200
net.ipv4.ipfrag_time = 30
The items shown are networking parameters, and tweaking these values is beyond the scope of this book. If you want to change a value, however, you use the -w parameter:
matthew@seymour:~$ sudo sysctl -w net.ipv4.tcp_retries2=20
This increases the value of that particular kernel parameter.
If you find that a particular setting is useful, you can enter it into the /etc/sysctl.conf file. The format is as follows, using the earlier example:
net.ipv4.tcp_retries2=20
Of more interest to kernel hackers than regular users, sysctl is a potentially powerful tool that continues to be developed and documented.
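To make the link between sysctl variables and the /proc file system concrete, here is a minimal sketch that maps a dotted variable name to its file under /proc/sys and reads the current value. The helper names are our own, and the sketch assumes simple names in which every dot is a separator.

```python
import os

PROC_SYS = "/proc/sys"


def sysctl_path(name):
    """Map a dotted sysctl variable name to its file under /proc/sys."""
    # Assumption: every dot separates path components.
    return PROC_SYS + "/" + name.replace(".", "/")


def read_sysctl(name):
    """Return the current value of a sysctl variable as a string."""
    with open(sysctl_path(name)) as handle:
        return handle.read().strip()


if __name__ == "__main__":
    variable = "net.ipv4.tcp_retries2"
    # Only read the value where the file exists (that is, on Linux).
    if os.path.exists(sysctl_path(variable)):
        print(variable, "=", read_sysctl(variable))
```

Reading a value this way does not require root. Writing one does, which is why changes are normally made with sysctl -w or in /etc/sysctl.conf.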
Tip
The kernel does a good job of balancing performance for graphical systems, so there’s not a great deal you can do to tweak your desktop to run faster.
Both GNOME and KDE are “heavyweight” desktop systems: They are all-inclusive, all-singing, and all-dancing environments that do far more than browse your file system. The drawback is that their size makes them run slow on older systems. On the flip side, Ubuntu has others available in the repositories like the Xfce and LXDE desktops, which are a great deal slimmer and faster than the other two. If you find GNOME and KDE are struggling just to open a file browser, Xfce or LXDE are likely for you.
Apache
Despite being the most popular web server on the Internet, Apache is by no means the fastest. Part of the “problem” is that Apache has been written to follow every applicable standard to the letter, so much of its development work has been geared toward standards compliance rather than just serving web pages quickly. However, with a little tweaking you can convert an inexpensive middle-of-the-road server into something capable of surviving the Slashdot Effect (or “being Slashdotted”).
Note
Slashdot.org is a popular geek news website that spawned the Slashdot Effect—the result of thousands of geeks descending on an unsuspecting website simultaneously. Although Slashdot is still popular, other sites are newer and have gained great momentum and popularity (we’re looking at you, Reddit!). We are not trying to ignore the new sites, but rather honor the original. We freely acknowledge that many other wonderful sources of the effect exist.
The first target for your tuning should be the apache2.conf file in /etc/apache2, as well as the other files in /etc/apache2. The more modules you have loaded, the more load Apache is placing on your server. Take a look through the LoadModule list and comment out (start the line with a #) the ones you do not want. Some of these modules can be uninstalled entirely through the Add or Remove Packages dialog.
As a rough guide, you almost certainly need mod_mime and mod_dir, and probably also mod_log_config. The default Apache configuration in Ubuntu is quite generic, so unless you are willing to sacrifice some functionality you might also need mod_negotiation (a speed killer if there ever was one) and mod_access (a notorious problem). Both of those modules can and should work with little or no performance decrease, but all too often they get abused and just slow things down.
Whatever you do, when you are disabling modules you should ensure that you leave either mod_deflate or mod_gzip enabled, depending on your Apache version. Your bottleneck is almost certainly going to be your bandwidth rather than your processing power, and having one of these two compressing your content usually turns 10KB of HTML into 3KB for supported browsers (most of them).
Next, ensure keepalives are turned off. Yes, you read that right: Turn keepalives off. This adds some latency to people viewing your site because they cannot download multiple files through the same connection. However, it reduces the number of simultaneous open connections and so allows more people to connect.
If you are serving content that does not change, you can take the extreme step of enabling MMAP support. This allows Apache to serve pages directly from RAM without bothering to check whether they have changed, which works wonders for your performance. However, the downside is that when you do change your pages you need to restart Apache. Look for the EnableMMAP directive; it is probably commented out and set to off, so you will need to remove the comment and set it to on.
Finally, if speed is your greatest concern, you should do all you can to ensure that your content is static: Avoid PHP if you can, avoid databases if you can, and so on. If you know you are going to get hit by a rush of visitors, use plain HTML so that Apache is limited only by your bandwidth for how fast it can serve pages.
Tip
Some people, when questioned about optimizing Apache, recommend you tweak the HARD_SERVER_LIMIT in the Apache source code and recompile. Although we agree that compiling your own Apache source code is a great way to get a measurable speed boost if you know what you are doing, you should need to change this directive only if you are hosting a huge site.
The default value, 256, is enough to handle the Slashdot Effect, and if you can handle that, then you can handle most things.
MySQL
Tuning your MySQL server for increased performance is exceptionally easy to do, largely because you can see huge speed increases simply by getting your queries right. However, you can tune various things in the server itself to help it cope with higher loads as long as your system has enough RAM.
The key is understanding its buffers. There are buffers and caches for all sorts of things, and finding out how full they are is crucial to maximizing performance. MySQL performs best when it is making full use of its buffers, which in turn places a heavy demand on system RAM. Unless you have 4GB RAM or more in your machine, you do not have enough capacity to set very high values for all your buffers; you need to pick and choose.
Measuring Key Buffer Usage
When you add indexes to your data, it enables MySQL to find data faster. However, ideally you want to have these indexes stored in RAM for maximum speed, and the variable key_buffer_size defines how much RAM MySQL can allocate for index key caching. If MySQL cannot store its indexes in RAM, you will experience serious performance problems. Fortunately, most databases have relatively small key buffer requirements, but you should measure your usage to see what work needs to be done.
To do this, log in to MySQL and type SHOW STATUS LIKE '%key_read%'; that returns all the status fields that describe the hit rate of your key buffer. You should get two rows back: Key_reads and Key_read_requests, which are the number of key blocks that had to be read from disk and the total number of requests to read a key block from the key buffer. From these two numbers you can calculate the percentage of requests being filled from RAM and from disk, using this simple equation:
100 - ((Key_reads / Key_read_requests) × 100)
That is, you divide Key_reads by Key_read_requests, multiply the result by 100, and then subtract the result from 100. For example, if you have Key_reads of 1000 and Key_read_requests of 100000, you divide 1000 by 100000 to get 0.01; then you multiply that by 100 to get 1.0, and subtract that from 100 to get 99. That number is the percentage of key reads being served from RAM, which means 99 percent of your keys are served from RAM.
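The same calculation can be sketched as a small function. The function name is our own; the two arguments are the Key_reads and Key_read_requests values reported by SHOW STATUS.

```python
def key_cache_hit_rate(key_reads, key_read_requests):
    """Return the percentage of key read requests served from RAM."""
    if key_read_requests <= 0:
        raise ValueError("no key read requests recorded yet")
    # Share of requests that had to go to disk, as a percentage.
    miss_percent = key_reads / key_read_requests * 100.0
    return 100.0 - miss_percent


if __name__ == "__main__":
    # The worked example from the text: 1000 disk reads in 100000 requests.
    print(f"{key_cache_hit_rate(1000, 100000):.1f} percent served from RAM")
```

With the figures from the worked example, the function reproduces the 99 percent result.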
Most people should be looking to get more than 95 percent served from RAM, although the primary exception is if you update or delete rows very often—MySQL can’t cache what keeps changing. If your site is largely read-only, this should be around 98 percent. Lower figures mean you might need to bump up the size of your key buffer.
If you are seeing problems, the next step is to check how much of your current key buffer is being used. Use the SHOW VARIABLES command and look up the value of the key_buffer_size variable. It is probably something like 8388600, which is approximately 8MB. Now, use the SHOW STATUS command and look up the value of Key_blocks_used.
You can determine how much of your key buffer is being used by multiplying Key_blocks_used by 1024, dividing by key_buffer_size, and multiplying by 100. For example, if Key_blocks_used is 8000, multiply that by 1024 to get 8192000; then divide that by your key_buffer_size (8388600) to get 0.97656, and finally multiply that by 100 to get 97.656. Thus, almost 98 percent of your key buffer is being used.
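As a quick check on the arithmetic, the usage calculation can be written out as follows. The function name is hypothetical, and the 1024-byte block size is the figure used in the text rather than something the code detects.

```python
KEY_BLOCK_BYTES = 1024  # block size assumed in the text


def key_buffer_usage_percent(key_blocks_used, key_buffer_size):
    """Return the percentage of the key buffer currently in use."""
    if key_buffer_size <= 0:
        raise ValueError("key_buffer_size must be positive")
    used_bytes = key_blocks_used * KEY_BLOCK_BYTES
    return used_bytes / key_buffer_size * 100.0


if __name__ == "__main__":
    # The worked example from the text: 8000 blocks in an 8388600-byte buffer.
    print(f"{key_buffer_usage_percent(8000, 8388600):.3f} percent in use")
```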
Now, the important part: You have ascertained that you are reading lots of keys from disk, and you also know that the reason for reading from disk is almost certainly because you do not have enough RAM allocated to the key buffer. A general rule is to allocate as much RAM to the key buffer as you can, up to a maximum of 25 percent of system RAM—128MB on a 512MB system is about the ideal for systems that read heavily from keys. Beyond that, you will actually see drastic performance decreases because the system has to use virtual memory for the key buffer.
Open /etc/my.cnf in your text editor (on Ubuntu, this file is usually /etc/mysql/my.cnf) and look for the line that contains key_buffer_size. If you do not have one, you need to create a new one. It should be under the line [mysqld]. When you set the new value, do not just pick some arbitrarily high number. Try doubling what is there right now (or try 16MB if there’s no line already); then see how it goes. To set 16MB as the key buffer size, you need a line like this:
[mysqld]
key_buffer_size = 16M
datadir=/var/lib/mysql
Restart your MySQL server with sudo service mysql restart and then go back into MySQL and run SHOW VARIABLES again to see the key_buffer_size. It should be 16773120 if you have set it to 16M. Now, because MySQL just got reset, all its values for key hits and the like will also have been reset. You need to let it run for a while so that you can assess how much has changed. If you have a test system you can run, this is the time to run it.
After your database has been accessed with normal usage for a short while (if you get frequent accesses, this might be only a few minutes), recalculate how much of the key buffer is being used. If you get another high score, double the size again, restart, and retest. You should keep repeating this until your key buffer usage is below 50 percent or you find you don’t have enough RAM to increase the buffer further. Remember that you should never allocate more than 25 percent of system RAM to the key buffer.
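The double-and-retest procedure can be sketched as one decision step. This is only an illustration of the rule described here (keep doubling while usage is 50 percent or more, and never exceed 25 percent of system RAM); the function name is our own, and the measuring and restarting still have to be done by hand.

```python
MB = 1024 * 1024


def next_key_buffer_size(current_bytes, usage_percent, system_ram_bytes):
    """Suggest the next key_buffer_size to try, in bytes."""
    ceiling = system_ram_bytes // 4  # never more than 25 percent of RAM
    if usage_percent < 50:
        return current_bytes  # usage is already comfortable; stop here
    # Otherwise double the buffer, capped at the ceiling.
    return min(current_bytes * 2, ceiling)


if __name__ == "__main__":
    # Example: an 8MB buffer that is nearly full on a 512MB machine.
    suggestion = next_key_buffer_size(8 * MB, 97.7, 512 * MB)
    print(f"Try key_buffer_size = {suggestion // MB}M")
```

When the suggestion equals the current value, either usage has dropped below 50 percent or the 25 percent ceiling has been reached, and it is time to stop.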
Using the Query Cache
Newer versions of MySQL allow you to cache the results of queries so that, if new queries come in that use the same SQL, the result can be served from RAM. In some ways the query cache is quite intelligent: If, for example, part of the result changes due to another query, the cached results are thrown away and recalculated next time. However, in other ways it is very simple. For example, it uses cached results only if the new query is exactly the same as a cached query, even down to the capitalization of the SQL.
The query cache works well in most scenarios. If your site has an equal mix of reading and writing, the query cache does its best but is not optimal. If your site is mostly reading with few writes, more queries are cached (and for longer), thus improving overall performance.
First, you need to find out whether you have the query cache enabled. To do this, use SHOW VARIABLES and look up the value of have_query_cache. All being well, you should get YES back, meaning that the query cache is enabled. Next, look for the value of query_cache_size and query_cache_limit. The first is how much RAM in bytes is allocated to the query cache, and the second is the maximum result size that should be cached. A good starting set of values for these two is 8388608 (8MB) and 1048576 (1MB).
Next, type SHOW STATUS LIKE 'qcache%'; to see all the status information about the query cache. You should get output like this:
mysql> SHOW STATUS LIKE 'qcache%';
+-------------------------+--------+
| Variable_name | Value |
+-------------------------+--------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 169544 |
| Qcache_hits | 698 |
| Qcache_inserts | 38 |
| Qcache_lowmem_prunes | 20 |
| Qcache_not_cached | 0 |
| Qcache_queries_in_cache | 18 |
| Qcache_total_blocks | 57 |
+-------------------------+--------+
8 rows in set (0.00 sec)
From that, we can see that only 18 queries are in the cache (Qcache_queries_in_cache); we have 169544 bytes of memory free in the cache (Qcache_free_memory), 698 queries have been read from the cache (Qcache_hits), 38 queries have been inserted into the cache (Qcache_inserts), but 20 of them were removed due to lack of memory (Qcache_lowmem_prunes), giving the 18 from before. Qcache_not_cached is 0, which means 0 queries were not cached—MySQL is caching them all.
From that, we can calculate how many total queries came in—it is the sum of Qcache_hits, Qcache_inserts, and Qcache_not_cached, which is 736. We can also calculate how well the query cache is being used by dividing Qcache_hits by that number and multiplying by 100. In this case, 94.84 percent of all queries are being served from the query cache, which is a great number.
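The hit-rate calculation above can be sketched in a few lines. The function name is our own; the arguments are the Qcache_hits, Qcache_inserts, and Qcache_not_cached values from the SHOW STATUS output.

```python
def query_cache_hit_rate(hits, inserts, not_cached):
    """Return the percentage of incoming queries served from the query cache."""
    total = hits + inserts + not_cached
    if total <= 0:
        raise ValueError("no queries recorded yet")
    return hits / total * 100.0


if __name__ == "__main__":
    # Figures from the sample output: 698 hits, 38 inserts, 0 not cached.
    print(f"{query_cache_hit_rate(698, 38, 0):.2f} percent served from cache")
```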
In our example, we can see that many queries have been trimmed because there is not enough memory in the query cache. This can be changed by editing your /etc/my.cnf file and adding a line like this one, somewhere in the [mysqld] section:
query_cache_size = 32M
An 8MB query cache should be enough for most people, but larger sites might need 16MB or even 32MB if you are storing a particularly large amount of data. Very few sites will need to go beyond a 32MB query cache, but keep an eye on the Qcache_lowmem_prunes value to ensure you have enough RAM allocated.
Using the query cache does not incur much of a performance hit. When MySQL calculates the result of a query normally, it throws it away when the connection closes. With the query cache, it skips the throwing away, and so there is no extra work being done. If your site does have many updates and deletes, be sure to check whether you get any speed boost at all from the query cache.
Miscellaneous Tweaks
If you have tuned your key buffer and optimized your query cache and yet still find your site struggling, you can make a handful of smaller changes that will add some more speed.
When reading from tables, MySQL has to open the file that stores the table data. How many files it keeps open at a time is defined by the table_cache setting (named table_open_cache in MySQL 5.1.3 and later), which older releases set to 64 by default. You can increase this setting if you have more than 64 tables, but be aware that Ubuntu imposes limits on MySQL about how many files it can have open at a time. Going beyond 256 is not recommended unless you have a particularly database-heavy site and know exactly what you are doing.
The other thing you can tweak is the size of the read buffer, which is controlled by read_buffer_size and read_rnd_buffer_size. Both of these are allocated per connection, which means you should be very careful about setting large values. Whatever you choose, read_rnd_buffer_size should be three to four times the size of read_buffer_size, so if read_buffer_size is 1MB (suitable for very large databases), read_rnd_buffer_size should be 4MB.
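Because these buffers are per connection, their cost multiplies quickly. The sketch below estimates a rough upper bound on the memory they could claim if every connection used both buffers at once. The function name and the figure of 100 connections are hypothetical, and real usage is normally lower because not every connection needs both buffers at the same moment.

```python
MB = 1024 * 1024


def worst_case_read_buffers(connections, read_buffer_size, read_rnd_buffer_size):
    """Return a rough upper bound, in bytes, on per-connection read buffers."""
    per_connection = read_buffer_size + read_rnd_buffer_size
    return connections * per_connection


if __name__ == "__main__":
    # The sizes from the text (1MB and 4MB) with 100 assumed connections.
    total = worst_case_read_buffers(100, 1 * MB, 4 * MB)
    print(f"Up to {total // MB}MB for read buffers alone")
```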
Query Optimization
The biggest speed-ups can be seen by reprogramming your SQL statements so they are more efficient. If you follow these tips, your server will thank you:
Select as little data as possible. Rather than SELECT *, select only the fields you need.
If you need only a few rows, use LIMIT to select the number you need.
Declare fields as NOT NULL when creating tables to save space and increase speed.
Provide default values for fields and use them where you can.
Be very careful with table joins because they are the easiest way to write inefficient queries.
If you must use joins, be sure you join on fields that are indexed. They should also preferably be integer fields because these are faster than strings for comparisons.
Find and fix slow queries. Add slow_query_log = 1 and slow_query_log_file = /var/log/slow-queries.log to your /etc/my.cnf file, under [mysqld], and MySQL tells you the queries that took a long time to complete. (These settings replace the older log-slow-queries option, which was removed in MySQL 5.6.)
Use OPTIMIZE TABLE tablename to defragment tables and refresh the indexes.
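Several of the tips above can be seen together in a short, runnable sketch. To keep it self-contained, it uses SQLite from the Python standard library as a stand-in for MySQL, so treat it as an illustration of the query shapes rather than MySQL-specific code. The table and column names are hypothetical.

```python
import sqlite3

# SQLite stands in for MySQL here so that the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL DEFAULT ''
    );
    CREATE TABLE posts (
        id        INTEGER PRIMARY KEY,
        author_id INTEGER NOT NULL DEFAULT 0,
        title     TEXT NOT NULL DEFAULT ''
    );
    -- Index the integer column used in the join.
    CREATE INDEX idx_posts_author_id ON posts (author_id);
""")
conn.executemany("INSERT INTO authors (id, name) VALUES (?, ?)",
                 [(1, "ann"), (2, "bob")])
conn.executemany("INSERT INTO posts (author_id, title) VALUES (?, ?)",
                 [(1, "first"), (1, "second"), (2, "third")])


def recent_titles(connection, author_id, limit):
    """Fetch only the needed columns, joined on an indexed integer, with LIMIT."""
    query = """
        SELECT posts.title, authors.name
        FROM posts
        JOIN authors ON authors.id = posts.author_id
        WHERE posts.author_id = ?
        ORDER BY posts.id
        LIMIT ?
    """
    return connection.execute(query, (author_id, limit)).fetchall()


if __name__ == "__main__":
    for title, name in recent_titles(conn, 1, 1):
        print(title, "by", name)
```

The query names the two columns it needs instead of using SELECT *, caps the result with LIMIT, and joins on an indexed integer column; the tables declare their fields NOT NULL with defaults.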
References
www.coker.com.au/bonnie++/—The home page of bonnie, a disk benchmarking tool. It also contains a link to RAID benchmarking utilities and Postal, a benchmarking utility for SMTP servers.
www.phoronix-test-suite.com/—The Phoronix Test Suite was created by a website that does automated performance testing and comparisons and is a quality benchmarking software to consider.
http://httpd.apache.org/current/misc/perf-tuning.html—The official Apache guide to tuning your web server.
http://dev.mysql.com/doc/refman/5.7/en/optimization.html—Learn how to optimize your MySQL server direct from the source, the MySQL manual.
One particular MySQL optimization book will really help you get more from your system if you run a large site: High Performance MySQL, by Jeremy Zawodny and Derek Balling (O’Reilly), ISBN: 0-596-00306-4.