Expert MySQL, Second Edition (2012)

PART 1. Getting Started with MySQL Development

CHAPTER 4. Test-Driven MySQL Development

Systems integrators must overcome the limitations of the systems they are integrating, but sometimes a system lacks certain functions or commands that are needed for the integration. Oracle recognizes this and includes flexible options in the MySQL server that add new functions and commands. This chapter introduces a key element in generating high-quality extensions to the MySQL system. I discuss software testing and explain some common practices for testing large systems, using specific examples to illustrate the accepted practices of testing the MySQL system.


Why include a chapter about testing so early in the book? You need to learn about the testing capabilities available so that you can plan your own modifications by first planning how to test them. This is the premise of test-driven development: to develop and implement the tests from the requirements, write the code, and then immediately execute the tests. This may sound a tad counterintuitive to someone not familiar with this concept—after all, how do you write tests for code that hasn’t been written?

In the following sections, I’ll clarify by providing some background information about this increasingly popular concept.

Why Test?

When I lecture about software quality issues, I often get asked, “Why test?” Some students want to know how much testing is enough. I offer students who feel testing is largely a waste of time, or that it is highly overrated, the opportunity to complete their class projects1 using a minimal- or no-testing strategy. The results are often interesting and enlightening.2

These same students often discuss how well they code their modules and classes and how careful they are to use good modeling practices. Many use Unified Modeling Language (UML) diagrams to assist their software development. While these are good practices, testing involves a lot more than making sure your source code matches your model. Students who insist that their highly honed coding skills are sufficient often produce project deliverables that have feature and functionality issues.

Although most of these projects do not suffer from fatal errors or crashes (which are often found during development), they often have issues with integration and how the software works. That is, the students fail to ensure that their software works the way the customer expects it to.

If this has happened to you, you now know the value of software testing. Choosing which technique to use when is the real nature of the science of software testing.

Tip Professional software testers (sometimes called quality-control engineers or quality-assurance engineers) have a very different view of software. If you ever work with a professional software tester, spend time learning to understand how they approach software testing. They often have incredible insight into how software works, insight that few developers ever hone successfully. Don't be embarrassed if they break your code; that's their job, and most are very good at it!

Testing vs. Debugging

Although they often share the same goal of identifying defects, debugging and testing are not the same. Debugging is an interactive process designed to locate defects in the logic of the source code by exposing its internal workings. Testing, on the other hand, identifies defects in the execution of the source code without examining its inner workings.

Test-Driven Development

Test-driven development is often associated with agile programming and often used by organizations that adopt extreme programming (XP) methods. That may sound scary, but here’s a secret about XP: you don’t have to adopt XP to use agile practices!

I often encounter developers who are deeply concerned about adopting agile practices because of all the negative hype tossed about by uninformed people. Those who view traditional software engineering processes as cast in stone think that agile practices are designed to do more with less, and that, therefore, they are inferior. Others believe that agile practices “cut out the clutter” of requirements analysis and design to “focus on the code.” None of this is true.

Agile practices are designed to streamline software development, to re-engage the customer, to produce only what is needed when it is needed, and to focus on the job at hand (what the customer wants). The customer, not the process, is the focus of agile methods. Clearly, the emphasis is still on analysis and design.

Furthermore, agile practices are designed to be used either as a set or selectively in an application. That is, organizations are encouraged to adopt agile practices as they see fit rather than jumping in with both feet and turning their engineers' world upside down. That is one reason behind the negative hype—that and the resulting failures reported by organizations that tried to do too much too soon.3 If you would like to learn more about the debate between agile and traditional methods, direct your browser to the Agile Alliance web site, http://www.agilealliance.org.

One profoundly useful agile practice is test-driven development. The philosophy of test-driven development is simple: start with a basic model of the solution, write the test, run the test (which will fail), code the solution, and validate it with the test (when the unaltered test passes). While that sounds really intuitive, it is amazing how complicated it can become. Creating the test before the code sounds backward. How can you test something that doesn’t exist? How can that help?

Developing the test first allows you to focus on the design of your software rather than on the code. I’ll explain a typical test-driven agile development process so that you can see how test-driven development complements the design and actually drives the source code. That sounds weird, but give it a chance and it will make sense.
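The cycle described above (write a failing test, write just enough code to pass it, then rerun the unaltered test) can be sketched in a few lines. Python and the hypothetical `slugify` function are used purely for illustration; the book's own test tooling for MySQL comes later in the chapter.

```python
import unittest

# Step 1: write the test first. It fails until slugify() is written
# and behaves as the test demands.
class TestSlugify(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("Expert MySQL"), "expert-mysql")

    def test_strips_surrounding_whitespace(self):
        self.assertEqual(slugify("  hello   world "), "hello-world")

# Step 2: write just enough code to make the unaltered test pass.
def slugify(title):
    """Convert a title to a lowercase, hyphen-separated slug."""
    return "-".join(title.lower().split())

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=2)
```

Running the test before `slugify` exists produces the expected failure; once the function is written, the same unmodified test validates it.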

Test-driven development begins with a simple model of the system, usually a simple class diagram of the basic classes within the system. The class diagram is set with just the empty class blocks, annotated only with the proposed name of the class. I say proposed because this is usually the point at which developers who are used to traditional methods get stumped. In agile practices, nothing is set in stone, and anything can be a candidate for change (a practice known as refactoring); it just has to make sense and to further the ultimate goal of producing the software that the customer wants.


Refactoring is a process of critical analysis that asks, "How can this be done better?" For example, when examining code for defect repair or adding new features, developers who practice refactoring look for better ways to reorganize the code to make it more efficient and easier to maintain. The goal of refactoring, therefore, is to improve the nonfunctional aspects of the software.

While refactoring is often associated with source code, the premise applies to all areas of software development, from refining source code to refining tests for greater coverage.

Once an initial class diagram is created, it is copied, set aside, and referred to as the domain model, because it depicts the initial layout of your classes. From there, use-case diagrams and supplemental use-case scenarios (textual descriptions of the use case and alternative execution sequences) are created. Each use case is then augmented by a single sequence diagram that maps out the functions needed from the classes referenced.

As each class begins to take shape, you begin writing the tests. Even though the classes don't exist, you still must write the tests—they form a hybrid of integration, system, and interface testing (all black-box techniques, since the internals do not yet exist) in which each test exercises one of the classes in the domain model.

Note Black-box testing is testing without knowledge of how the system is constructed. White-box testing is testing the behavior of the system given knowledge of its internal structures.

For most agile practices, at this point the lessons learned from the first iteration of this sequence are incorporated into the appropriate parts of the design (use case, sequence diagram, etc.) and the appropriate changes are made.

Note Some agile practitioners add another modeling step to the process by using robustness diagrams. This adaptation closely resembles the ICONIX process. For more information about the ICONIX process, see "Agile Development with ICONIX Process."4

Sometimes these changes include the discovery of new classes, the reorganization of the existing class, and even the formulation of the methods and properties of the class. In other words, writing the test before the code helps validate the design. That is really cool, because once you complete the level of design you want for your iteration and begin writing the source code, your tests are completed! You can simply run them and demonstrate that your code is working as designed. Of course, if you need to change the test, and therefore the design—well, that’s the beauty of agile development.


Benchmarking

Benchmarking is designed to establish performance characteristics of software. You can use benchmarking to establish a known performance level (called a baseline) and then run the benchmarks again after a change in the environment in which the software executes to determine the effects of those changes. This is the most common use of benchmarking. Others include identification of performance limits under load, managing change to the system or environment, and identifying conditions that may be causing performance problems.

You perform benchmarking by running a set of tests that exercise the system, and storing the performance counter results, called benchmarks. They are typically stored or archived and annotated with a description of the system environment. For example, savvy database professionals often include the benchmarks and a dump of the system configuration and environment in their archive. This permits them to compare how the system performed in the past with how it is currently performing and to identify any changes to the system or its environment.

The tests, normally of the functional variety, are targeted toward testing a particular feature or function of the system. Benchmarking tools include a broad range of tests that examine everything about the system, from the mundane to the most complex operations, under light, medium, and heavy loads.

Although most developers consider running benchmarks only when something odd happens, it can be useful to run the benchmarks at fixed intervals or even before and after major events, such as changes to the system or the environment. Remember to run your benchmarks the first time to create a baseline. Benchmarks taken after an event without a baseline will not be very helpful.
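The practice of capturing timings together with a description of the environment can be sketched in a few lines. This is a minimal illustration in Python, not the MySQL Benchmark Suite; `workload` stands in for whatever query or operation you are measuring.

```python
import json
import platform
import statistics
import time

def run_benchmark(func, runs=5):
    """Time several runs of func and record the results together with
    a snapshot of the environment, so later runs can be compared."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        timings.append(time.perf_counter() - start)
    return {
        "median_seconds": statistics.median(timings),
        "runs": runs,
        # Annotate the benchmark with its context, as savvy database
        # professionals do with a dump of the system configuration.
        "environment": {
            "python": platform.python_version(),
            "machine": platform.machine(),
        },
    }

def workload():
    # Stand-in for the real test query or operation being measured.
    sum(i * i for i in range(10_000))

baseline = run_benchmark(workload)
print(json.dumps(baseline, indent=2))
```

Archiving the printed record the first time establishes the baseline that all later runs are compared against.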

Guidelines for Good Benchmarks

Many good practices are associated with benchmarking. In this section, I take you through a few that I’ve found to be helpful in getting the most out of benchmarking.

First, always consider the concept of before-and-after snapshots. Don't wait until after you've made a change to the server to see how it compares to the baseline you took six months ago; a lot can happen in six months. Instead, measure the system before the change, make the change, and then measure the system again. This gives you three metrics to compare: how the system is expected to perform, how it performs before the change, and how it performs after the change. You may find that something has taken place that makes your change more or less significant. For example, let's say your benchmarks include a metric for query time. Your baseline established six months ago for a given test query was set at 4.25 seconds. You decide to modify the index of the table being tested. You run your before benchmark and get a value of 15.50 seconds, and your after benchmark produces a value of 4.50 seconds. If you had not taken the before picture, you wouldn't have known that your change improved performance dramatically. Instead, comparing only against the old baseline, you might have concluded that the change made the query perform a bit slower, which might have led you to undo that change, thus returning you to the much slower queries.
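The arithmetic behind that example is worth making explicit. A short sketch using the same numbers:

```python
def percent_change(old, new):
    """Positive result: the query got slower; negative: it got faster."""
    return (new - old) / old * 100.0

baseline = 4.25   # seconds, measured six months ago
before   = 15.50  # seconds, measured just before the index change
after    = 4.50   # seconds, measured just after the index change

# Against the stale baseline alone, the change looks like a small loss.
vs_baseline = percent_change(baseline, after)   # about +5.9%

# Against the fresh before snapshot, it is a dramatic gain.
vs_before = percent_change(before, after)       # about -71%

print(f"vs. baseline: {vs_baseline:+.1f}%  vs. before: {vs_before:+.1f}%")
```

The two percentages support opposite conclusions, which is exactly why the before snapshot matters.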

This example exposes several aspects that I want to warn you about. If you are conducting benchmarks on the performance of data retrieval on systems that are expected to grow in the amount of data stored, you need to run your benchmarks more frequently, so that you can map the effects of the growth of data with the performance of the system. In the previous example, you would have considered the before value to be “normal” for the conditions of the system, such as data load.

Also, be careful to ensure that your tests are valid for what you are measuring. If you are benchmarking the performance of a query for a table, your benchmarks are targeted at the application level and are not useful for predicting the performance of the system in the general sense. Segregate application-level benchmarks from the more general metrics to avoid skewing your conclusions.

Another good practice that is related to the before-and-after concept is to run your benchmarks several times over a constrained period of activity (under a consistent load) to ensure that they are not affected by localized activity, such as a rogue process or a resource-intensive task. I find that running the benchmark up to several dozen times permits me to determine mean values for the results. You can create these aggregates using many techniques. For example, you could use a statistics package to produce the basic statistics, or use your favorite spreadsheet application.5

Note Some benchmark tools provide this feature for you. Alas, the MySQL Benchmark Suite does not.
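Since the suite does not aggregate for you, doing it yourself is straightforward. A minimal sketch with Python's `statistics` module, using made-up timings that include one outlier:

```python
import statistics

# Timings (seconds) from running the same benchmark many times
# under a consistent load. One run was hit by a rogue process.
timings = [4.61, 4.48, 4.55, 9.80, 4.52, 4.58, 4.50, 4.63]

mean = statistics.mean(timings)
median = statistics.median(timings)
stdev = statistics.stdev(timings)

# A mean well above the median (or a large standard deviation)
# hints that localized activity skewed some of the runs.
print(f"mean={mean:.2f}s median={median:.2f}s stdev={stdev:.2f}s")
```

Here the median (about 4.57 s) is a far better summary of typical performance than the mean, which the single outlier drags upward.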

Perhaps the most useful practice to adopt is changing one thing at a time. Don’t go through your server with a wide brush of changes and expect to conclude anything meaningful from the results. What often happens in this case is that one of the six or so changes negatively affects the gains of several others, and the remaining ones have little or no effect on performance. Unless you made one change at a time, you would have no idea which affected the system in a negative, positive, or neutral way.

Use real data whenever possible. Manufactured data often fall neatly into the ranges of the fields specified and therefore never exercise certain features of the system (domain and range checking, etc.). If your data change frequently, you may want to snapshot the data at some point and build your tests using the same set of data each time. While this ensures you are testing performance using real data, it may not capture performance degradation over time as the data grow.

Last, when interpreting the results of your benchmarks and managing your expectations, set realistic goals. If you are trying to improve the performance of the system under certain conditions, have a firm grasp of the known consequences before you set your goals. For example, if you are examining the effect of switching the network interface from a gigabit connection to an interface that performs network communication 100 times faster, your server will not perform its data transfer 100 times faster. In this case and ones similar to it, the value added by the hardware should be weighed against the cost of the hardware and the expected gains of using the new hardware. In other words, your server should perform some percentage faster, thereby saving you money (or increasing income).

If you estimate that you need to increase your network performance by 10 percent in order to meet a quarterly expense and income goal that will lead to a savings, use that value as your goal. If your benchmarks show that you have achieved the desired improvements (or, better yet, surpassed them), ask your boss for a raise. If the benchmarks show performance metrics that don’t meet the goal, tell your boss you can save him money by returning the hardware (and then ask for a raise). Either way, you can back up your claims with empirical data: your benchmarks!

Benchmarking Database Systems

You might agree that benchmarking can be a very powerful tool in your arsenal, but what, exactly, do benchmarks have to do with database servers? The answer is—a lot. Knowing when your database server is performing its best allows you to set a bar for measuring performance degradation during heavy query-processing loads. More to the point, how would you decide whether a resource-intensive query is the cause or the effect of a performance issue?

You can benchmark your database server on many levels. The most notable is benchmarking changes to your database schema. You would probably not create tests for a single table (although you can), but you are more likely to be interested in how the changes for a database schema affect performance.

This is especially true for new applications and databases. You can create several schemas, populate them with data, and write benchmark tests designed to mimic the proposed system. (Here’s that test-driven thing again.) By creating the alternative schemas and benchmarking them, and perhaps even making several iterations of changes, you can quickly determine which schemas are best for the application you are designing.
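The schema-comparison idea can be sketched compactly. This illustration uses SQLite from Python's standard library purely as a stand-in so the sketch is self-contained; the same approach (identical data, identical query, alternative schemas) applies to MySQL schemas.

```python
import sqlite3
import time

def time_query(schema_sql, query, rows=2000):
    """Build a throwaway database with the given schema, load the
    same data into it, and time the query against it."""
    db = sqlite3.connect(":memory:")
    db.executescript(schema_sql)
    db.executemany(
        "INSERT INTO orders (customer, total) VALUES (?, ?)",
        [(f"cust{i % 100}", i * 1.5) for i in range(rows)],
    )
    start = time.perf_counter()
    db.execute(query).fetchall()
    elapsed = time.perf_counter() - start
    db.close()
    return elapsed

# Two candidate schemas: identical table, with and without an index.
plain = "CREATE TABLE orders (customer TEXT, total REAL);"
indexed = plain + "CREATE INDEX idx_customer ON orders (customer);"

query = ("SELECT customer, SUM(total) FROM orders "
         "WHERE customer = 'cust7' GROUP BY customer")

print(f"no index: {time_query(plain, query):.6f}s")
print(f"indexed : {time_query(indexed, query):.6f}s")
```

Populating each candidate with the same data and timing the same proposed queries lets you choose a schema on evidence rather than intuition.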

You can also benchmark database systems for specialized uses. For example, you might want to check the performance of your database system under various loads or in various environments. What better way to say for sure whether that new RAID device will improve performance than to run before-and-after benchmarks and know just how much of a difference the change to the environment makes? Yes, it is all about the cost, and benchmarking will help manage your database-system cost.


Profiling

Sometimes a defect doesn't manifest unless the system is under load. In these cases, the system may slow down but not produce any errors. How do you find those types of problems? You need a way to examine the system while it is running. This process is called profiling. Some authors group profiling with debugging, but profiling is more than just a debugging tool. Profiling allows you to identify performance bottlenecks and potential problems before they are detected in the benchmarks, although it is usually performed after a problem is detected, sometimes as a means of determining the problem's origins. Using profiling, you can discover or monitor memory and disk consumption, CPU usage, I/O usage, system response time, and many other system parameters.

The terms involved are easy to confuse. The measurement of a targeted system parameter is called a diagnostic operation or technique (or sometimes a trace). A system that manages these diagnostic operations and permits you to run them against a system is called a profiler. Profiling, therefore, is the application of diagnostic operations using a profiler.

Profilers typically produce reports that include machine-readable recordings of the system during a fixed time period. These types of performance measurements are commonly called traces, because they trace the path of the system over time. Other profilers are designed to produce human-readable printouts that detail specifics of what portion of the system executed the longest or, more commonly, where the system spent most of its time. This type of profiler is typically used to monitor resources such as I/O, memory, CPU, and threads or processes. For example, you can discover what commands or functions your threads and processes are performing. If your system records additional metadata in the thread or process headers, you may also discover performance issues with thread or process blocking and deadlocks.
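A human-readable report of where a program spends its time is easy to demonstrate with a profiler. This sketch uses Python's built-in `cProfile` purely as an illustration of the idea; `slow_part` and `fast_part` are made-up stand-ins for portions of a real system.

```python
import cProfile
import io
import pstats

def slow_part():
    # Deliberately heavier work: this is where the time goes.
    return sum(i * i for i in range(50_000))

def fast_part():
    return sum(range(100))

def system():
    slow_part()
    fast_part()

profiler = cProfile.Profile()
profiler.enable()
system()
profiler.disable()

# Report which functions executed the longest, sorted by the
# cumulative time spent in each.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats("cumulative").print_stats(10)
print(report.getvalue())
```

The report makes it immediately obvious that `slow_part` dominates the run, which is exactly the kind of insight a trace-style printout gives you.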

Note An example of deadlocking is when one process holds a lock (exclusive access) on a resource and is waiting for a second resource that is locked by another process, which is in turn waiting for the first resource. Deadlock detection is a key attribute of a finely designed database system.

You can also use profiling to determine which queries are performing the poorest, and even which threads or processes are taking the longest to execute. In these situations, you may also discover that a certain thread or process is consuming a large number of resources (such as CPU or memory) and thus take steps to correct the problem. This situation is not uncommon in environments with a large community of users accessing central resources.

Sometimes certain requests of the system result in situations in which the actions of one user (legitimate or otherwise—let’s hope the legitimate kind) may be affecting others. In this case, you can correctly identify the troublesome thread or process and its owner, and take steps to correct the problem.

Profiling can also be a powerful diagnostic aid when developing systems, hence the tendency to call profilers debugging tools. The types of reports you can obtain about your system can lead you to all manner of unexpected inefficiencies in your source code. Take care not to overdo it, however. You can spend a considerable amount of time profiling a piece of source code that simply takes a long time to execute, without ever identifying a bottleneck you can fix. Remember, some things take a while to execute; such is the case for disk I/O or network latency. Usually you can't do a lot about them except redesign your architecture to be less dependent on slow resources. If you were designing an embedded real-time system, such tuning might indeed be a valid endeavor, but it generally isn't worth the effort to try to improve something you cannot control.

You should always strive to make your code run as efficiently as possible, however. If you find a condition in which your code can be improved using profiling, then by all means do it. Just don’t get carried away trying to identify or track the little things. Go after the big-ticket items first.


Benchmarking vs. Profiling

The differences between benchmarking and profiling are sometimes confused. Benchmarking establishes a performance rating or measurement. Profiling identifies the behavior of the system in terms of its performance.

While benchmarking is used to establish known performance characteristics under given configurations, profiling is used to identify where the system is spending most of its execution time. Benchmarking, therefore, is used to ensure the system is performing at or better than a given standard (baseline), whereas profiling is used to determine performance bottlenecks.

Introducing Software Testing

Software testing is increasingly vital to our industry, because it’s long been clear that a significant contributor to the failure of software systems is the lack of sufficient testing or time to conduct it.

The means by which the testing is conducted and the goals of testing itself are sometimes debated, however. For example, the goal of a well-designed test is to detect the presence of defects. That sounds right, doesn’t it? Think about that a moment—that means a successful test is one that has found a defect. So, what happens if the test doesn’t find any defects? Did the test fail because it was incorrectly written, or did it just not produce any errors? These debates (and many others) are topics of interest for software-testing researchers.

Some software testers (let’s just call them testers for short) consider a test successful if it doesn’t find any defects, which isn’t the same as stating that a successful test is one that finds defects. If you take the viewpoint of these testers, it is possible for a system to pass testing (all tests successful) and yet still have defects. In this case, the focus is on the test and not the software. Furthermore, if defects are found after testing, it is seldom considered a failure of the tests.

However, if you take the viewpoint that a successful test is one that finds defects, your tests fail only when the software has no defects. Thus, when no defects are found, the goal becomes making the tests more robust so that they can discover more defects.

Functional Testing vs. Defect Testing

Testers are often focused on ensuring the system performs the way the specification (also known as a requirements document) dictates. They often conduct tests that verify the functionality of the specification and therefore are not attempting to find defects. This type of testing is called functional testing and sometimes system testing. Tests are created with no knowledge of the internal workings of the system (called black-box testing) and are often written as a user-centric stepwise exercise of a feature of the software. For example, if a system includes a print feature, functional tests can be written to execute the print feature using the preferred and alternate execution scenarios. A successful test in this case would show that the print feature works without errors and the correct output is given. Functional testing is just one of the many types of testing that software engineers and testers can use to ensure they produce a high-quality product.

Defect testing, by contrast, is the purposeful attempt to cause the system to fail given a set of valid and invalid input data. These tests are often written with knowledge of the internal workings of the software (often referred to as white-box testing). Defect tests are constructed to exercise all the possible execution scenarios (or paths) through the source code for a particular component of the software while testing all of its gate and threshold conditions. For instance, if you were to write defect tests for the print feature example, you would test not only the correct operation of the feature but also every known error handler and exception trigger. That is, you would write the test to purposefully try to break the code. In this case, a defect test that completes without identifying defects can be considered a failed test (or, more neutrally, a negative one; "failed" gives the impression that something is wrong, when in fact no errors were found).6

For the purposes of this book, I present a combination of the functional and the defect-testing viewpoints. That is, I show you how to conduct functional testing that has built-in features for detecting defects. The testing mechanism we'll use allows you to conduct functional tests against the MySQL server using tests that execute SQL statements. Although you can construct tests that simply test functionality, you can also construct tests to identify defects. Indeed, I recommend that you write all of your tests to exercise the error handlers and exceptions. Should your test fail to identify a defect, or should a bug be reported to you later, you can create a test or modify an existing one to test for the presence of that bug. That way, you can reproduce the bug before you fix it and later show that the bug has been fixed.
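The combination of viewpoints can be illustrated in miniature. This sketch uses Python's `unittest` and a hypothetical `parse_port` helper; the bug report it guards against is invented for illustration. The same pattern (functional path, deliberate error-handler exercise, regression test for a reported bug) carries over to SQL-based tests against the server.

```python
import unittest

def parse_port(value):
    """Parse a TCP port number from a string, rejecting bad input."""
    port = int(value)              # raises ValueError for non-numeric input
    if not 0 < port <= 65535:
        raise ValueError("port out of range: %d" % port)
    return port

class TestParsePort(unittest.TestCase):
    # Functional test: the preferred execution path.
    def test_valid_port(self):
        self.assertEqual(parse_port("3306"), 3306)

    # Defect test: purposefully exercise the error handler.
    def test_non_numeric_rejected(self):
        with self.assertRaises(ValueError):
            parse_port("not-a-port")

    # Regression test added after a (hypothetical) bug report that
    # port 0 was accepted: it reproduces the bug, then guards the fix.
    def test_port_zero_rejected(self):
        with self.assertRaises(ValueError):
            parse_port("0")

if __name__ == "__main__":
    unittest.main(exit=False, verbosity=2)
```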

Types of Software Testing

Software testing is often conducted in a constrained process that begins with analyzing the system requirements and design. Tests are then created using the requirements and design to ensure the quality (correctness, robustness, usability, etc.) of the software. As I mentioned earlier, some tests are conducted to identify defects and others are used to verify functionality without errors (which is not the same as not having defects). The goal of some testing techniques is to establish a critique or assessment of the software. These tests are typically focused on qualitative factors rather than quantitative results.

Testing is part of a larger software engineering mantra that ensures the software meets its requirements and delivers the desired functionality. This process is sometimes referred to as verification and validation, and it is easy to confuse the two. Verification means ensuring that the software was built to its specifications and that the correct processes and methodologies were followed to create it. Validation means ensuring that the software meets the customer's actual needs. In other words, validation asks, "Did we build the right product?" and verification asks, "Did we build the product right?"

While many software-development processes include verification and validation activities, most developers refer to the portion of the process that validates that the specifications are met as software testing. Moreover, the validation process is typically associated with testing the functions of the system and the absence of defects in the functionality rather than the correctness of the software.

You can conduct many types of software testing. Indeed, there are often spirited discussions during early project planning about what type of testing should or should not be required. Fortunately, most developers agree that testing is a vital component of software development. In my experience, however, few understand the role of the different types of software testing. Only you can choose what is right for your project. My goal is to explain some of the more popular types of software testing so that you can apply the ones that make the most sense for your needs.

The following sections describe popular software-testing techniques, their goals and applications, and how they relate to continuous test-driven development. As you will see, the traditional stages of testing are milestones in the continuous testing effort.

Integration Testing

Integration testing is conducted as the system is assembled from its basic building blocks. Tests usually are written to test first a single component, then that component and another, and so on, until the entire system is integrated. This form of testing is most often used in larger development projects that are built using semi-independent components.

Component Testing

Component testing is conducted on a semi-independent portion (or component) of the system in an isolated test run. That is, the component is exercised by calling all its methods and interrogating all its attributes. Component tests are usually constructed in the form of test harnesses that provide all the external communication necessary to test a component. This includes any dependent components, which are simulated using code scaffolding (sometimes called mock or stub components). These code scaffolds provide all the input and output necessary to communicate and exercise the component being tested.
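The scaffolding idea can be shown in a few lines. This is a minimal sketch using Python's `unittest.mock`; `ReportGenerator` and its storage dependency are hypothetical components invented for the illustration.

```python
from unittest.mock import Mock

class ReportGenerator:
    """Component under test. It depends on a storage component that
    may not exist yet, so the test supplies a stub in its place."""
    def __init__(self, storage):
        self.storage = storage

    def summary(self):
        rows = self.storage.fetch_rows()
        return {"count": len(rows), "total": sum(r["amount"] for r in rows)}

# Code scaffolding: a mock stands in for the unfinished storage
# component, providing all the input the tested component needs.
stub_storage = Mock()
stub_storage.fetch_rows.return_value = [{"amount": 10}, {"amount": 32}]

report = ReportGenerator(stub_storage).summary()
print(report)  # {'count': 2, 'total': 42}
```

Because the stub supplies canned data, the component can be exercised in complete isolation, which is the essence of component testing.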

Interface Testing

Interface testing is conducted on the interface of the component itself rather than on the component. The purpose is to show that the interface provides all the functionality required. This type of testing is usually done in coordination with component testing.

Regression Testing

Regression testing ensures that any addition or correction of the software does not affect other portions of it. In this case, tests that were run in the past are run again and the results compared to the previous run. If the results are the same, the change did not affect the functionality (insofar as the test is written). This type of testing normally uses automated testing software that permits developers (or testers) to run the tests unattended. The results are then compared after the bulk of tests are completed. Automated testing is a popular concept in the agile development philosophy.
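The compare-against-previous-run idea reduces to a small sketch. The test names and canned outcomes here are invented for illustration; a real harness would run the actual test suite and archive its results between runs.

```python
import json

def run_tests():
    """Stand-in for the automated test run; returns name -> outcome."""
    return {"test_insert": "pass", "test_select": "pass", "test_join": "fail"}

def compare(previous, current):
    """Report every test whose outcome changed since the last run."""
    return {name: (previous.get(name), outcome)
            for name, outcome in current.items()
            if previous.get(name) != outcome}

# Results archived from the previous run (e.g., loaded from disk).
previous = json.loads(
    '{"test_insert": "pass", "test_select": "pass", "test_join": "pass"}')

regressions = compare(previous, run_tests())
print(regressions)  # {'test_join': ('pass', 'fail')}
```

An empty result means the change did not affect the tested functionality; a non-empty one pinpoints exactly which behavior regressed.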

Path Testing

Path testing ensures that all possible paths of execution are exercised. Tests are written with full knowledge of the source code (white-box testing) and are generally not concerned with conformance to specifications but rather with the system’s ability to accurately traverse all of its conditional paths. Many times, though, these tests are conducted with functionality in mind.
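A tiny example makes the path count concrete. The function below is hypothetical; with two independent conditions it has four execution paths, and one test per path exercises every branch.

```python
def classify(amount, is_member):
    """Two conditions yield four execution paths through this function."""
    if amount > 100:
        if is_member:
            return "big-member"
        return "big-guest"
    if is_member:
        return "small-member"
    return "small-guest"

# One test per path; together they traverse every conditional branch.
assert classify(150, True) == "big-member"
assert classify(150, False) == "big-guest"
assert classify(50, True) == "small-member"
assert classify(50, False) == "small-guest"
```

Note how quickly the path count grows: each additional independent condition doubles the number of paths, which is why path testing is written with full knowledge of the source code.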

Alpha-Stage Testing

Traditionally, alpha-stage testing begins once a stable development-quality system is ready. This is typically early in the process of producing software for production use. Testing at this stage is sometimes conducted to ensure the system has achieved a level of stability at which most of the functionality can be used (possibly with minor defects). This may include running a partial set of tests that validate that the system works under guarded conditions. Systems deemed alpha are normally mostly complete and may include some known defect issues, ranging from minor to moderate. Typically, passing alpha testing concludes the alpha stage, and the project moves on to the beta stage.

It is at this point that the system is complete enough that all tests run against actual code and no scaffolding (stubbed classes) is needed. When the test results satisfy the project parameters for what is considered a beta, the project moves on to the beta stage.

Beta-Stage Testing

A project is typically considered a stable production-quality system when it boasts a complete set of functionality but may include some features that have yet to be made efficient or may require additional robustness work (hardening). Tests run at this stage are generally the complete set of tests for the features being delivered. If defects are found, they are usually minor. This type of testing can include tests conducted by the target audience and the customer. These groups tend to be less scientific in their approach to testing, but they offer developers a chance to vet their system with the customer and make any minor course corrections to improve their product. Passing beta testing means the software is ready to be prepared for eventual release.

In a test-driven development environment, beta testing is another milestone in the continuing testing effort. A beta under a test-driven development is normally the point at which the majority of the features are performing well with respect to the test results. The level of stability of the system is usually judged as producing few defects.

Release, Functional, and Acceptance Testing

Release testing is usually functional testing that validates the system against its specifications; it is conducted prior to delivery of the system to the customer. As with the beta stage, some organizations choose to involve the customer in this stage of testing. In this case, the testing method is usually called acceptance testing, as it is the customers who decide that the software is validated to meet their specifications. A test-driven development environment would consider these milestones the completion of the tests.

Usability Testing

Usability testing is conducted after or near the completion of the system and is sometimes conducted parallel to functional and release testing. The goal of usability testing is to determine how well a user can interact with the system. There is usually no pass-or-fail result but rather a list of likes and dislikes. Though very subjective and based solely on the users’ preferences, usability testing can be helpful in creating software that can gain the loyalty of its users.

Usability testing is best completed in a laboratory designed to record the users’ responses and suggestions for later review. This allows the users to focus on the software without distractions. Most usability testing, however, is done in an informal setting where the developer observes the user using the system or where the user is given the software to use for a period of time and then her comments are taken as part of a survey or interview.

Reliability Testing

Reliability tests are usually designed to vary the load on the system and to challenge it with complex data in varying quantities, in order to determine how well the system continues to run over a period of time. Reliability is typically measured in the number of hours the system continues to function and the number of defects per hour or per test.

Performance Testing

Performance testing is conducted either to establish performance behaviors (benchmarking) or to ensure that the system performs within established guidelines. Aspects of the system being examined sometimes include reliability as well as performance. Performance under extreme loads (known as stress testing) is sometimes examined during this type of testing.

Note Usability, reliability, and performance testing are forms of testing that can be conducted in either a traditional testing or a test-driven development environment.

Test Design

Now that you have had a brief introduction to software testing and the types of testing that you can conduct in your own projects, let’s turn our attention to how tests are constructed. All the different philosophies for constructing tests ultimately intend to exercise, validate, or verify a certain aspect of the software or its process. Let’s look at three of the most prominent basic philosophies.

Specification-Based Tests

Specification-based tests (sometimes called functional tests) exercise the software requirements and design. The focus is to validate that the software meets its specification. These tests are usually constructed from (and based on) a given requirement or group of requirements. Tests are organized into functional sets (sometimes called test suites). As a system is being built, the test sets can be run whenever the requirements are completed or at any time later in the process to validate continued compliance with the requirements (also known as regression testing).

Partition Tests

Partition tests focus on the input and output data characteristics of the system. They test the outer, edge, and mean value ranges of the input or output data being tested. For example, suppose a system is designed to accept input of a positive integer value in the range of 1 to 10. You can form partitions (called equivalence partitions or domains) of this data by testing the values {0, 1, 5, 10, 11}. Some may take this further and include a negative value, such as −1. The idea is that if the system performs range checking, the boundary conditions are more likely to exhibit defects than the valid, or even wildly invalid, data.

In our earlier example, there is no need to test values greater than 11 unless you want to test the internal data-collection code (the part of the system that reads and interprets the input). Most modern systems use system-level calls to manage data entry that are, by their nature, very reliable (e.g., Microsoft Windows Forms). What is most interesting is that you can form partitions for the output data as well. In this case, the tests are designed to exercise how the system takes in known data (good or bad) and produces results (good or bad), attempting to validate the robustness as well as the accuracy of the processing of the input data. Partition testing is useful in demonstrating that the system meets performance and robustness requirements.
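To make this concrete, here is a sketch of boundary-value checks for the 1-to-10 example, written in the style of the MySQL test suite covered later in this chapter. The stored procedure accept_value is hypothetical; assume it signals SQLSTATE '45000' (reported as error 1644) for out-of-range input.

```
# Boundary-value checks for an input accepted in the range 1 to 10.
# accept_value() is a hypothetical procedure that signals SQLSTATE
# '45000' (error 1644) when its argument is out of range.
CALL accept_value(1);   # lower bound
CALL accept_value(5);   # mean value
CALL accept_value(10);  # upper bound
--error 1644
CALL accept_value(0);   # just below the lower bound
--error 1644
CALL accept_value(11);  # just above the upper bound
```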

Structural Tests

Structural tests (sometimes called architectural tests) ensure that the system is built according to the layout (or architecture) specified—that is, to verify that the system conforms to a prescribed construction. Tests of this nature are designed to ensure that certain interfaces are available and are working, and that components are working together properly. These categories of tests include all manner of white-box testing, where the goal is to exercise every path through the system (known as path testing). These tests can be considered of the verification variety, because they establish whether the architecture was built correctly and that it followed the prescribed process.

MySQL Testing

You can test the MySQL system in many ways. You can test the server connectivity and basic functionality using the mysqlshow command, run tests manually using the client tools, use the benchmarking tools to establish performance characteristics, and even conduct profiling on the server. The tools of choice for most database professionals are the MySQL Test Suite and the MySQL Benchmarking tool. The following sections describe each of these facilities and techniques.

Using the MySQL Test Suite

Oracle has provided a capable testing facility called the MySQL Test Suite: an executable named mysqltest and a collection of Perl modules and scripts designed to exercise the system and compare the results. Table 4-1 lists some of the pertinent directories and their contents. The test suite comes with the Unix/Linux binary and source distributions and is also included in some Mac and Windows distributions.

Table 4-1. Directories under the mysql-test Directory

collections: Groups of tests executed during integration and release testing

r: Main suite of result files

std_data: Test data for the test suite

suite: Also called "suites," a subfolder containing feature-specific tests, such as binlog, (storage) engines, and replication

t: The main suite tests

Note The MySQL Test Suite does not currently run in the Windows environment. This would be an excellent project to take on if you wanted to contribute to the development of MySQL through the MySQL code-contribution program. It can be run in the Cygwin environment if the Perl environment is set up and the Perl DBI modules are installed. See “Perl Installation Notes” in the MySQL Reference Manual for more details.

When MySQL is installed, you will find the mysql-test-run.pl Perl script in the mysql-test directory under the installation directory. Best of all, the test suite is extensible. You can write your own tests and conduct testing for your specific application or need. The tests are designed as regression tests in the sense that the tests are intended to be run to ensure all the functionality works as it has in the past.

The tests are located in a directory under the mysql-test directory named simply /t. This directory contains nearly 740 tests, with more than 3,400 tests among all of the suites. While that may sound comprehensive, the MySQL documentation states that the test suite does not cover all the features or nuances of the system. The current set of tests is designed to detect bugs in most SQL commands, the operating-system and library interactions, and cluster and replication functionality. Oracle hopes ultimately to accumulate enough tests to cover 100 percent of the features of the MySQL system. If you create additional tests that you feel cover a feature that isn’t already covered by one of the tests in the mysql-test/t directory, feel free to submit your tests to Oracle.

Tip You can find more information about the MySQL Test Suite by visiting the MySQL Internals mailing list (see http://lists.mysql.com/ for more details and to see the available lists). You can also submit your tests for inclusion by sending an e-mail message to the list. If you decide to send your tests to Oracle for inclusion in the test suite, be sure you are using data that you can show the world. The tests are available to everyone. For example, I am sure your friends and relatives would not want their phone numbers showing up in every installation of MySQL!

For each test, a corresponding result file is stored in the mysql-test/r directory. The result file contains the output of the test run and is used to compare (using the diff command) the results of the test as it is run. In many ways, the result file is the benchmark for the output of the test. This enables you to create tests and save the expected results, then run the test later and ensure that the system is producing the same output.

You must use this premise with some caution. Data values that, by their nature, change between executions can be used, but they require additional commands to handle properly. Unfortunately, data values such as these are ignored by the test suite rather than compared directly. Thus, time and date fields are data types that could cause some issues if used in a test. I discuss more on this topic and other commands later.

Running Tests

Running tests using the test suite is easy. Simply navigate to the mysql-test directory and execute the command ./mysql-test-run.pl. This will launch the test executable, connect to the server, and run all the tests in the /t directory. Because running all the tests could take some time, Oracle has written the test suite to allow you to execute several tests in order. For example, the following command will run just the tests named t1, t2, and t3:

% > ./mysql-test-run.pl t1 t2 t3

The test suite will run each test in order but will stop if any test fails. To override this behavior, use the --force command-line parameter to force the test suite to continue. For example:

% > ./mysql-test-run.pl --force t1 t2 t3

The test suite is designed to execute its own instance of the mysqld executable. This may conflict with another instance of the server running on your machine. You may want to shut down other instances of the MySQL server before running the test suite. If you use the test suite from the source directory, you can create the mysqld executable by compiling the source code. This is especially handy if you want to test something you’ve changed in the server but do not want to or cannot take your existing server down to do so.

Caution You can run the test suite alongside an existing server as long as the server is not using ports 13000 and above. If it is, the test suite may not run correctly, and you may need to stop the server or change it to use other ports.

If you want to connect to a specific server instance, you can use the --extern command-line parameter to tell the test suite to connect to the server. If you have additional startup commands or want to use a specific user to connect to the server, you can add those commands as well. For more information about the available command-line parameters to the mysql-test-run script, enter the command % > ./mysql-test-run.pl --help.

Visit http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html for more details.

Note Using the --extern command-line parameter requires that you also include the name of the tests you want to execute. Some tests require a local instance of the server to execute. For example, the following command connects to a running server and executes the alias and analyze tests: perl mysql-test-run.pl --extern alias analyze.

Creating a New Test

To create your own test, use a standard text editor to create the test in the /t directory in a file named mytestname.test. For example, I created a sample test named cab.test (see Listing 4-1).

Listing 4-1. Sample Test

# Sample test to demonstrate MySQL Test Suite
#
# Create a table, insert some data, and issue a few selects.
--disable_warnings
DROP TABLE IF EXISTS characters;
--enable_warnings
CREATE TABLE characters (ID int NOT NULL PRIMARY KEY,
LastName varchar(40),
FirstName varchar(20),
Gender varchar(2)) ENGINE = MYISAM;
EXPLAIN characters;
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (3, 'Flintstone', 'Fred', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (5, 'Rubble', 'Barney', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (7, 'Flintstone', 'Wilma', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (9, 'Flintstone', 'Dino', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (4, 'Flintstone', 'Pebbles', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (1, 'Rubble', 'Betty', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (6, 'Rubble', 'Bam-Bam', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (8, 'Jetson', 'George', 'M');
SELECT * FROM characters;
# Cleanup
DROP TABLE characters;
# . . .and we're done.

Notice that the contents of the test are simply SQL commands that create a table, insert some data, and then issue a few simple selects. Most tests are a bit more complex than this, but you get the idea. You create your test to exercise some set of commands (or data handling).

Notice the first six lines. The first three are comment lines; comment lines begin with a # symbol. Always document your tests with a minimal explanation at the top of the file to indicate what the test is doing. Use comments in the body of the test to explain any commands that aren’t easily understood (e.g., complex joins or user-defined functions). The fourth and sixth lines are interesting because they issue commands to the test suite. Test-suite commands always begin on a line with -- in front of them. These lines direct the test suite to temporarily disable and then enable any warning messages from the server. This is necessary in case the table (characters) does not already exist. If I had left the warnings enabled, the test would have failed under this condition because:

· The server would have issued a warning, or

· The output would not match the expected results.

The general layout of your tests should include a cleanup section at the beginning to remove any tables or views that may exist as a result of a failed test. The body of the test should include all the necessary statements to complete the test, and the end of the test should include cleanup statements to remove any tables or views you’ve created in the test.
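As a rough sketch of that layout (the table name and statements here are only placeholders), a minimal test might look like this:

```
# mytest.test: brief description of what this test exercises
# Cleanup section: remove leftovers from any earlier failed run
--disable_warnings
DROP TABLE IF EXISTS t1;
--enable_warnings
# Body: create the schema, load data, and exercise the feature
CREATE TABLE t1 (a int) ENGINE = MYISAM;
INSERT INTO t1 VALUES (1), (2);
SELECT * FROM t1;
# Cleanup section: remove objects created by this test
DROP TABLE t1;
```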

Tip When writing your own tests, Oracle requests that you use table names such as t1, t2, and t3, and view names such as v1, v2, and v3, so that your test tables do not conflict with existing ones.

Running the New Test

Once the test is created, you need to execute the test and create the baseline of expected results. Execute the following commands to run the newly created test named cab.test from the mysql-test directory:

% > touch r/cab.result
% > ./mysql-test-run.pl cab
% > cp r/cab.reject r/cab.result
% > ./mysql-test-run.pl cab

The first command creates an empty result file. This ensures that the test suite has something to compare to. The next command runs the test for the first time. Listing 4-2 depicts a typical first-run test result. Notice that the test suite indicated that the test failed and generated a difference report. This is because there were no results to compare to. I have omitted a number of the more mundane statements for brevity.

Listing 4-2. Running a New Test for the First Time

Logging: ./mysql-test-run.pl cab.test
120620 9:48:44 [Note] Plugin 'FEDERATED' is disabled.
120620 9:48:44 [Note] Binlog end
120620 9:48:44 [Note] Shutting down plugin 'CSV'
120620 9:48:44 [Note] Shutting down plugin 'MyISAM'
MySQL Version 5.6.6
Checking supported features. . .
- skipping ndbcluster
- SSL connections supported
- binaries are debug compiled
Collecting tests. . .
Checking leftover processes. . .
Removing old var directory. . .
Creating var directory '/source/mysql-5.6/mysql-test/var'. . .
Installing system database. . .
Using server port 49261



main.cab [ fail ]
Test ended at 2012-06-20 09:48:49

CURRENT_TEST: main.cab
--- /source/mysql-5.6/mysql-test/r/cab.result 2012-06-20 16:48:40.000000000 +0300
+++ /source/mysql-5.6/mysql-test/r/cab.reject 2012-06-20 16:48:49.000000000 +0300
@@ -0,0 +1,52 @@

[. . .]

+DROP TABLE characters;

mysqltest: Result length mismatch

- saving '/source/mysql-5.6/mysql-test/var/log/main.cab/' to '/source/mysql-5.6/mysql-test/var/log/main.cab/'
The servers were restarted 0 times
Spent 0.000 of 5 seconds executing testcases

Completed: Failed 1/1 tests, 0.00 % were successful.

Failing test(s): main.cab

The log files in var/log may give you some hint of what went wrong.

If you want to report this error, please read first the documentation
at http://dev.mysql.com/doc/mysql/en/mysql-test-suite.html

mysql-test-run: *** ERROR: there were failing test cases

The difference report shows the entire test result because our result file was empty. Had we run a test that was modified, the difference report would show only those portions that differ. This is a very powerful tool that developers use to run regression tests whenever they make changes to the code. It shows them exactly how their code affects the known outcome of the system operation.

The next command copies the newest results from the cab.reject file over the cab.result file. You only do this step once you are certain that the test runs correctly and that there are no unexpected errors. One way to ensure this is to run the test statements manually and verify that they work correctly. Only then should you copy the reject file to a result file. Listing 4-3 depicts the result file for the new test. Notice that the output is exactly what you would expect to see from a manual execution, minus the usual pretty printout and column spacing.

Listing 4-3. The Result File

DROP TABLE IF EXISTS characters;
CREATE TABLE characters (ID int NOT NULL PRIMARY KEY,
LastName varchar(40),
FirstName varchar(20),
Gender varchar(2)) ENGINE = MYISAM;
EXPLAIN characters;
Field Type Null Key Default Extra
ID int(11) NO PRI NULL
LastName varchar(40) YES NULL
FirstName varchar(20) YES NULL
Gender varchar(2) YES NULL
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (3, 'Flintstone', 'Fred', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (5, 'Rubble', 'Barney', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (7, 'Flintstone', 'Wilma', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (9, 'Flintstone', 'Dino', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (4, 'Flintstone', 'Pebbles', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (1, 'Rubble', 'Betty', 'F');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (6, 'Rubble', 'Bam-Bam', 'M');
INSERT INTO characters (ID, LastName, FirstName, Gender)
VALUES (8, 'Jetson', 'George', 'M');
SELECT * FROM characters;
ID LastName FirstName Gender
3 Flintstone Fred M
5 Rubble Barney M
7 Flintstone Wilma F
9 Flintstone Dino M
4 Flintstone Pebbles F
1 Rubble Betty F
6 Rubble Bam-Bam M
8 Jetson George M
id select_type table type possible_keys key key_len ref rows Extra
1 SIMPLE characters ALL NULL NULL NULL NULL 8 Using temporary
DROP TABLE characters;

Last, we rerun the test using the expected results, and the test suite reports that the test passed. Listing 4-4 depicts a typical test result.

Listing 4-4. A Successful Test Run

Logging: ./mysql-test-run.pl cab.test
120620 9:53:19 [Note] Plugin 'FEDERATED' is disabled.
120620 9:53:19 [Note] Binlog end
120620 9:53:19 [Note] Shutting down plugin 'CSV'
120620 9:53:19 [Note] Shutting down plugin 'MyISAM'
MySQL Version 5.6.6
Checking supported features. . .
- skipping ndbcluster
- SSL connections supported
- binaries are debug compiled
Collecting tests. . .
Checking leftover processes. . .
Removing old var directory. . .
Creating var directory '/source/mysql-5.6/mysql-test/var'. . .
Installing system database. . .
Using server port 49273


main.cab [ pass ] 28
The servers were restarted 0 times
Spent 0.028 of 7 seconds executing testcases

Completed: All 1 tests were successful.

Creating your own tests and running them is easy. You can repeat the process I just described as many times as you want for as many tests as you want. This process follows the spirit of test-driven development by first creating the test, running it without proof of results, creating the solution (the expected results), and then executing the test and verifying successful test completion. I encourage you to adopt the same philosophy when creating your own MySQL applications and especially when extending the MySQL server.

For example, say you want to create a new SHOW command. In this case, you should create a new test to execute the new command, run it, and establish the test results. Naturally, the test will fail every time until you actually create the new command. The benefit is that it allows you to focus on the results of the command and what the command syntax should be prior to actually writing the code. If you adopt this practice for all your development, you won’t regret it and will see dividends in the quality of your code. Once you have implemented the command and verified that it works by running the test again and examining the reject file (or running the command manually), you can copy the reject file to the result file, which the test suite will use for verification (pass/fail) in later test runs.

Advanced Tests

The MySQL Test Suite provides a rich set of commands you can use to create powerful tests. This section introduces some of the more popular and useful commands. Unfortunately, no comprehensive document explains all the available commands. The following are those that I found by exploring the supplied tests and online posts.

Tip If you use the advanced test-suite commands, you can create the result file using the --record command-line parameter to record the proper results. For example, you can run the command ./mysql-test-run.pl --record cab to record the results of the cab test file.

If you’re expecting a certain error to occur (say you’re testing that an error is correctly raised rather than testing for its absence), you can use the --error num command. This command tells the test suite that you expect the error specified and that it should not fail the test when that error occurs. The command is designed to precede the command that produces the error. You can also specify additional error numbers, separated by commas. For example, --error 1550,1530 indicates that these (fictional) errors are permitted for the command that follows.
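For example, the following sketch expects a duplicate-key violation; 1062 is the server’s real ER_DUP_ENTRY error code, and the test passes only if the second INSERT raises it:

```
CREATE TABLE t1 (a int PRIMARY KEY);
INSERT INTO t1 VALUES (1);
# The next statement must fail with a duplicate-key error to pass.
--error 1062
INSERT INTO t1 VALUES (1);
DROP TABLE t1;
```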

You can also use flow of control code inside your test. For example, you can use a loop to execute something for a fixed number of times. The following code executes a command 100 times:

let $1 = 100;
while ($1)
{
  # Insert your commands here
  dec $1;
}

Another useful command is sleep, which takes as a parameter the number of seconds to pause before executing the next command. For example, --real_sleep 3.5 tells the test suite to pause for 3.5 seconds before executing the next command (real_sleep always sleeps the full time, whereas plain sleep can be shortened globally by the test runner). This command can help if there is unexpected latency in the network or if tests are failing due to heavy traffic. Using the sleep command allows you to slow down the test, thereby reducing any interference due to poor performance by external factors.
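As a sketch (assuming a table t1 created earlier in the test), a pause before a verification query looks like this:

```
INSERT INTO t1 VALUES (1);
# Wait 3.5 seconds for background activity to settle before
# verifying the row arrived (real_sleep is never shortened).
--real_sleep 3.5
SELECT * FROM t1;
```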

Using sleeps to make tests deterministic by papering over timing issues within the server code is a poor practice, however. For example, inserting sleeps to coordinate queries run from multiple connections is a bad idea. Given the nature of multithreaded execution, sleeps alone are insufficient for coordinating results: a sleep may make a test deterministic on one machine (in this case, connection 1 returning results before connection 2), yet it may not work on another machine.

If you are interested in seeing additional information about a command, use the --enable_metadata command. This produces and displays internal metadata that may assist you in debugging commands for a complex test. Similarly, if you want to suppress the recording of the output, use --disable_result_log to turn off recording and --enable_result_log to turn it back on.

If you have commands that result in data that may change between runs (such as date or time fields), tell the test suite to ignore those values by substituting another character string using the --replace_column column string command. For example, if your output produces the current time in the second column (column counting begins at 1, not 0), you can use the command --replace_column 2 CURTIME. This tells the test suite that the output from the next command is to have column 2 replaced with the string “CURTIME.” While this does suppress the actual value in the output, it provides a way to ignore those values that cannot be predicted because they change between test runs.
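For instance, the following sketch (assuming a table t1 from earlier in the test) masks a time-valued second column so that the comparison against the result file ignores it:

```
# Column 2 (NOW()) changes on every run, so replace it with a
# constant string before the output is compared.
--replace_column 2 CURTIME
SELECT ID, NOW() FROM t1;
```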

Finally, if you need to include additional test commands within a test, you can use the --source include/filetoinclude.inc command to include a file from the mysql-test/include directory. This practice is typical in tests that form a test suite with a set of commonly used commands.
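For example, many of the supplied tests begin by sourcing a feature-check include; have_innodb.inc is one of the includes that ships with the suite:

```
# Skip this test when the server was built without InnoDB support.
--source include/have_innodb.inc
```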

Reporting Bugs

You could find a bug as the result of running one of the tests or in the creation of your own test. Oracle welcomes feedback on the test suite and has provided a means of reporting bugs. Before you fire up your e-mail and crank out an intensive report of the failure, however, be sure to confirm the bug thoroughly.

Oracle asks that you run the test on its own and discover the exact command and error codes behind the failure. First, determine whether the errors are the result of your environment (see “Operating System-Specific Notes” in the MySQL Reference Manual for potential issues; visit http://dev.mysql.com/doc/refman/5.6/en/installing.html for more details) by running the test either on a fresh installation or on another known-good installation. You should also run the commands in the test manually to confirm the error and error codes. Sometimes running the commands manually will reveal additional information you could not get otherwise. It may also help to run the server in debug mode. Last, if the test and error conditions can be repeated, send the test file, test results, test reject file, and any test data to MySQL when you submit your bug report.

MySQL Benchmarking

Oracle has provided a capable benchmarking facility called the MySQL Benchmarking Suite: a collection of Perl modules and scripts designed to exercise the system while saving the performance metrics. The benchmarking suite comes with most binary and source distributions and can be run on Windows. When MySQL is installed, you will find the run-all-tests script in the sql-bench directory under the installation directory. The tests are designed in the regression-test sense in that they are intended to record the performance of the system under current conditions. The benchmarking suite is also available as a separate download for most operating systems from the MySQL developer web site (http://dev.mysql.com).

Note You will need to install the Perl DBI and DBD::mysql modules to use the benchmarking tools. You can download these from the MySQL download page: http://dev.mysql.com/downloads/dbi.html.

Like most benchmarking tools, the MySQL Benchmarking Suite is best used to determine the effects of changes to the system and the environment. The benchmarking suite differs somewhat from the testing suite in that it can run benchmarks against other database systems; you can use it to run the same benchmarks against your MySQL, Oracle, and Microsoft SQL Server installations. As you can imagine, doing so can help you determine how much better MySQL performs in your environment than your existing database system. To run the benchmarks against the other servers, use the --server='server' command-line switch. Values for this parameter include MySQL, Oracle, Informix, and MS-SQL.

A host of command-line parameters control the benchmarking suite. Table 4-2 lists a few popular ones and an explanation of each. See the README file in the sql-bench directory for more information about the command-line parameters.

Table 4-2. Command-Line Parameters for the MySQL Benchmarking Suite

--log: Saves the results of the benchmarks to a file. Use with the --dir option to specify a directory to store the results in. Result files are named using the output of the Unix command uname -a.

--user: Specifies the user to log in to the server.

--password: Specifies the password of the user for logging in to the server.

--host: Specifies the hostname of the server.

--small-test: Specifies running the minimal benchmarking tests. Omitting this parameter executes the entire benchmarking suite of tests. For most uses, the small tests are adequate for determining the more common performance metrics.

To run the benchmarking suite of tests, simply navigate to the sql-bench directory under your installation and run the command run-all-tests. Notice one important characteristic of the benchmarking suite: all tests are run serially, one at a time. To test the performance of multiple processes or threads, use a third-party benchmarking suite, such as SysBench or DBT2.

Another limitation of the benchmarking suite is that it is not currently extensible. That is, there is no facility to create your own tests for your own application. The source code is freely distributed, however, so those of you well versed in Perl can have at it. If you do create your own tests, be sure to share them with the global community of developers. You never know—someone might need the test you create.


SysBench and DBT2 are benchmarking suites designed to test the system under load. They are available at http://sourceforge.net/projects/sysbench/, and http://osdldbt.sourceforge.net/#dbt2.

image Tip For best results, disable the MySQL query cache before running benchmarks. Turn off the query cache by issuing the command SET GLOBAL query_cache_size = 0; in the MySQL client interface. This allows your benchmarks to record the actual execution time of the queries rather than the time the system takes to retrieve the results from the cache. You'll get a more accurate reading of the performance of your system.

If the base set of benchmarks is all that you need, you can run the command run-all-tests --small-test and generate the results for the basic set of tests. While running all of the tests ensures a more thorough measurement of the performance, it can also take a long time to complete. If, on the other hand, you identify a particular portion of the system you want to measure, you can run an individual test by executing the test independently. For example, to test the connection to the server, you can run the command test-connect. Table 4-3 lists a few of the independent tests available.

Table 4-3. Partial List of Benchmarking Tests

Test

Description

test-ATIS

Creates a number of tables and runs several selects on them

test-connect

Tests the connection speed to the server

test-create

Tests how quickly a table is created

test-insert

Tests create and fill operations on a table

test-wisconsin

Runs a port of the PostgreSQL version of this benchmark

image Note The benchmarking suite runs the tests in a single thread. Oracle has plans to add multi-threaded tests to the benchmark suite in the future.

For more information about other forms of benchmarking available for MySQL, see Michael Kruckenberg and Jay Pipes’s Pro MySQL.8 It is an excellent reference for all things MySQL.

Running the Small Tests

Let’s examine what you can expect when you run the benchmarking tools on your system. In this example, I ran the benchmarking suite using the small tests on my Linux system. Listing 4-5 shows the top portion of the output file generated.

Listing 4-5. Excerpt of Small Tests Benchmark

$ ./run-all-tests --small-test --password=XXX --user=root
Benchmark DBD suite: 2.15
Date of test: 2012-06-20 10:39:38
Running tests on: Linux 2.6.38-15-generic x86_64
Arguments: --small-test
Limits from:
Server version: MySQL 5.1.63 0ubuntu0.11.04.1
Optimization: None

alter-table: Total time: 1 wallclock secs ( 0.02 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.03 CPU)
ATIS: Total time: 2 wallclock secs ( 0.75 usr 0.23 sys + 0.00 cusr 0.00 csys = 0.98 CPU)
big-tables: Total time: 0 wallclock secs ( 0.06 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.07 CPU)
connect: Total time: 1 wallclock secs ( 0.39 usr 0.14 sys + 0.00 cusr 0.00 csys = 0.53 CPU)
create: Total time: 1 wallclock secs ( 0.02 usr 0.01 sys + 0.00 cusr 0.00 csys = 0.03 CPU)
insert: Total time: 3 wallclock secs ( 1.22 usr 0.24 sys + 0.00 cusr 0.00 csys = 1.46 CPU)
select: Total time: 3 wallclock secs ( 1.27 usr 0.21 sys + 0.00 cusr 0.00 csys = 1.48 CPU)
transactions: Test skipped because the database doesn't support transactions
wisconsin: Total time: 4 wallclock secs ( 1.40 usr 0.49 sys + 0.00 cusr 0.00 csys = 1.89 CPU)

All 9 test executed successfully

At the top of the listing, the benchmarking suite gives the metadata describing the tests run, including the date the tests were run, the version of the operating system, the version of the server, and any special optimization or hardware installed (in this case, none). Following the metadata are the results of each group of tests, reported in elapsed wallclock seconds. The times indicated in the parentheses are the times consumed by the benchmark suite itself and should be deducted from the wallclock seconds for accurate timings. Don't be too concerned about this, as this section is mostly used for a brief look at the tests in groups. The next section is the most interesting of all because it contains the actual data collected during each test. The results of the example benchmark tests are shown in Table 4-4. Notice there are a lot of operations being tested.

Table 4-4. Specific Test Result Data of the Small Tests Run (Totals per Operation)


When performing benchmarks, I like to convert the latter part of the listing to a spreadsheet so that I can perform statistical analysis on the results. This also allows me to perform calculations using the expected, before, and after results. Table 4-4 shows the time spent for each operation in total seconds, the time spent in the benchmarking tools (usr, sys, cpu), and the number of tests run for each operation.
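The conversion can also be scripted. Below is a minimal Python sketch (not part of the sql-bench distribution) that parses "Total time" summary lines like those in Listing 4-5 into a structure suitable for a spreadsheet or further analysis; the regular expression and the embedded sample lines are illustrative assumptions.

```python
import re

# Sample summary lines, copied from the style of Listing 4-5.
SUMMARY = """\
alter-table: Total time:  1 wallclock secs ( 0.02 usr  0.01 sys +  0.00 cusr  0.00 csys =  0.03 CPU)
connect: Total time:  1 wallclock secs ( 0.39 usr  0.14 sys +  0.00 cusr  0.00 csys =  0.53 CPU)
insert: Total time:  3 wallclock secs ( 1.22 usr  0.24 sys +  0.00 cusr  0.00 csys =  1.46 CPU)
"""

# Hypothetical pattern for one summary line: test name, wallclock
# seconds, then the usr/sys/CPU overhead figures in parentheses.
LINE = re.compile(
    r"^(?P<test>[\w-]+): Total time:\s+(?P<wall>\d+) wallclock secs"
    r" \(\s*(?P<usr>[\d.]+) usr\s+(?P<sys>[\d.]+) sys"
    r".*=\s*(?P<cpu>[\d.]+) CPU\)"
)

def parse_summary(text):
    """Return {test_name: {wall, usr, sys, cpu}} for each summary line."""
    results = {}
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            results[m.group("test")] = {
                "wall": int(m.group("wall")),
                "usr": float(m.group("usr")),
                "sys": float(m.group("sys")),
                "cpu": float(m.group("cpu")),
            }
    return results

results = parse_summary(SUMMARY)
total_wall = sum(r["wall"] for r in results.values())
print(total_wall)  # 5
```

From here the per-test dictionaries can be written to CSV and loaded into a spreadsheet for the before/after analysis described above.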

Notice that at the bottom of Table 4-4 the columns are summed, giving you the total time spent executing the benchmark tests. This information, combined with that in Listing 4-1, forms the current baseline of the performance of my Linux system. I encourage you to create and archive your own benchmarks for your database servers.

Running a Single Test

Suppose you are interested in running the benchmark for creating tables. As shown in Table 4-3, the test is named test-create. To run it, I navigated to the sql-bench directory and entered the command ./test-create. Listing 4-6 shows the results of running this command on my Linux system.

Listing 4-6. Output of the test-create Benchmark Test

$ ./test-create --user=root --password=XXXX
Testing server 'MySQL 5.1.63 0ubuntu0.11.04.1' at 2012-06-20 10:40:59

Testing the speed of creating and dropping tables
Testing with 10000 tables and 10000 loop count

Testing create of tables
Time for create_MANY_tables (10000): 85 wallclock secs ( 0.87 usr 0.18 sys + 0.00 cusr 0.00 csys = 1.05 CPU)

Accessing tables
Time to select_group_when_MANY_tables (10000): 2 wallclock secs ( 0.22 usr 0.12 sys + 0.00 cusr 0.00 csys = 0.34 CPU)

Testing drop
Time for drop_table_when_MANY_tables (10000): 1 wallclock secs ( 0.18 usr 0.09 sys + 0.00 cusr 0.00 csys = 0.27 CPU)

Testing create + drop
Time for create + drop (10000): 83 wallclock secs ( 1.18 usr 0.50 sys + 0.00 cusr 0.00 csys = 1.68 CPU)
Time for create_key + drop (10000): 86 wallclock secs ( 1.45 usr 0.43 sys + 0.00 cusr 0.00 csys = 1.88 CPU)
Total time: 257 wallclock secs ( 3.90 usr 1.32 sys + 0.00 cusr 0.00 csys = 5.22 CPU)

In Listing 4-6, you see the typical parameters captured for each test run. Notice that the test is designed to run many iterations of the same test. This is necessary to ensure that the timings aren’t dependent on any single event and have more meaning when used as a set.

I chose this example so that you can consider another use of benchmarking. Suppose you want to create a new SQL CREATE command. In this case, you can modify the test-create script to include tests of your new command. Then, later, run the benchmark tests to establish the baseline performance of your new command. This is a powerful tool to use in your extension of the MySQL system. I encourage you to explore this option if you have any performance or scalability requirements or concerns about your extensions.

Applied Benchmarking

I return to this topic before moving on because it is important to understand and appreciate the benefits of benchmarking. Benchmarking is useful only if you archive your results. I find the best solution is to tuck the results away in individual directories named by the date the benchmarks were taken. I recommend placing the output files (from the --log parameter), along with a short description of the current configuration of the system and the environment (use your favorite system-inspection software to capture this), into a separate directory for each set of benchmarking tests.

If I need to compare the performance of the system to a known state—for example, whenever I change a server variable and want to see its effect on performance—I can run the benchmarking tools before and after I make the change. Then I can look back through the history of the benchmarks and compare these results with the most stable state. This approach also allows me to track changes in system performance over time.
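The before/after comparison reduces to a simple percent-change calculation over two archived baselines. The sketch below uses invented wallclock totals purely for illustration; a negative value means the later run was faster.

```python
# Hypothetical baselines: total wallclock seconds per test, as recorded
# in two archived benchmark runs. The numbers are invented examples.
before = {"connect": 1.0, "insert": 3.0, "select": 3.0}
after = {"connect": 1.0, "insert": 2.0, "select": 4.0}

def percent_change(before, after):
    """Return {test: percent change}; negative means the test got faster."""
    changes = {}
    for test, old in before.items():
        new = after.get(test)
        if new is None or old == 0:
            continue  # skip tests missing from the later run
        changes[test] = round(100.0 * (new - old) / old, 1)
    return changes

print(percent_change(before, after))
# {'connect': 0.0, 'insert': -33.3, 'select': 33.3}
```

Running this against each dated directory of results gives a quick view of how a configuration change moved each test relative to the stable state.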

Benchmarking used in this way will enable you to manage your systems on a level few have achieved otherwise.

MySQL Profiling

Although no formal profiling tool or suite is included in the MySQL server suite of tools (or the source distribution), a number of diagnostic utilities are available that can serve as a simple set of profiling techniques. For example, you can check the status of thread execution, examine the server logs, and even examine how the optimizer will execute a query.

To see a list of the current threads, use the MySQL SHOW FULL PROCESSLIST command. This command shows all the current processes, or threads, running; the user running them; the host the connection was issued from; the database being used; the current command; the execution time; the state; and any additional information provided by the thread. For example, if I ran the command on my system, the results would be something like what is shown in Listing 4-7.

Listing 4-7. Output of the SHOW FULL PROCESSLIST Command

*************************** 1. row ***************************
Id: 7
User: root
Host: localhost:1175
db: test
Command: Query
Time: 0
State: NULL
1 row in set (0.00 sec)

This example shows that I am the only user connected and running from the local host. The example shows the connection executed a query with an execution time of 0. The example also shows the command issued. The downside to this command is that it is a snapshot in time and must be run many times to detect patterns of performance bottlenecks. Fortunately, you can use a tool called mytop that repeatedly calls the command and displays several useful views of the data. For more information or to download mytop, see Jeremy Zawodny’s web site (http://jeremy.zawodny.com/mysql/mytop).

image Note The mytop application has had limited success on the Windows platform.

Another useful command for displaying server information is SHOW STATUS. This command displays all of the server status variables. As you can imagine, that is a very long list. Fortunately, you can limit the display by passing the command a LIKE clause. For example, to see the thread information, enter the command SHOW STATUS LIKE "threads%";. Listing 4-8 shows the results of this command.

Listing 4-8. The SHOW STATUS Command

mysql> SHOW STATUS LIKE "threads%";
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| Threads_cached    | 0     |
| Threads_connected | 1     |
| Threads_created   | 6     |
| Threads_running   | 1     |
+-------------------+-------+
4 rows in set (0.00 sec)

To capture slow queries, enable the slow-query log by setting the log-slow-queries option, and set the query timeout using the long-query-time variable. Typical values for the long-query timeout vary, but it should be set to your own concept of what constitutes a long query. To display the slow queries, use the mysqldumpslow command, which groups the slow queries by similarity (also called buckets). The additional metadata provided includes information on locks, expected versus actual rows generated, and timing data.
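To make the bucketing idea concrete, here is a simplified Python sketch of the kind of normalization mysqldumpslow performs: literal strings and numbers are abstracted away so that queries differing only in their values fall into the same bucket. This illustrates the technique; it is not the tool's actual implementation.

```python
import re

def normalize(query):
    """Abstract literals: quoted strings become 'S', numbers become N."""
    q = re.sub(r"'[^']*'", "'S'", query)  # strings first, so digits inside them are kept out
    q = re.sub(r"\b\d+\b", "N", q)
    return q

# Invented slow-log entries for illustration.
queries = [
    "SELECT * FROM rental WHERE customer_id = 42",
    "SELECT * FROM rental WHERE customer_id = 7",
    "SELECT * FROM film WHERE title = 'ALIEN CENTER'",
]

# Group queries into buckets keyed by their normalized form.
buckets = {}
for q in queries:
    buckets.setdefault(normalize(q), []).append(q)

for pattern, members in buckets.items():
    print(len(members), pattern)
# 2 SELECT * FROM rental WHERE customer_id = N
# 1 SELECT * FROM film WHERE title = 'S'
```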

The general query log can be examined using the MySQL Workbench software. You can view all of the logs provided you are connected to the server locally. If you have never used MySQL Workbench, download it from http://dev.mysql.com/downloads and give it a try.

image Tip You can use the MySQL Workbench software to control almost every aspect of the server, including startup settings, logging, and variables.

You can also examine resources used during a session by using the SHOW PROFILES command. To use this command, profiling must be turned on and is controlled with the profiling session variable. To turn profiling on, issue the SET PROFILING = ON; command.

image Note Profiling is only partially supported on some platforms. Platforms that do not support the getrusage() method return NULL. Also, profiling is per-process and may be affected by activity on other threads.

Once profiling is on, issuing the SHOW PROFILES command will display a list of the fifteen most recent queries issued in the current session. You can change the size of the list by modifying the value for the profiling_history_size variable. This list will show the query id (sequential), duration, and command.

To see specifics for a particular query, use the SHOW PROFILE FOR QUERY N command. This displays the state and duration of each step as the query was executed. You can add options such as BLOCK IO, CONTEXT SWITCHES, or CPU, or use ALL, to show more details. For a complete listing of the options for the SHOW PROFILE FOR QUERY N command, see the syntax in the online MySQL Reference Manual at http://dev.mysql.com/doc/refman/5.6/en/show-profiles.html. An example use of these commands is shown in Listing 4-9.

Listing 4-9. The SHOW PROFILES and SHOW PROFILE commands

mysql> SHOW VARIABLES LIKE 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | OFF   |
+---------------+-------+
1 row in set (0.00 sec)

mysql> SET profiling = ON;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'profiling';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| profiling     | ON    |
+---------------+-------+
1 row in set (0.00 sec)

mysql> use sakila;
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Database changed
mysql> show full tables;
+----------------------------+------------+
| Tables_in_sakila           | Table_type |
+----------------------------+------------+
| actor                      | BASE TABLE |
| actor_info                 | VIEW       |
| address                    | BASE TABLE |
| category                   | BASE TABLE |
| city                       | BASE TABLE |
| country                    | BASE TABLE |
| customer                   | BASE TABLE |
| customer_list              | VIEW       |
| film                       | BASE TABLE |
| film_actor                 | BASE TABLE |
| film_category              | BASE TABLE |
| film_list                  | VIEW       |
| film_text                  | BASE TABLE |
| inventory                  | BASE TABLE |
| language                   | BASE TABLE |
| nicer_but_slower_film_list | VIEW       |
| payment                    | BASE TABLE |
| rental                     | BASE TABLE |
| sales_by_film_category     | VIEW       |
| sales_by_store             | VIEW       |
| staff                      | BASE TABLE |
| staff_list                 | VIEW       |
| store                      | BASE TABLE |
+----------------------------+------------+
23 rows in set (0.00 sec)

mysql> select * from sales_by_store;
+---------------------+--------------+-------------+
| store               | manager      | total_sales |
+---------------------+--------------+-------------+
| Woodridge,Australia | Jon Stephens |    33726.77 |
| Lethbridge,Canada   | Mike Hillyer |    33679.79 |
+---------------------+--------------+-------------+
2 rows in set (0.05 sec)

mysql> show profiles;
+----------+------------+---------------------------------+
| Query_ID | Duration   | Query                           |
+----------+------------+---------------------------------+
|        1 | 0.00039000 | SHOW VARIABLES LIKE 'profiling' |
|        2 | 0.00008800 | SELECT DATABASE()               |
|        3 | 0.00023600 | show databases                  |
|        4 | 0.00020100 | show tables                     |
|        5 | 0.00078500 | show full tables                |
|        6 | 0.04890200 | select * from sales_by_store    |
+----------+------------+---------------------------------+
6 rows in set (0.00 sec)

mysql> show profile for query 6;
+----------------------+----------+
| Status               | Duration |
+----------------------+----------+
| starting             | 0.000034 |
| checking permissions | 0.000006 |
| Opening tables       | 0.000200 |
| checking permissions | 0.000004 |
| checking permissions | 0.000002 |
| checking permissions | 0.000001 |
| checking permissions | 0.000002 |
| checking permissions | 0.000001 |
| checking permissions | 0.000002 |
| checking permissions | 0.000001 |
| checking permissions | 0.000047 |
| init                 | 0.000011 |
| System lock          | 0.000010 |
| optimizing           | 0.000003 |
| optimizing           | 0.000016 |
| statistics           | 0.000064 |
| preparing            | 0.000020 |
| Creating tmp table   | 0.000013 |
| Sorting for group    | 0.000006 |
| Sorting result       | 0.000003 |
| statistics           | 0.000004 |
| preparing            | 0.000004 |
| executing            | 0.000007 |
| Sending data         | 0.000006 |
| executing            | 0.000002 |
| Sending data         | 0.048299 |
| Creating sort index  | 0.000033 |
| removing tmp table   | 0.000006 |
| Creating sort index  | 0.000010 |
| end                  | 0.000003 |
| query end            | 0.000006 |
| closing tables       | 0.000002 |
| removing tmp table   | 0.000003 |
| closing tables       | 0.000018 |
| freeing items        | 0.000038 |
| cleaning up          | 0.000015 |
+----------------------+----------+
36 rows in set (0.00 sec)

mysql> show profile CPU for query 6;
+----------------------+----------+----------+------------+
| Status               | Duration | CPU_user | CPU_system |
+----------------------+----------+----------+------------+
| starting             | 0.000034 | 0.000030 |   0.000004 |
| checking permissions | 0.000006 | 0.000005 |   0.000001 |
| Opening tables       | 0.000200 | 0.000150 |   0.000050 |
| checking permissions | 0.000004 | 0.000001 |   0.000003 |
| checking permissions | 0.000002 | 0.000001 |   0.000000 |
| checking permissions | 0.000001 | 0.000001 |   0.000001 |
| checking permissions | 0.000002 | 0.000001 |   0.000001 |
| checking permissions | 0.000001 | 0.000001 |   0.000000 |
| checking permissions | 0.000002 | 0.000001 |   0.000001 |
| checking permissions | 0.000001 | 0.000000 |   0.000001 |
| checking permissions | 0.000047 | 0.000047 |   0.000001 |
| init                 | 0.000011 | 0.000009 |   0.000001 |
| System lock          | 0.000010 | 0.000010 |   0.000001 |
| optimizing           | 0.000003 | 0.000001 |   0.000001 |
| optimizing           | 0.000016 | 0.000015 |   0.000000 |
| statistics           | 0.000064 | 0.000059 |   0.000006 |
| preparing            | 0.000020 | 0.000018 |   0.000002 |
| Creating tmp table   | 0.000013 | 0.000012 |   0.000001 |
| Sorting for group    | 0.000006 | 0.000005 |   0.000001 |
| Sorting result       | 0.000003 | 0.000002 |   0.000001 |
| statistics           | 0.000004 | 0.000004 |   0.000001 |
| preparing            | 0.000004 | 0.000003 |   0.000000 |
| executing            | 0.000007 | 0.000006 |   0.000001 |
| Sending data         | 0.000006 | 0.000005 |   0.000001 |
| executing            | 0.000002 | 0.000001 |   0.000001 |
| Sending data         | 0.048299 | 0.048286 |   0.000025 |
| Creating sort index  | 0.000033 | 0.000027 |   0.000006 |
| removing tmp table   | 0.000006 | 0.000005 |   0.000001 |
| Creating sort index  | 0.000010 | 0.000010 |   0.000000 |
| end                  | 0.000003 | 0.000002 |   0.000001 |
| query end            | 0.000006 | 0.000005 |   0.000001 |
| closing tables       | 0.000002 | 0.000001 |   0.000000 |
| removing tmp table   | 0.000003 | 0.000003 |   0.000001 |
| closing tables       | 0.000018 | 0.000018 |   0.000001 |
| freeing items        | 0.000038 | 0.000012 |   0.000025 |
| cleaning up          | 0.000015 | 0.000013 |   0.000001 |
+----------------------+----------+----------+------------+
36 rows in set (0.00 sec)
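Because states such as "checking permissions" and "Sending data" repeat, it helps to aggregate the SHOW PROFILE rows per state to find the hot spot. The sketch below sums a subset of the durations from Listing 4-9; the helper code is an illustration, not a MySQL API.

```python
from collections import defaultdict

# A subset of the (state, duration) rows from SHOW PROFILE FOR QUERY 6
# in Listing 4-9.
rows = [
    ("starting", 0.000034),
    ("Opening tables", 0.000200),
    ("checking permissions", 0.000047),
    ("Sending data", 0.000006),
    ("Sending data", 0.048299),
    ("Creating sort index", 0.000033),
    ("cleaning up", 0.000015),
]

# Sum the durations per state, then pick the most expensive one.
totals = defaultdict(float)
for state, duration in rows:
    totals[state] += duration

hottest = max(totals, key=totals.get)
print(hottest)  # Sending data
```

As expected from the listing, nearly all of the query's time is spent in the Sending data state, which is where the view's underlying joins and aggregation do their work.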


The last profiling technique included in the MySQL system is the ability to examine how the optimizer performs queries. While not strictly a performance-measuring device, it can be used to diagnose tricky queries that show up in the slow-query log. As a simple example, let’s see what the optimizer predicts about how the following query will be executed:

select * from customer where email like "%575 %"

This query is not very interesting, and using the LIKE clause with a % on both sides of the value is inefficient and almost certain to rule out the use of an index. If you run the command preceded by the EXPLAIN keyword, you see the results of the proposed query optimization. Listing 4-10 shows the results of using the EXPLAIN command.

Listing 4-10. Output of EXPLAIN Command

mysql> explain select * from customer where email like "%575 %" \G
*************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: customer
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 541
Extra: Using where
1 row in set (0.00 sec)

The output shows that the command is a simple select on the customer table, there are no possible keys to use, there are 541 rows in the table, and the optimizer is using the WHERE clause. In this case, it is telling us that the execution will be a simple table scan without indexes—perhaps one of the slowest possible select statements.
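Checks like this can be automated when you examine many queries. The following sketch parses the \G-style EXPLAIN output from Listing 4-10 and flags a probable full table scan (type ALL with no usable key); the parsing code is an illustration, not part of MySQL.

```python
# The field: value lines from Listing 4-10, embedded here as sample input.
EXPLAIN_OUTPUT = """\
id: 1
select_type: SIMPLE
table: customer
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 541
Extra: Using where
"""

def parse_explain(text):
    """Turn 'name: value' lines into a dict of strings."""
    fields = {}
    for line in text.splitlines():
        name, _, value = line.partition(":")
        fields[name.strip()] = value.strip()
    return fields

plan = parse_explain(EXPLAIN_OUTPUT)

# A type of ALL with no chosen key indicates a full table scan.
is_table_scan = plan["type"] == "ALL" and plan["key"] == "NULL"
print(is_table_scan)  # True
```

Run against the queries from your slow-query log, a check like this quickly surfaces the statements that scan every row.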

Summary

In this chapter, I’ve presented a number of software-testing techniques and strategies. You learned about the benefits of software testing and how to leverage test-driven development in your software projects. I also presented the testing facilities available to you for testing MySQL. I showed you the MySQL test and benchmark suites and introduced you to the profiling scripts for MySQL.

The knowledge of these testing facilities will prepare you to ensure your modifications to the MySQL source code are of the highest quality possible. With this knowledge, you are now ready to begin creating extensions and enhancements of the MySQL system that will meet the same high-quality standards that Oracle adheres to.9 Now that you have this information, you can begin to design your solution and include testing early in your design phase.

The next chapter, which begins the second part of this book, introduces you to the most important tool in a developer’s toolbox: debugging!

1 Which normally include large group projects beginning with requirements elicitation.

2 Especially when I announce the next project, in which teams pass their projects to other teams for software testing. It is amazing how many defects they find in other students’ code while insisting their own coding is superior.

3 Yes, this is a bit of a dichotomy, considering agile practices are designed to reduce unnecessary work.

4 D. Rosenberg, M. Stephens, M. Collins-Cope. Agile Development with ICONIX Process (Berkeley, CA: Apress, 2005).

5 Some statisticians consider the statistical engine in Microsoft Excel to be inaccurate. For the values you are likely to see, however, the inaccuracies are not a problem.

6 For more information about software testing, see http://en.wikipedia.org/wiki/Software_testing.

7 Requires ActivePerl, the official Perl distribution for Windows. See http://www.activestate.org for details and to download the latest version.

8 M. Kruckenberg and J. Pipes. Pro MySQL (Berkeley, CA: Apress, 2005).

9 Why else would they have created and made available to you testing, benchmarking, and profiling tools?