Professional C++ (2014)

Part V: C++ Software Engineering

Chapter 26: Conquering Debugging

WHAT’S IN THIS CHAPTER?

· The Fundamental Law of Debugging and bug taxonomies

· Tips for avoiding bugs

· How to plan for bugs

· The different kinds of memory errors

· How to use a debugger to pinpoint code causing a bug

WROX.COM DOWNLOADS FOR THIS CHAPTER

Please note that all the code examples for this chapter are available as a part of this chapter’s code download on the book’s website at www.wrox.com/go/proc++3e on the Download Code tab.

Your code will contain bugs. Every professional programmer would like to write bug-free code, but the reality is that few software engineers succeed in this endeavor. As computer users know, bugs are endemic in computer software. The software that you write is probably no exception. Therefore, unless you plan to bribe your co-workers into fixing all your bugs, you cannot be a professional C++ programmer without knowing how to debug C++ code. One factor that often distinguishes experienced programmers from novices is their debugging skills.

Despite the obvious importance of debugging, it is rarely given enough attention in courses and books. Debugging seems to be the type of skill that everyone wants you to know, but no one knows how to teach. This chapter attempts to provide concrete debugging guidelines and techniques.

This chapter includes an introduction to the Fundamental Law of Debugging and bug taxonomies, followed by tips for avoiding bugs. Techniques for planning for bugs include error logging, debug traces, assertions, and crash dumps. Specific tips are given for debugging the problems that arise, including techniques for reproducing bugs, debugging reproducible bugs, debugging nonreproducible bugs, debugging memory errors, and debugging multithreaded programs. The chapter concludes with a step-by-step debugging example.

THE FUNDAMENTAL LAW OF DEBUGGING

The first rule of debugging is to be honest with yourself and admit that your program will contain bugs. This realistic assessment enables you to put your best effort into preventing bugs from crawling into your program in the first place while you simultaneously include the necessary features to make debugging as easy as possible.

WARNING The Fundamental Law of Debugging: Avoid bugs when you’re coding, but plan for bugs in your code.

BUG TAXONOMIES

A bug in a computer program is incorrect run-time behavior. This undesirable behavior includes both catastrophic and noncatastrophic bugs. Examples of catastrophic bugs are program death, data corruption, operating system failures, or some other horrific outcome. A catastrophic bug can also manifest itself external to the software or computer system running the software; for example, medical software might contain a catastrophic bug causing a massive radiation overdose to a patient. Noncatastrophic bugs are bugs that cause the program to behave incorrectly in more subtle ways; for example, a web browser might return the wrong web page, or a spreadsheet application might calculate the standard deviation of a column incorrectly.

There are also so-called cosmetic bugs, where something is visually incorrect but otherwise works correctly. For example, a button in a user interface is left enabled when it shouldn’t be, but clicking it does nothing. All computations are perfectly correct and the program does not crash, but it doesn’t look as “nice” as it should.

The underlying cause, or root cause, of the bug is the mistake in the program that causes this incorrect behavior. The process of debugging a program includes both determining the root cause of the bug and fixing the code so that the bug will not occur again.

AVOIDING BUGS

It’s impossible to write completely bug-free code, so debugging skills are important. However, a few tips can help you to minimize the number of bugs:

· Read this book from cover to cover: Learn the C++ language intimately, especially pointers and memory management. Then, recommend this book to your friends and coworkers so they avoid bugs too.

· Design before you code: Designing while you code tends to lead to convoluted designs that are harder to understand and are more error-prone. It also makes you more likely to omit possible edge cases and error conditions.

· Do code reviews: If you wrote a more complicated or risky piece of code, ask a co-worker to review your code. Sometimes it takes a fresh perspective to notice problems.

· Test, test, and test again: Thoroughly test your code, and have others test your code! They are more likely to find problems you haven’t thought of.

· Write automated unit tests: Unit tests are designed to test isolated functionality. You should write unit tests for all implemented features. Run these unit tests automatically as part of your continuous integration setup.

· Expect error conditions, and handle them appropriately: In particular, plan for and handle out-of-memory conditions. They will occur. See Chapter 13.

· Use smart pointers to avoid memory leaks: Smart pointers automatically free resources when they are not needed anymore.

· Don’t ignore compiler warnings: Configure your compiler to compile with a high warning level. Do not blindly ignore warnings. Ideally, you should enable an option in your compiler to treat warnings as errors. This forces you to address each warning.

· Use static code analysis: A static code analyzer helps you to pinpoint problems in your code by analyzing your source code.

· Use good coding style: Strive for readability and clarity, add code comments (not only interface comments), use the override keyword, and so on. This makes it easier for other people to understand your code.
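Two of the tips above, smart pointers and the override keyword, can be illustrated with a minimal sketch. The Shape and Circle classes and the makeShape() function are hypothetical names invented for this illustration; they do not appear elsewhere in the chapter.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical classes, used only to illustrate the two tips.
class Shape
{
public:
    virtual ~Shape() = default;
    virtual std::string name() const { return "shape"; }
};

class Circle : public Shape
{
public:
    // 'override' makes the compiler reject this declaration if it does not
    // match a virtual method in Shape (for example, if 'const' is forgotten).
    std::string name() const override { return "circle"; }
};

// The returned unique_ptr owns the Circle and deletes it automatically when
// the pointer goes out of scope, so no explicit delete is needed.
std::unique_ptr<Shape> makeShape()
{
    return std::make_unique<Circle>();
}
```

A caller can simply write auto shape = makeShape(); and use shape->name(). The object is freed when shape goes out of scope, even if an exception is thrown in between.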

PLANNING FOR BUGS

Your programs should contain features that enable easier debugging when the inevitable bugs arise. This section describes these features and presents sample implementations, where appropriate, that you can incorporate into your own programs.

Error Logging

Imagine this scenario: You have just released a new version of your flagship product, and one of the first users reports that the program “stopped working.” You attempt to pry more information from the user, and eventually discover that the program died in the middle of an operation. The user can’t quite remember what he was doing, or if there were any error messages. How will you debug this problem?

Now imagine the same scenario, but in addition to the limited information from the user, you are also able to examine the error log on the user’s computer. In the log you see a message from your program that says “Error: unable to allocate memory.” Looking at the code near the spot where that error message was generated, you find a line in which you dereferenced a pointer without checking for nullptr. You’ve found the root cause of your bug!

Error logging is the process of writing error messages to persistent storage so that they will be available following an application, or even machine, death. Despite the example scenario, you might still have doubts about this strategy. Won’t it be obvious by your program’s behavior if it encounters errors? Won’t the user notice if something goes wrong? As the preceding example shows, user reports are not always accurate or complete. In addition, many programs, such as the operating system kernel and long-running daemons like inetd or syslogd on Unix, are not interactive and run unattended on a machine. The only way these programs can communicate with users is through error logging. In many cases, a program might also want to automatically recover from certain errors, and hide the error from the user. Still, having logs of those errors available can be invaluable to improve the overall stability of the program.

Thus, your program should log errors as it encounters them. That way, if a user reports a bug, you will be able to examine the log files on the machine to see if your program reported any errors prior to encountering the bug. Unfortunately, error logging is platform dependent: C++ does not contain a standard logging mechanism. Examples of platform-specific logging mechanisms include the syslog facility in Unix and the event reporting API in Windows. You should consult the documentation for your development platform. There are also some open-source implementations of cross-platform logging classes. One example is Boost.Log, available at http://www.boost.org/.

Now that you’re convinced that error logging is a great feature to add to your programs, you might be tempted to log error messages every few lines in your code, so that, in the event of any bug, you’ll be able to trace the code path that was executing. These types of error messages are appropriately called “traces.”

However, you should not write these traces to error logs for two reasons. First, writing to persistent storage is slow. Even on systems that write the logs asynchronously, logging that much information will slow down your program. Second, and most important, most of the information that you would put in your traces is not appropriate for the end user to see. It will just confuse the user, leading to unwarranted service calls. That said, tracing is an important debugging technique under the correct circumstances, as described in the next section.

Here are some specific guidelines for the types of errors your programs should log:

· Unrecoverable errors, such as an inability to allocate memory or a system call failing unexpectedly.

· Errors for which an administrator can take action, such as low memory, an incorrectly formatted data file, an inability to write to disk, or a network connection being down.

· Unexpected errors such as a code path that you never expected to take or variables with unexpected values. Note that your code should “expect” users to enter invalid data and should handle it appropriately. An unexpected error would represent a bug in your program.

· Potential security breaches such as a network connection attempted from an unauthorized address, or too many network connections attempted (denial of service).

It is also useful to log warnings, or recoverable errors, which allow you to investigate if you can possibly avoid them.

Additionally, most APIs allow you to specify a log level or error level, typically error, warning, and info. You can log non-error conditions under a log level that is less severe than “error.” For example, you might want to log significant state changes in your application, or startup and shutdown of the program. You also might consider giving your users a way to adjust the log level of your program at run time so that they can customize the amount of logging that occurs.
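As a rough illustration of log levels with a threshold that can be adjusted at run time, consider the following sketch. The LogLevel enumeration and the LeveledLogger class are assumptions made for this example; they are not part of any platform logging API, and a real program would forward the messages to the platform’s logging facility instead of an arbitrary stream.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical severity levels, ordered from least to most severe.
enum class LogLevel { Info, Warning, Error };

// Minimal sketch of a logger that drops messages below a threshold.
class LeveledLogger
{
public:
    explicit LeveledLogger(std::ostream& out, LogLevel threshold = LogLevel::Error)
        : mOut(out), mThreshold(threshold) {}

    // Lets the user change the amount of logging while the program runs.
    void setThreshold(LogLevel threshold) { mThreshold = threshold; }

    // Writes the message if its level is at or above the threshold.
    // Returns true if the message was written.
    bool log(LogLevel level, const std::string& message)
    {
        if (level < mThreshold) {
            return false;
        }
        mOut << label(level) << ": " << message << std::endl;
        return true;
    }

private:
    static const char* label(LogLevel level)
    {
        switch (level) {
        case LogLevel::Info:    return "INFO";
        case LogLevel::Warning: return "WARNING";
        case LogLevel::Error:   return "ERROR";
        }
        return "UNKNOWN";
    }

    std::ostream& mOut;
    LogLevel mThreshold;
};
```

With the threshold set to LogLevel::Warning, a call such as logger.log(LogLevel::Info, "startup complete") is silently dropped, while errors and warnings are written.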

Debug Traces

When debugging complicated problems, public error messages generally do not contain enough information. You often need a complete trace of the code path taken, or values of variables before the bug showed up. In addition to basic messages, it’s sometimes helpful to include the following information in debug traces:

· The thread ID, if it’s a multithreaded program

· The name of the function that generates the trace

· The name of the source file in which the code that generates the trace lives

You can add this tracing to your program through a special debug mode, or via a ring buffer. Both of these methods are explained in detail in the next sections. Note that in multithreaded programs you have to make your trace logging thread-safe. See Chapter 23 for details on multithreaded programming.
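The three pieces of information listed above can be gathered with standard facilities: std::this_thread::get_id(), __func__, and __FILE__. The following is only a sketch; the formatTrace() function, the TRACE_LINE macro, and the layout of the trace line are assumptions made for this illustration.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <thread>

// Builds one trace line of the form
//   [thread <id>] <file>:<function>(): <message>
// The layout is an arbitrary choice for this sketch.
std::string formatTrace(const char* file, const char* function,
                        const std::string& message)
{
    std::ostringstream os;
    os << "[thread " << std::this_thread::get_id() << "] "
       << file << ":" << function << "(): " << message;
    return os.str();
}

// Hypothetical convenience macro that fills in the file and function names
// of the call site.
#define TRACE_LINE(message) formatTrace(__FILE__, __func__, (message))
```

The textual form of the thread ID is implementation defined, so the exact output differs between platforms.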

NOTE Be careful not to log too much detail. You don’t want to leak intellectual property through your log files.

Debug Mode

The first technique to add debug traces is to provide a debug mode for your program. In debug mode, the program writes trace output to standard error or to a file, and perhaps does extra checking during execution. There are several ways to add a debug mode to your program.

Start-Time Debug Mode

Start-time debug mode allows your application to run with or without debug mode depending on a command-line argument. This strategy includes the debug code in the “release” binary, and allows debug mode to be enabled at a customer site. However, it does require users to restart the program in order to run it in debug mode, which may prevent you from obtaining useful information about certain bugs.

The following example is a simple program implementing a start-time debug mode. This program doesn’t do anything useful; it is only for demonstrating the technique.

All logging functionality is wrapped in a Logger class. This class has two static data members: the name of the log file, and a Boolean indicating whether logging is enabled. The class has a static public log() variadic template method. Variadic templates are discussed in Chapter 21. Note that the log file is opened, flushed, and closed on each call to log(). This might lower performance a bit; however, it does guarantee correct logging, which is more important.

#include <fstream>
#include <iostream>
using namespace std;

class Logger
{
public:
    static void enableLogging(bool enable) { msLoggingEnabled = enable; }
    static bool isLoggingEnabled() { return msLoggingEnabled; }

    // Appends all arguments, followed by a newline, to the debug file.
    // Does nothing when logging is disabled.
    template<typename... Args>
    static void log(const Args&... args)
    {
        if (!msLoggingEnabled)
            return;
        ofstream ofs(msDebugFileName, ios_base::app);
        if (ofs.fail()) {
            cerr << "Unable to open debug file!" << endl;
            return;
        }
        logHelper(ofs, args...);
        ofs << endl;
    }

private:
    // Base case: streams the last argument.
    template<typename T1>
    static void logHelper(ofstream& ofs, const T1& t1)
    {
        ofs << t1;
    }

    // Recursive case: streams the first argument, then the rest.
    template<typename T1, typename... Tn>
    static void logHelper(ofstream& ofs, const T1& t1, const Tn&... args)
    {
        ofs << t1;
        logHelper(ofs, args...);
    }

    static const char* msDebugFileName;
    static bool msLoggingEnabled;
};

const char* Logger::msDebugFileName = "debugfile.out";
bool Logger::msLoggingEnabled = false;

The following helper macro is defined to make it easy to log something. It uses __func__, a predefined variable defined by the C++ standard that contains the name of the current function. If your compiler doesn’t support the standard __func__ yet, check its documentation; it may support __FUNCTION__ or something similar instead.

#define log(...) Logger::log(__func__, "(): ", __VA_ARGS__)

This macro replaces every call to log() in your code with a call to Logger::log(). The macro automatically includes the function name as the first argument to Logger::log(). For example:

log("given argument: ", *obj);

The log() macro replaces this with:

Logger::log(__func__, "(): ", "given argument: ", *obj);

Start-time debug mode needs to parse the command-line arguments to find out if it should enable debug mode or not. Unfortunately, there is no standard library functionality in C++ for parsing command-line arguments. This program uses a simple isDebugSet() function to check for the debug flag among all the command-line arguments, but a function to parse all command-line arguments would need to be more sophisticated.

#include <cstring>  // for strcmp

// Returns true if any command-line argument is exactly "-d".
bool isDebugSet(int argc, char* argv[])
{
    for (int i = 0; i < argc; i++) {
        if (strcmp(argv[i], "-d") == 0) {
            return true;
        }
    }
    return false;
}

Some arbitrary test code is used to exercise the debug mode in this example. Two classes are defined, ComplicatedClass and UserCommand. Both classes define an operator<< to write instances of them to a stream, because the Logger class uses this operator to dump objects to the log.

class ComplicatedClass
{
public:
    ComplicatedClass() {}
};

ostream& operator<<(ostream& ostr, const ComplicatedClass& src)
{
    ostr << "ComplicatedClass";
    return ostr;
}

class UserCommand
{
public:
    UserCommand() {}
};

ostream& operator<<(ostream& ostr, const UserCommand& src)
{
    ostr << "UserCommand";
    return ostr;
}

Here is some test code with a number of log calls:

UserCommand getNextCommand(ComplicatedClass* obj)
{
    UserCommand cmd;
    return cmd;
}

void processUserCommand(UserCommand& cmd)
{
    // details omitted for brevity
}

void trickyFunction(ComplicatedClass* obj)
{
    log("given argument: ", *obj);
    for (size_t i = 0; i < 100; ++i) {
        UserCommand cmd = getNextCommand(obj);
        log("retrieved cmd ", i, ": ", cmd);
        try {
            processUserCommand(cmd);
        } catch (const exception& e) {
            log("received exception from processUserCommand(): ", e.what());
        }
    }
}

int main(int argc, char* argv[])
{
    Logger::enableLogging(isDebugSet(argc, argv));
    if (Logger::isLoggingEnabled()) {
        // Print the command-line arguments to the trace
        for (int i = 0; i < argc; i++) {
            log(argv[i]);
        }
    }
    ComplicatedClass obj;
    trickyFunction(&obj);
    // Rest of the function not shown
    return 0;
}

There are two ways to run this application:

> STDebug

> STDebug -d

Debug mode is activated only when the -d argument is specified on the command line.

WARNING Macros in C++ should be avoided as much as possible because they can be hard to debug. However, for logging purposes, using a simple macro is acceptable and it makes using the logging code much easier.

Compile-Time Debug Mode

Instead of enabling or disabling debug mode through a command-line argument, you could also use a preprocessor symbol such as DEBUG_MODE and #ifdefs to selectively compile the debug code into your program. In order to generate a debug version of this program, you would have to compile it with the symbol DEBUG_MODE defined. Your compiler should allow you to define symbols during compilation; consult your compiler’s documentation for details. For example, GCC allows you to specify -Dsymbol on the command line, and Microsoft VC++ allows you to specify the symbols through the Visual Studio IDE or, if you use the VC++ command line, with /D symbol.

The advantage of this method is that your debug code is not compiled into the “release” binary, and so does not increase its size. The disadvantage is that there is no way to enable debugging at a customer site for testing or following the discovery of a bug.

An example implementation is given in CTDebug.cpp in the downloadable source code archive.
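Independently of the downloadable CTDebug.cpp, the idea can be sketched in a few lines. DEBUG_MODE is the symbol named in the text; the TRACE macro and the addChecked() function are hypothetical names chosen for this sketch and are not taken from the book’s code.

```cpp
#include <cassert>
#include <iostream>

// DEBUG_MODE is the symbol named in the text; TRACE is a hypothetical macro
// name chosen for this sketch.
#ifdef DEBUG_MODE
    // Debug build: write the trace to standard error.
    #define TRACE(msg) (std::cerr << __func__ << "(): " << msg << std::endl)
#else
    // Release build: the trace statement compiles to nothing.
    #define TRACE(msg) ((void)0)
#endif

// Example function that traces its arguments only in debug builds.
int addChecked(int a, int b)
{
    TRACE("adding " << a << " and " << b);
    return a + b;
}
```

Compiling with, for example, g++ -DDEBUG_MODE includes the trace code; compiling without the symbol removes it from the binary entirely, which is both the advantage and the limitation described above.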

Run-Time Debug Mode

The most flexible way to provide a debug mode is to allow it to be enabled or disabled at run time. One way to provide this feature is to supply an asynchronous interface that controls debug mode on the fly. In a GUI program, this interface could take the form of a menu command. In a CLI (Command Line Interface) program, this interface could be an asynchronous command that makes an interprocess call into the program (using sockets, signals, or remote procedure calls, for example). C++ provides no standard way to perform interprocess communication or GUIs, so an example of this technique is not shown.
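Although the interface that flips the switch is platform specific, the switch itself can be written in standard C++. The following sketch assumes a hypothetical DebugSwitch class and traceIfEnabled() function; a menu handler or an interprocess-communication thread would call DebugSwitch::set(), and that part is deliberately left out.

```cpp
#include <atomic>
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical run-time switch. std::atomic<bool> lets a GUI handler or an
// IPC thread flip the flag while other threads read it.
class DebugSwitch
{
public:
    static void set(bool enabled) { flag().store(enabled); }
    static bool isEnabled() { return flag().load(); }

private:
    static std::atomic<bool>& flag()
    {
        static std::atomic<bool> sEnabled(false);
        return sEnabled;
    }
};

// Writes the message only while debug mode is switched on.
// Returns true if the message was written.
bool traceIfEnabled(std::ostream& out, const std::string& message)
{
    if (!DebugSwitch::isEnabled()) {
        return false;
    }
    out << message << std::endl;
    return true;
}
```

The atomic flag protects only the switch itself; writing to a shared stream from several threads still needs its own synchronization.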

Ring Buffers

Debug mode is useful for debugging reproducible problems and for running tests. However, bugs often appear when the program is running in non-debug mode, and by the time you or the customer enables debug mode, it is too late to gain any information about the bug. One solution to this problem is to enable tracing in your program at all times. You usually need only the most recent traces to debug a program, so you should store only the most recent traces at any point in a program’s execution. One way to provide this limitation is through careful use of log file rotations.

However, for performance reasons, it is better that your program doesn’t log these traces continuously to disk. Instead, it should store them in memory and provide a mechanism to dump all the trace messages to standard error or to a log file if the need arises.

A common technique is to use a ring buffer to store a fixed number of messages, or messages in a fixed amount of memory. When the buffer fills up, it starts writing messages at the beginning of the buffer again, overwriting the older messages. This cycle can repeat indefinitely. The following sections provide an implementation of a ring buffer and show you how you can use it in your programs.

Ring Buffer Interface

The following RingBuffer class provides a simple debug ring buffer. The client specifies the number of entries in the constructor and adds messages with the addEntry() method. Once the number of entries exceeds the number allowed, new entries overwrite the oldest entries in the buffer. The buffer also provides the option to output entries to a stream as they are added to the buffer. The client can specify an output stream in the constructor, and can reset it with the setOutput() method. Finally, the operator<< streams the entire buffer to an output stream. This implementation uses a variadic template method. Variadic templates are discussed in Chapter 21.

#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

class RingBuffer
{
public:
    // Constructs a ring buffer with space for numEntries.
    // Entries are written to *ostr as they are queued (optional).
    RingBuffer(size_t numEntries = kDefaultNumEntries,
        std::ostream* ostr = nullptr);
    virtual ~RingBuffer();

    // Adds the string to the ring buffer, possibly overwriting the
    // oldest string in the buffer (if the buffer is full).
    template<typename... Args>
    void addEntry(const Args&... args)
    {
        std::ostringstream os;
        addEntryHelper(os, args...);
        addStringEntry(os.str());
    }

    // Streams the buffer entries, separated by newlines, to ostr.
    friend std::ostream& operator<<(std::ostream& ostr, RingBuffer& rb);

    // Sets the output stream to which entries are streamed as they are added.
    // Returns the old output stream.
    std::ostream* setOutput(std::ostream* newOstr);

private:
    std::vector<std::string> mEntries;
    std::vector<std::string>::iterator mNext;
    std::ostream* mOstr;
    bool mWrapped;
    static const size_t kDefaultNumEntries = 500;

    template<typename T1>
    void addEntryHelper(std::ostringstream& os, const T1& t1)
    {
        os << t1;
    }

    template<typename T1, typename... Tn>
    void addEntryHelper(std::ostringstream& os, const T1& t1,
        const Tn&... args)
    {
        os << t1;
        addEntryHelper(os, args...);
    }

    void addStringEntry(std::string&& entry);
};

Ring Buffer Implementation

This implementation of the ring buffer stores a fixed number of string objects. This approach certainly is not the most efficient solution. Another possibility would be to provide a fixed number of bytes of memory for the buffer. However, this implementation should be sufficient unless you’re writing a high-performance application.

For multithreaded programs it’s useful to add the ID of the thread and a timestamp to each trace entry. Of course, the ring buffer has to be made thread-safe before using it in a multithreaded application. See Chapter 23 for multithreaded programming.

#include <algorithm>  // for copy
#include <iterator>   // for ostream_iterator
using namespace std;

// Initialize the vector to hold exactly numEntries. The vector size
// does not need to change during the lifetime of the object.
// Initialize the other members.
RingBuffer::RingBuffer(size_t numEntries, ostream* ostr)
    : mEntries(numEntries), mNext(begin(mEntries)), mOstr(ostr), mWrapped(false)
{
}

RingBuffer::~RingBuffer()
{
}

// The addEntry algorithm is pretty simple: add the entry to the next
// free spot, then reset mNext to indicate the free spot after
// that. If mNext reaches the end of the vector, it starts over at the
// beginning.
//
// The buffer needs to know if the buffer has wrapped or not so
// that it knows whether to print the entries past mNext in operator<<
void RingBuffer::addStringEntry(string&& entry)
{
    // If there is a valid ostream, write this entry to it.
    if (mOstr) {
        *mOstr << entry << endl;
    }

    // Move the entry to the next free spot and increment
    // mNext to point to the free spot after that.
    *mNext = std::move(entry);
    ++mNext;

    // Check if we've reached the end of the buffer. If so, we need to wrap.
    if (mNext == end(mEntries)) {
        mNext = begin(mEntries);
        mWrapped = true;
    }
}

// Set the output stream.
ostream* RingBuffer::setOutput(ostream* newOstr)
{
    ostream* ret = mOstr;
    mOstr = newOstr;
    return ret;
}

// operator<< uses an ostream_iterator to "copy" entries directly
// from the vector to the output stream.
//
// operator<< must print the entries in order. If the buffer has wrapped,
// the earliest entry is one past the most recent entry, which is the entry
// indicated by mNext. So first print from entry mNext to the end.
//
// Then (even if the buffer hasn't wrapped) print from the beginning to mNext-1.
ostream& operator<<(ostream& ostr, RingBuffer& rb)
{
    if (rb.mWrapped) {
        // If the buffer has wrapped, print the elements from
        // the earliest entry to the end.
        copy(rb.mNext, end(rb.mEntries), ostream_iterator<string>(ostr, "\n"));
    }

    // Now, print up to the most recent entry.
    // Go up to mNext because the range is not inclusive on the right side.
    copy(begin(rb.mEntries), rb.mNext, ostream_iterator<string>(ostr, "\n"));
    return ostr;
}
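For the multithreaded case mentioned at the start of this section, one possible approach is to guard the buffer with a mutex and to prefix each entry with the ID of the calling thread. The following is a simplified, self-contained sketch, not a drop-in change to the RingBuffer class above: the LockedTraceBuffer class and its snapshot() method are hypothetical, it uses an index instead of an iterator, and it omits the timestamp and the optional output stream for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <sstream>
#include <string>
#include <thread>
#include <vector>

// Hypothetical, simplified ring buffer guarded by a mutex.
class LockedTraceBuffer
{
public:
    // A capacity of zero is bumped to one so that addEntry() is always valid.
    explicit LockedTraceBuffer(std::size_t numEntries)
        : mEntries(numEntries > 0 ? numEntries : 1), mNext(0), mWrapped(false) {}

    // Prefixes the message with the calling thread's ID and stores it,
    // overwriting the oldest entry when the buffer is full.
    void addEntry(const std::string& message)
    {
        // Format outside the lock to keep the critical section short.
        std::ostringstream os;
        os << "[" << std::this_thread::get_id() << "] " << message;

        std::lock_guard<std::mutex> lock(mMutex);
        mEntries[mNext] = os.str();
        if (++mNext == mEntries.size()) {
            mNext = 0;
            mWrapped = true;
        }
    }

    // Returns a copy of the stored entries, oldest first.
    std::vector<std::string> snapshot() const
    {
        std::lock_guard<std::mutex> lock(mMutex);
        const auto next = static_cast<std::ptrdiff_t>(mNext);
        std::vector<std::string> result;
        if (mWrapped) {
            result.insert(result.end(), mEntries.begin() + next, mEntries.end());
        }
        result.insert(result.end(), mEntries.begin(), mEntries.begin() + next);
        return result;
    }

private:
    mutable std::mutex mMutex;
    std::vector<std::string> mEntries;
    std::size_t mNext;
    bool mWrapped;
};
```

Returning a copy from snapshot() means the lock is held only while copying, so printing the traces cannot block threads that are still adding entries.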

Using the Ring Buffer

In order to use the ring buffer, you can create an instance of it and start adding messages to it. When you want to print the buffer, just use operator<< to print it to the appropriate ostream. Here is the earlier start-time debug mode program modified to use a ring buffer instead. The definitions of the ComplicatedClass and UserCommand classes, and the functions getNextCommand() and processUserCommand(), are not shown. They are identical to the earlier versions.

RingBuffer debugBuf;

void trickyFunction(ComplicatedClass* obj)
{
    // Trace log the values with which this function starts.
    debugBuf.addEntry(__func__, "(): given argument: ", *obj);
    for (size_t i = 0; i < 100; ++i) {
        UserCommand cmd = getNextCommand(obj);
        debugBuf.addEntry(__func__, "(): retrieved cmd ", i, ": ", cmd);
        try {
            processUserCommand(cmd);
        } catch (const exception& e) {
            debugBuf.addEntry(__func__,
                "(): received exception from processUserCommand(): ", e.what());
        }
    }
}

int main(int argc, char* argv[])
{
    // Print the command-line arguments
    for (int i = 0; i < argc; i++) {
        debugBuf.addEntry(argv[i]);
    }
    ComplicatedClass obj;
    trickyFunction(&obj);
    // Print the current contents of the debug buffer to cout
    cout << debugBuf;
    return 0;
}

Displaying the Ring Buffer Contents

Storing trace debug messages in memory is a great start, but in order for them to be useful, you need a way to access these traces for debugging.

Your program should provide a “hook” to tell it to export the messages. This hook could be similar to the interface you would use to enable debugging at run time. Additionally, if your program encounters a fatal error that causes it to exit, it could export the ring buffer automatically to a log file before exiting.
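One way to sketch such a hook in standard C++ is a terminate handler installed with std::set_terminate(), which runs when an exception goes uncaught. It does not run for every kind of crash; an access violation, for example, requires platform-specific handling. In the following sketch, traceStore() is a hypothetical stand-in for the ring buffer, and the file name traces.dump is an arbitrary assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <exception>
#include <fstream>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Stand-in for the in-memory trace store. A real program would use its
// ring buffer here instead.
std::vector<std::string>& traceStore()
{
    static std::vector<std::string> sTraces;
    return sTraces;
}

// The export "hook": writes every stored trace to the given stream.
void exportTraces(std::ostream& out)
{
    for (const std::string& entry : traceStore()) {
        out << entry << "\n";
    }
    out.flush();
}

// Terminate handler: dumps the traces to a file (hypothetical name),
// then aborts the program.
void dumpTracesAndAbort()
{
    std::ofstream file("traces.dump", std::ios_base::app);
    if (file) {
        exportTraces(file);
    }
    std::abort();
}

// Call once at startup so that uncaught exceptions trigger the dump.
void installTraceDumpOnTerminate()
{
    std::set_terminate(dumpTracesAndAbort);
}
```

The same exportTraces() function can be called from a menu command or another run-time interface, so the on-demand hook and the fatal-error path share one implementation.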

Another way to retrieve these messages is to obtain a memory dump of the program. Each platform handles memory dumps differently, so you should consult a book or expert on your platform.

Assertions

The <cassert> header defines an assert macro. It takes a Boolean expression and, if the expression evaluates to false, prints an error message and terminates the program. If the expression evaluates to true, it does nothing.

WARNING Normally, you should avoid any library functions or macros that can terminate your program. The assert macro is an exception. If an assertion triggers, it means that some assumption is wrong or that something is catastrophically, unrecoverably wrong, and the only sane thing to do is to terminate the application at that very moment, instead of continuing.

Assertions allow you to “force” your program to exhibit a bug at the exact point where that bug originates. If you didn’t assert at that point, your program might proceed with those incorrect values, and the bug might not show up until much later. Thus, assertions allow you to detect bugs earlier than you otherwise would.

NOTE The behavior of the standard assert macro depends on the NDEBUG preprocessor symbol: if the symbol is not defined, the assertion takes place; otherwise, it is ignored. Compilers often define this symbol when compiling “release” builds. If you want to leave assertions in release builds, you may have to change your compiler settings, or write your own version of assert that isn’t affected by NDEBUG.
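A hand-written assertion that ignores NDEBUG could look roughly like the following sketch. The names ALWAYS_ASSERT, alwaysAssertFailed(), and clampToPercent() are hypothetical, and the format of the error message is an arbitrary choice.

```cpp
#include <cassert>
#include <cstdlib>
#include <iostream>

// Reports the failed condition and terminates, regardless of NDEBUG.
// ALWAYS_ASSERT and alwaysAssertFailed are hypothetical names for this sketch.
[[noreturn]] inline void alwaysAssertFailed(const char* expr,
                                            const char* file, int line)
{
    std::cerr << file << "(" << line << "): assertion failed: "
              << expr << std::endl;
    std::abort();
}

// Evaluates expr exactly once; calls the failure handler if it is false.
#define ALWAYS_ASSERT(expr) \
    ((expr) ? (void)0 : alwaysAssertFailed(#expr, __FILE__, __LINE__))

// Example use: checks an internal invariant that stays active in release builds.
int clampToPercent(int value)
{
    int result = value < 0 ? 0 : (value > 100 ? 100 : value);
    ALWAYS_ASSERT(result >= 0 && result <= 100);
    return result;
}
```

Because the macro never compiles to nothing, the condition is always evaluated, so its cost is paid in release builds as well.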

You could use assertions in your code whenever you are “assuming” something about the state of your variables. For example, if you call a library function that is supposed to return a pointer and claims never to return nullptr, throw in an assert after the function call to make sure that the pointer isn’t nullptr.

Note that you should assume as little as possible. For example, if you are writing a library function, don’t assert that the parameters are valid. Instead, check the parameters and return an error code or throw an exception if they are invalid.

As a rule, assertions should only be used for cases that are truly problematic, and should therefore never be ignored when they occur during development. If you hit an assertion during development, fix the underlying problem; don’t just disable the assertion.

WARNING Be careful not to put any code that must be executed for correct program functioning inside assertions. For example, a line like this is probably asking for trouble: assert(myFunctionCall() != nullptr). If a release build of your code strips assertions, then the call to myFunctionCall() is stripped as well.
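The safer pattern is to perform the call unconditionally and to assert only on the stored result. In the following sketch, lookupName() is a hypothetical function with a visible side effect (it counts its calls), standing in for any function that must run in every build.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Counts the calls so that the side effect is observable.
int gCallCount = 0;

// Hypothetical function whose call must happen in every build.
const std::string* lookupName()
{
    static const std::string sName = "default";
    ++gCallCount;
    return &sName;
}

std::size_t nameLength()
{
    // Risky: assert(lookupName() != nullptr);
    // With NDEBUG defined, the whole expression, including the call, disappears.

    // Safer: perform the call unconditionally, then assert on the result.
    const std::string* name = lookupName();
    assert(name != nullptr);
    return name->size();
}
```

With this structure, defining NDEBUG removes only the check, not the call to lookupName().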

Static Assertions

The assertions discussed in the previous section are evaluated at run time. static_assert allows assertions to be evaluated at compile time. A static_assert requires two parameters: an expression to evaluate and a string. When the expression evaluates to false, the compiler issues an error that contains the given string. A simple example is to check INT_MAX, which is defined in <climits>:

static_assert(INT_MAX >= 0x7FFFFFFF,
    "Code requires INT_MAX to be at least 0x7FFFFFFF.");

If you compile this with a compiler where INT_MAX is less than 0x7FFFFFFF, the compiler issues an error that could look as follows:

test.cpp(3): error C2338: Code requires INT_MAX to be at least 0x7FFFFFFF.

Another example where static_asserts are pretty powerful is in combination with type traits. Type traits are discussed in Chapter 21. For example, if you write a function template or class template, you could use static_asserts together with type traits to issue compiler errors when template types don’t satisfy certain conditions. The following example requires that the template type for process() has Base1 as its base class:

#include <type_traits>
using namespace std;

class Base1 {};
class Base1Child : public Base1 {};

class Base2 {};
class Base2Child : public Base2 {};

template<typename T>
void process(const T& t)
{
    static_assert(is_base_of<Base1, T>::value, "Base1 should be a base for T.");
}

int main()
{
    process(Base1());
    process(Base1Child());
    //process(Base2());      // Error
    //process(Base2Child()); // Error
    return 0;
}

If you try to call process() with an instance of Base2 or Base2Child, the compiler issues an error that could look as follows:

test.cpp(13): error C2338: Base1 should be a base for T.
    test.cpp(21) : see reference to function template instantiation 'void process<Base2>(const T &)' being compiled
    with
    [
        T=Base2
    ]

Crash Dumps

Make sure your program creates crash dumps, also called memory dumps, core dumps, and so on. How you create such dumps is platform dependent, so you should consult the documentation of your platform.

Also make sure you set up a symbol server and a source code server. The symbol server is used to store debugging symbols of released binaries of your software. These symbols are used later on to interpret crash dumps from customers. The source code server, discussed in Chapter 24, stores all revisions of your source code. When debugging crash dumps, this source code server is used to download the correct source code for the revision of your software that created the crash dump.

The exact procedure of analyzing crash dumps depends on your platform and compiler, so consult their documentation.

From personal experience: A crash dump is often worth more than a thousand bug reports.

DEBUGGING TECHNIQUES

Debugging a program can be incredibly frustrating. However, with a systematic approach it becomes significantly easier. Your first step in trying to debug a program should always be to reproduce the bug. Depending on whether or not you can reproduce the bug, your subsequent approach will differ. The next four sections explain how to reproduce bugs, how to debug reproducible bugs, how to debug nonreproducible bugs, and how to debug regressions. Additional sections explain details about debugging memory errors and debugging multithreaded programs.

Reproducing Bugs

If you can reproduce the bug consistently, it will be much easier to determine the root cause. Finding the root cause of bugs that are not reproducible is difficult, if not impossible.

As a first step to reproduce the bug, run the program with exactly the same inputs as the run when the bug first appeared. Be sure to include all inputs, from the program’s startup to the time of the bug’s appearance. A common mistake is to attempt to reproduce the bug by performing only the triggering action. This technique may not reproduce the bug because the bug might be caused by an entire sequence of actions.

For example, if your web browser program dies when you request a certain web page, it may be due to memory corruption triggered by that particular request’s network address. On the other hand, it may be because your program records all requests in a queue, with space for one million entries, and this entry was number one million and one. Starting the program over and sending one request certainly wouldn’t trigger the bug in that case.

Sometimes it is impossible to emulate the entire sequence of events that leads to the bug. Perhaps the bug was reported by someone who can’t remember everything that he or she did. Alternatively, maybe the program was running for too long to emulate every input. In that case, do your best to reproduce the bug. It takes some guesswork, and can be time-consuming, but effort at this point will save time later in the debugging process. Here are some techniques you can try:

· Repeat the triggering action in the correct environment and with as many inputs as possible similar to the initial report.

· Do a quick review of the code related to the bug. More often than not, you’ll find a likely cause that will guide you in reproducing the problem.

· Run automated tests that exercise similar functionality. Reproducing bugs is one benefit of automated tests. If it takes 24 hours of testing before the bug shows up, it’s preferable to let those tests run on their own rather than spend 24 hours of your time trying to reproduce it.

· If you have the necessary hardware available, running slight variations of tests concurrently on different machines can sometimes save time.

· Run stress tests that exercise similar functionality. If your program is a web server that died on a particular request, try running as many browsers as possible simultaneously that make that request.

After you are able to reproduce the bug consistently, you should attempt to find the smallest sequence that triggers the bug. You can start with the minimum sequence, containing only the triggering action, and slowly expand the sequence to cover the entire sequence from startup until the bug is triggered. This will result in the simplest and most efficient test case to reproduce it, which makes it simpler to find the root cause of the problem and easier to verify the fix.
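The expansion strategy described above can be sketched in code. This is only a minimal sketch under stated assumptions, not a ready-made tool: it assumes that you recorded the user actions as strings and that you can supply a callback that replays a sequence from a fresh program start and reports whether the bug appeared. The names Replay, shortestTriggeringSuffix(), and reproducesBug are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Hypothetical replay hook: returns true if replaying the given actions
// from a fresh program start reproduces the bug.
using Replay = std::function<bool(const std::vector<std::string>&)>;

// Starts with only the triggering (last) action and keeps adding earlier
// actions until the bug shows up. Returns the shortest suffix of the
// recorded actions that reproduces the bug, or an empty vector if even
// the complete recording does not reproduce it.
std::vector<std::string> shortestTriggeringSuffix(
    const std::vector<std::string>& recorded, const Replay& reproducesBug)
{
    assert(reproducesBug && "a replay callback is required");

    for (std::size_t length = 1; length <= recorded.size(); ++length) {
        std::vector<std::string> candidate(
            recorded.end() - static_cast<std::ptrdiff_t>(length),
            recorded.end());
        if (reproducesBug(candidate)) {
            return candidate;
        }
    }
    return {};
}
```

With the request queue from the earlier web browser example, such a search would stop as soon as the candidate contains enough requests to overflow the queue, and everything recorded before that point can be dropped from the test case.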

Debugging Reproducible Bugs

When you can reproduce a bug consistently and efficiently, it’s time to figure out the problem in the code that causes the bug. Your goal at this point is to find the exact lines of code that trigger the problem. You can use two different strategies:

1. Logging debug messages: By adding enough debug messages to your program and watching its output when you reproduce the bug, you should be able to pinpoint the exact lines of code where the bug occurs. If you have a debugger at your disposal, adding debug messages is usually not recommended because it requires modifications to the program and can be time-consuming. However, if you have already instrumented your program with debug messages as described earlier, you might be able to find the root cause of your bug by running your program in debug mode while reproducing the bug. Note that bugs sometimes disappear simply by enabling logging because the act of enabling logging potentially changes the timings of your application slightly.

2. Using a debugger: Debuggers allow you to step through the execution of your program and to view the state of memory and the values of variables at various points. They are often indispensable tools for finding the root cause of bugs. When you have access to the source code, you will use a symbolic debugger: a debugger that utilizes the variable names, class names, and other symbols in your code. In order to use a symbolic debugger you must instruct your compiler to generate debug symbols.

The debugging example at the end of this chapter demonstrates both these approaches.

Debugging Nonreproducible Bugs

Fixing bugs that are not reproducible is significantly more difficult than fixing reproducible bugs. You often have very little information and must employ a lot of guesswork. Nevertheless, a few strategies can aid you.

1. Try to turn a nonreproducible bug into a reproducible bug. By using educated guesses, you can often determine approximately where the bug lies. It’s worthwhile to spend some time trying to reproduce the bug. Once you have a reproducible bug you can figure out its root cause by using the techniques described earlier.

2. Analyze error logs. This is easily done if you have instrumented your program with error log generation, as described earlier. You should sift through this information because any errors that were logged directly before the bug occurred are likely to have contributed to the bug itself. If you’re lucky (or if you coded your program well), your program will have logged the exact reason for the bug at hand.

3. Obtain and analyze traces. Again, this is easily done if you have instrumented your program with tracing output, for example via a ring buffer as described earlier. At the time of the bug’s occurrence, you hopefully obtained a copy of the traces. These traces should lead you right to the location of the bug in your code.

4. Examine a memory dump file, if it exists. Some platforms generate memory dump files of applications that terminate abnormally. On Unix and Linux these memory dumps are called core files. Each platform provides tools for analyzing these memory dumps. They can, for example, be used to generate a stack trace of the application, or to view the contents of its memory before the application died.

5. Inspect the code. Unfortunately, this is often the only strategy to determine the cause of a nonreproducible bug. Surprisingly, it often works. When you examine code, even code that you wrote yourself, with the perspective of the bug that just occurred, you can often find mistakes that you overlooked previously. I don’t recommend spending hours staring at your code, but tracing through the code path by hand can often lead you directly to the problem.

6. Use a memory-watching tool, such as one of those described in the “Debugging Memory Problems” section, which follows. Such tools often alert you to memory errors that don’t always cause your program to misbehave, but could potentially be the cause of the bug at hand.

7. File or update a bug report. Even if you can’t find the root cause of the bug right away, the report will be a useful record of your attempts if the problem is encountered again.

Once you have found the root cause of a nonreproducible bug, you should create a reproducible test case and move it to the “reproducible bugs” category. It is important to be able to reproduce a bug before you actually fix it. Otherwise, how will you test the fix? A common mistake when debugging nonreproducible bugs is to fix the wrong problem in the code. Because you can’t reproduce the bug, you don’t know if you’ve really fixed it, so don’t be surprised when it shows up again a month later.

Debugging Regressions

If a feature contains a regression bug, it means that the feature used to work correctly, but stopped working due to the introduction of a bug.

A useful debugging technique for investigating regressions is to have a look at the change log of relevant files. If you know at what time the feature was still working, look at all the change logs since that time. You might notice something suspicious that could lead you to the root cause.

Another approach that can save you a lot of time debugging regressions is to use a binary search approach with older versions of the software to try and figure out when it started to go wrong. You can use binaries of older versions if you keep those, or revert the source code to an older revision with your source code server. Once you know when it started to go wrong, inspect the change logs to see what has changed at that time. This mechanism is only possible when you can reproduce the bug.
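The binary search over older versions can be sketched as follows. This is a hedged illustration, not a prescribed procedure: it assumes that revisions are identified by increasing integers, that the regression stays present once it has been introduced, and that you can provide a callback that builds and tests a given revision. The names findFirstBrokenRevision() and isBroken are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Returns the first revision in the range (lastGood, firstBad] for which
// isBroken() returns true.
// Assumptions: lastGood is known to work, firstBad is known to be broken,
// and every revision after the first broken one is broken as well.
int findFirstBrokenRevision(int lastGood, int firstBad,
                            const std::function<bool(int)>& isBroken)
{
    assert(lastGood < firstBad);

    while (firstBad - lastGood > 1) {
        const int middle = lastGood + (firstBad - lastGood) / 2;
        if (isBroken(middle)) {
            firstBad = middle;   // The bug was introduced at or before middle.
        } else {
            lastGood = middle;   // The bug was introduced after middle.
        }
    }
    return firstBad;
}
```

Each call to isBroken() stands for checking out, building, and testing one revision, so halving the range on every step matters: 100 candidate revisions need at most 7 such checks instead of up to 100.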

Debugging Memory Problems

Most catastrophic bugs, such as application death, are caused by memory errors. Many noncatastrophic bugs are triggered by memory errors as well. Some memory bugs are obvious. If your program attempts to dereference a null pointer, the default action is to terminate the program. However, nearly every platform gives you the capability of responding to catastrophic errors and taking remedial action. The amount of effort you devote to this depends on the importance of this kind of recovery to your end users. For example, a text editor really needs to make a best attempt to save the modified buffers (possibly under a “recovered” name), while for other programs, users can find the default behavior acceptable, even if it is unpleasant.

Some memory bugs are more insidious. If you write past the end of an array in C++, your program will probably not crash directly at that point. However, if that array was on the stack, you may have written into a different variable or array, changing values that won’t show up until later in the program. Alternatively, if the array was on the heap, you could cause memory corruption in the heap, which will cause errors later when you attempt to allocate or free more memory dynamically.

Chapter 22 introduces some of the common memory errors from the perspective of what to avoid when you’re coding. This section discusses memory errors from the perspective of identifying problems in code that exhibits bugs. You should be familiar with the discussion in Chapter 22 before continuing with this section.

WARNING Most of the following memory problems can be avoided by using smart pointers instead of dumb pointers.

Categories of Memory Errors

In order to debug memory problems you should be familiar with the types of errors that can occur. This section describes the major categories of memory errors. Each memory error includes a small code example demonstrating the error and a list of possible symptoms that you might observe. Note that a symptom is not the same thing as a bug itself: A symptom is an observable behavior caused by a bug.

Memory-Freeing Errors

The following table summarizes five major errors involving freeing memory.

ERROR TYPE

SYMPTOMS

EXAMPLE

Memory leak

Process memory usage grows over time.
Process runs slower over time.
Eventually, operations and system calls fail because of lack of memory.

void memoryLeak()

{

int* p = new int[1000];

return; // Bug! Not freeing p.

}

Using mismatched allocation and free operations

Does not usually cause a crash immediately.
Can cause memory corruption on some platforms, which might show up as a crash later in the program.
Certain mismatches can also cause memory leaks.

void mismatchedFree()

{

int* p1 = (int*)malloc(sizeof(int));

int* p2 = new int;

int* p3 = new int[1000];

delete p1; // BUG! Should use free

delete[] p2; // BUG! Should use delete

free(p3); // BUG! Should use delete[]

}

Freeing memory more than once

Can cause a crash if the memory at that location has been handed out in another allocation between the two calls to delete.

void doubleFree()

{

int* p1 = new int[1000];

delete[] p1;

int* p2 = new int[1000];

delete[] p1; // BUG! freeing p1 twice

} // BUG! Leaking memory of p2

Freeing unallocated memory

Will usually cause a crash.

void freeUnallocated()

{

int* p = reinterpret_cast<int*>(10000);

delete p; // BUG! p not a valid pointer.

}

Freeing stack memory

Technically a special case of freeing unallocated memory. Will usually cause a crash.

void freeStack()

{

int x;

int* p = &x;

delete p; // BUG! Freeing stack memory

}

The crashes mentioned in this table can have different manifestations depending on your platform, such as segmentation faults, bus errors, or access violations.

As you can see, some of the errors do not cause immediate program termination. These bugs are more subtle, leading to problems later in the run of the program.
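To connect these freeing errors with the earlier warning about smart pointers, here is a minimal sketch of how std::unique_ptr and std::vector take over the freeing. The Tracked type and the noLeakOnEarlyReturn() function are hypothetical helpers invented for this illustration; Tracked merely counts live objects so that a leak would be observable.

```cpp
#include <memory>
#include <vector>

// Hypothetical helper that counts how many instances are currently alive.
struct Tracked
{
    static int live;
    Tracked() { ++live; }
    Tracked(const Tracked&) { ++live; }
    ~Tracked() { --live; }
};
int Tracked::live = 0;

void noLeakOnEarlyReturn(bool bailOut)
{
    // unique_ptr<Tracked[]> calls delete[] automatically, so there is no
    // way to forget the free or to pick the mismatched form of delete.
    std::unique_ptr<Tracked[]> block(new Tracked[1000]);

    // A vector owns its elements and frees them when it goes out of scope.
    std::vector<Tracked> items(10);

    if (bailOut) {
        return;  // Both block and items are freed here.
    }
    // ... work with block and items ...
}   // Both block and items are freed here as well.
```

After either path through the function, Tracked::live is back at zero. The same early return in memoryLeak() from the table leaks the whole array.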

Memory-Access Errors

The second category of memory errors involves the actual reading and writing of memory.

ERROR TYPE

SYMPTOMS

EXAMPLE

Accessing invalid memory

Almost always causes program to crash immediately.

void accessInvalid()

{

int* p = reinterpret_cast<int*>(10000);

*p = 5; // BUG! p is not a valid pointer.

}

Accessing freed memory

Does not usually cause a program crash.
If the memory has been handed out in another allocation, can cause “strange” values to appear unexpectedly.

void accessFreed()

{

int* p1 = new int;

delete p1;

int* p2 = new int;

*p1 = 5; // BUG! The memory pointed to

// by p1 has been freed.

}

Accessing memory in a different allocation

Does not cause a crash.
Can cause “strange” and potentially dangerous values to appear unexpectedly.

void accessElsewhere()

{

int x, y[10], z;

x = 0;

z = 0;

for (int i = 0; i <= 10; i++) {

y[i] = 5; // BUG for i==10! element 10

// is past end of array.

}

}

Reading uninitialized memory

Does not cause a crash unless you use the uninitialized value as a pointer and dereference it (as in the example). Even then, it will not always cause a crash.

void readUninitialized()

{

int* p;

cout << *p; // BUG! p is uninitialized

}

Memory-access errors don’t always cause a crash. They can instead lead to subtle errors, in which the program does not terminate but instead produces erroneous results. Erroneous results can lead to serious consequences; for example, when external devices (such as robotic arms, X-ray machines, radiation treatments, life support systems, etc.) are being controlled by the computer.

Note that the discussed symptoms for memory-freeing errors and memory-access errors are the default symptoms for release builds of your program. Debug builds will most likely behave differently, and when run inside a debugger, the debugger might break into the code when an error occurs.

Tips for Debugging Memory Errors

Memory-related bugs often show up in slightly different places in the code each time you run the program. This is usually the case with heap memory corruption. Heap memory corruption is like a time bomb, ready to explode at some attempt to allocate, free, or use memory on the heap. So, when you see a bug that is reproducible, but shows up in slightly different places, suspect memory corruption.

If you suspect a memory bug, your best option is to use a memory-checking tool for C++. Debuggers often provide options to run the program while checking for memory errors. Additionally, there are some excellent third-party tools such as Purify from Rational Software (now owned by IBM) or Valgrind for Linux (discussed in Chapter 22). Microsoft provides a free download called Application Verifier, which can be used in a Windows environment. It is a run-time verification tool to help you find subtle programming errors like the previously discussed memory errors. These debuggers and tools work by interposing their own memory-allocation and -freeing routines in order to check for any misuse of dynamic memory, such as freeing unallocated memory, dereferencing unallocated memory, or writing off the end of an array.
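To give a feel for what “interposing” allocation routines means, the following toy sketch records every block it hands out and refuses to free anything it does not know about. It is an assumption-laden illustration only: the AllocationTracker class is hypothetical, and real tools hook into the allocator at a much lower level and check far more than this.

```cpp
#include <cstddef>
#include <cstdlib>
#include <set>

// Toy stand-in for the bookkeeping that a memory-checking tool performs.
class AllocationTracker
{
public:
    // Allocates a block and remembers its address.
    void* allocate(std::size_t bytes)
    {
        void* p = std::malloc(bytes);
        if (p != nullptr) {
            mLive.insert(p);
        }
        return p;
    }

    // Frees a block only if it is currently outstanding. Returns false for
    // a double free or for a pointer that was never handed out, which is
    // the point where a real tool would report an error.
    bool release(void* p)
    {
        auto it = mLive.find(p);
        if (it == mLive.end()) {
            return false;
        }
        mLive.erase(it);
        std::free(p);
        return true;
    }

    // Number of blocks not yet released; nonzero at exit suggests a leak.
    std::size_t outstanding() const { return mLive.size(); }

private:
    std::set<void*> mLive;
};
```

Calling release() twice on the same pointer, or on the address of a stack variable, returns false instead of corrupting the heap, and outstanding() reveals blocks that were never released.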

If you don’t have a memory-checking tool at your disposal, and the normal strategies for debugging are not helping, you may need to resort to code inspection. First, narrow down the part of the code containing the bug. Then, as a general rule, look at all naked pointers. Provided that you work on moderate to good quality code, most pointers should already be wrapped in smart pointers. If you do encounter naked pointers, take a closer look at how they are used, because they might be the cause of the error. Here are some more items to look for in your code.

Object and Class-related Errors

· Verify that your classes with dynamically allocated memory have destructors that free exactly the memory that’s allocated in the object: no more, and no less.

· Ensure that your classes handle copying and assignment correctly with copy constructors and assignment operators, as described in Chapter 8. Make sure move constructors and move assignment operators properly set pointers in the source object to nullptr so that their destructors don’t try to free that memory.

· Check for suspicious casts. If you are casting a pointer to an object from one type to another, make sure that it’s valid. When possible, use dynamic_cast, so that the validity of the cast is checked at run time.
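When you inspect a class for the first two items in this list, it helps to have a correct reference in mind. The following is a minimal sketch of a class that owns a dynamically allocated array; IntBuffer is a hypothetical class written for this illustration, and using a temporary copy plus swap for copy assignment is one common way to implement it, not the only one.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class IntBuffer
{
public:
    explicit IntBuffer(std::size_t size)
        : mSize(size), mData(size != 0 ? new int[size]() : nullptr) {}

    // Frees exactly what the object owns.
    ~IntBuffer() { delete[] mData; }

    // Deep copy: the new object gets its own array.
    IntBuffer(const IntBuffer& src)
        : mSize(src.mSize),
          mData(src.mSize != 0 ? new int[src.mSize] : nullptr)
    {
        std::copy(src.mData, src.mData + src.mSize, mData);
    }

    // Copy into a temporary first, then swap; the old array is freed
    // when the temporary is destroyed.
    IntBuffer& operator=(const IntBuffer& rhs)
    {
        IntBuffer temp(rhs);
        swap(temp);
        return *this;
    }

    // Move: steal the array and reset the source so that its destructor
    // does not free the memory a second time.
    IntBuffer(IntBuffer&& src) noexcept
        : mSize(src.mSize), mData(src.mData)
    {
        src.mSize = 0;
        src.mData = nullptr;
    }

    IntBuffer& operator=(IntBuffer&& rhs) noexcept
    {
        if (this != &rhs) {
            delete[] mData;
            mSize = rhs.mSize;
            mData = rhs.mData;
            rhs.mSize = 0;
            rhs.mData = nullptr;
        }
        return *this;
    }

    std::size_t size() const { return mSize; }

    int& at(std::size_t i)
    {
        assert(i < mSize);
        return mData[i];
    }

private:
    void swap(IntBuffer& other) noexcept
    {
        std::swap(mSize, other.mSize);
        std::swap(mData, other.mData);
    }

    std::size_t mSize;
    int* mData;
};
```

A class under inspection that is missing one of these members, or that forgets to reset the source pointer in a move operation, is a good candidate for the double-free and freed-memory symptoms described earlier.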

General Memory Errors

· Make sure that every call to new is matched with exactly one call to delete. Similarly, every block obtained from malloc, calloc, or realloc should be released with exactly one call to free. And every call to new[] should be matched with one call to delete[]. To avoid freeing memory multiple times or using freed memory, it’s recommended to set your pointer to nullptr after freeing its memory.

· Check for buffer overruns. Anytime you iterate over an array or write into or read from a C-style string, verify that you are not accessing memory past the end of the array. These problems can often be avoided by using STL containers and strings.

· Check for dereferencing of invalid pointers.

· When declaring a pointer on the stack, make sure you always initialize it as part of its declaration; for example: T* p = nullptr; or T* p = new T; but never: T* p;

· Similarly, make sure your classes always initialize pointer data members in their constructors, by either allocating memory in the constructor or setting the pointers to nullptr.
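As a small illustration of the buffer overrun item, the off-by-one loop from accessElsewhere() can be rewritten with a checked container. This is a sketch under the assumption that a run-time check is acceptable for your code; fillReportsOverrun() is a hypothetical function name used only here.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>

// Writes the value 5 into the first `writes` elements of a 10-element
// array. Returns true if an out-of-bounds write was attempted and caught.
bool fillReportsOverrun(std::size_t writes)
{
    std::array<int, 10> y{};
    try {
        for (std::size_t i = 0; i < writes; ++i) {
            y.at(i) = 5;  // at() throws std::out_of_range when i >= 10.
        }
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}
```

With a C-style array and operator[], the eleventh write silently lands in neighboring memory. With at(), the same mistake is reported at the exact line that causes it.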

Debugging Multithreaded Programs

C++ includes a threading library that provides mechanisms for threading and synchronization between threads. This threading library is discussed in Chapter 23. Multithreaded C++ programs are common, so it is important to think about the special issues involved in debugging a multithreaded program. Bugs in multithreaded programs are often caused by variations in timings in the operating system scheduling, and can be difficult to reproduce. Thus, debugging multithreaded programs takes a special set of techniques:

1. Use a debugger: A debugger makes it relatively easy to diagnose certain multithreaded problems; for example, deadlocks. When the deadlock appears, break into the debugger and inspect the different threads. You will be able to see which threads are blocked and on which line in the code they are blocked. Combining this with trace logs that show you how you came into the deadlock situation should be enough to fix deadlocks.

2. Use log-based debugging: When debugging multithreaded programs, log-based debugging can sometimes be more effective than using a debugger to debug certain problems. You can add log statements to your program before and after critical sections, and before acquiring and after releasing locks. Log-based debugging is extremely useful to investigate race conditions. However, the act of adding log statements slightly changes run-time timings, which might hide the bug.

3. Insert forced sleeps and context switches: If you are having trouble reproducing the problem consistently, or have a hunch about the root cause but want to verify it, you can force certain thread-scheduling behavior by making your threads sleep for specified amounts of time. The <thread> header defines sleep_until() and sleep_for() in the std::this_thread namespace, which you can use to sleep. Sleeping for several seconds right before releasing a lock, immediately before signaling a condition variable, or directly before accessing shared data can reveal race conditions that would otherwise go undetected. If this debugging technique reveals the root cause, it must be fixed, so that it works correctly after removing these forced sleeps and context switches. Leaving these forced sleeps and context switches in place to “fix” the problem is wrong.

4. Perform code review: Reviewing your thread synchronization code often helps in fixing race conditions. Try to prove over and over that what happened is not possible, until you see how it is. It doesn’t hurt to write down these “proofs” in code comments. Also, ask a coworker to do pair debugging; he or she might see something you are overlooking.
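The forced-sleep technique from item 3 can be sketched as follows. This is a hypothetical example, not code from a real project: runWithdrawals() simulates two threads that each check a balance and then withdraw from it. The balance is a std::atomic<int> so that the sketch itself has no undefined behavior, but the check and the withdrawal are still two separate steps, which is the race being investigated.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Runs two competing withdrawals of 100 against a balance of 100 and
// returns the final balance. A correct program would end at 0.
int runWithdrawals(std::chrono::milliseconds forcedSleep)
{
    std::atomic<int> balance(100);

    auto withdraw = [&balance, forcedSleep]() {
        if (balance.load() >= 100) {
            // Debugging aid only: widen the window between the check and
            // the update so that the other thread can run in between.
            std::this_thread::sleep_for(forcedSleep);
            balance.fetch_sub(100);
        }
    };

    std::thread t1(withdraw);
    std::thread t2(withdraw);
    t1.join();
    t2.join();
    return balance.load();
}
```

Without the sleep, the second thread usually sees the updated balance and the result is usually 0, so the bug stays hidden. With a forced sleep of, say, 50 milliseconds, both threads usually pass the check before either one withdraws, and the balance ends at -100. Once the race is confirmed, the fix is to make the check and the update one atomic step (for example, by holding a mutex around both), after which the forced sleep is removed.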

Debugging Example: Article Citations

This section presents a buggy program and shows you the steps to take in order to debug it and fix the problem.

Suppose that you’re part of a team writing a web page that allows users to search for the research articles that cite a particular paper. This type of service is useful for authors who are trying to find work similar to their own. Once they find one paper representing a related work, they can look for every paper that cites that one to find other related work.

In this project, you are responsible for the code that reads the raw citations data from text files. For simplicity, assume that the citation info for each paper is found in its own file. Furthermore, assume that the first line of each file contains the author, title, and publication info for the paper; the second line is always empty; and all subsequent lines contain the citations from the article (one on each line). Here is an example file for one of the most important papers in computer science:

Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.

Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.

Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.

Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.

E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.

Buggy Implementation of an ArticleCitations Class

You decide to structure your program by writing an ArticleCitations class that reads the file and stores the information. This class stores the article info from the first line in one string, and the citations’ info in an array of strings. Please note that this design decision is a bad one. You should opt for one of the STL containers to store the citations. There are other obvious issues with this implementation, such as using int instead of size_t. However, for the purpose of illustrating buggy applications, it’s perfect. The class definition looks like this:

class ArticleCitations

{

public:

ArticleCitations(const std::string& fileName);

virtual ~ArticleCitations();

ArticleCitations(const ArticleCitations& src);

ArticleCitations& operator=(const ArticleCitations& rhs);

const std::string& getArticle() const { return mArticle; }

int getNumCitations() const { return mNumCitations; }

const std::string& getCitation(int i) const { return mCitations[i]; }

private:

void readFile(const std::string& fileName);

void copy(const ArticleCitations& src);

std::string mArticle;

std::string* mCitations;

int mNumCitations;

};

The implementation follows. This program is buggy! Don’t use it verbatim or as a model.

ArticleCitations::ArticleCitations(const string& fileName)

: mCitations(nullptr), mNumCitations(0)

{

// All we have to do is read the file.

readFile(fileName);

}

ArticleCitations::ArticleCitations(const ArticleCitations& src)

{

copy(src);

}

ArticleCitations& ArticleCitations::operator=(const ArticleCitations& rhs)

{

// Check for self-assignment.

if (this == &rhs) {

return *this;

}

// Free the old memory.

delete [] mCitations;

// Copy the data

copy(rhs);

return *this;

}

void ArticleCitations::copy(const ArticleCitations& src)

{

// Copy the article name, author, etc.

mArticle = src.mArticle;

// Copy the number of citations

mNumCitations = src.mNumCitations;

// Allocate an array of the correct size

mCitations = new string[mNumCitations];

// Copy each element of the array

for (int i = 0; i < mNumCitations; i++) {

mCitations[i] = src.mCitations[i];

}

}

ArticleCitations::~ArticleCitations()

{

delete [] mCitations;

}

void ArticleCitations::readFile(const string& fileName)

{

// Open the file and check for failure.

ifstream istr(fileName.c_str());

if (istr.fail()) {

throw invalid_argument("Unable to open file");

}

// Read the article author, title, etc. line.

getline(istr, mArticle);

// Skip the white space before the citations start.

istr >> ws;

int count = 0;

// Save the current position so we can return to it.

ios_base::streampos citationsStart = istr.tellg();

// First count the number of citations.

while (!istr.eof()) {

// Skip white space before the next entry.

istr >> ws;

string temp;

getline(istr, temp);

if (!temp.empty()) {

count++;

}

}

if (count != 0) {

// Allocate an array of strings to store the citations.

mCitations = new string[count];

mNumCitations = count;

// Seek back to the start of the citations.

istr.seekg(citationsStart);

// Read each citation and store it in the new array.

for (count = 0; count < mNumCitations; count++) {

string temp;

getline(istr, temp);

if (!temp.empty()) {

mCitations[count] = temp;

}

}

} else {

mNumCitations = -1;

}

}

Testing the ArticleCitations class

You decide to test your ArticleCitations class. The following program asks the user for a filename, constructs an ArticleCitations object with that filename, and passes the object by value to the processCitations() function, which prints out the info using the public accessor methods on the object:

void processCitations(ArticleCitations cit)

{

cout << cit.getArticle() << endl;

int num = cit.getNumCitations();

for (int i = 0; i < num; i++) {

cout << cit.getCitation(i) << endl;

}

}

int main()

{

string fileName;

while (true) {

cout << "Enter a file name (\"STOP\" to stop): ";

cin >> fileName;

if (fileName == "STOP") {

break;

}

// Test constructor

ArticleCitations cit(fileName);

processCitations(cit);

}

return 0;

}

Message-Based Debugging

You decide to test the program on the Alan Turing example (stored in a file called paper1.txt). Here is the output:

Enter a file name ("STOP" to stop): paper1.txt

Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.

[ 4 empty lines omitted for brevity ]

Enter a file name ("STOP" to stop): STOP

That doesn’t look right. There are supposed to be four citations printed instead of four blank lines.

For this bug, you decide to try log-based debugging, and because this is a console example, you decide to just print messages to cout. In this case, it makes sense to start by looking at the function that reads the citations from the file. If that doesn’t work right, then obviously the object won’t have the citations. You can modify readFile() as follows:

void ArticleCitations::readFile(const string& fileName)

{

// Code omitted for brevity

// First count the number of citations.

cout << "readFile(): counting number of citations" << endl;

while (!istr.eof()) {

// Skip white space before the next entry.

istr >> ws;

string temp;

getline(istr, temp);

if (!temp.empty()) {

cout << "Citation " << count << ": " << temp << endl;

count++;

}

}

cout << "Found " << count << " citations" << endl;

cout << "readFile(): reading citations" << endl;

if (count != 0) {

// Allocate an array of strings to store the citations.

mCitations = new string[count];

mNumCitations = count;

// Seek back to the start of the citations.

istr.seekg(citationsStart);

// Read each citation and store it in the new array.

for (count = 0; count < mNumCitations; count++) {

string temp;

getline(istr, temp);

if (!temp.empty()) {

cout << temp << endl;

mCitations[count] = temp;

}

}

} else {

mNumCitations = -1;

}

cout << "readFile(): finished" << endl;

}

Running the same test with this program gives the following output:

Enter a file name ("STOP" to stop): paper1.txt

readFile(): counting number of citations

Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.

Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.

Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.

Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.

Found 4 citations

readFile(): reading citations

readFile(): finished

Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.

[ 4 empty lines omitted for brevity ]

Enter a file name ("STOP" to stop): STOP

As you can see from the output, the first time the program reads the citations from the file, in order to count them, they are read correctly. However, the second time, they are not read correctly: nothing is printed between “readFile(): reading citations” and “readFile(): finished”. Why not? One way to delve deeper into this issue is to add some debugging code to check the state of the file stream after each attempt to read a citation:

void printStreamState(const istream& istr)

{

if (istr.good()) {

cout << "stream state is good" << endl;

}

if (istr.bad()) {

cout << "stream state is bad" << endl;

}

if (istr.fail()) {

cout << "stream state is fail" << endl;

}

if (istr.eof()) {

cout << "stream state is eof" << endl;

}

}

void ArticleCitations::readFile(const string& fileName)

{

// Code omitted for brevity

// First count the number of citations.

cout << "readFile(): counting number of citations" << endl;

while (!istr.eof()) {

// Skip white space before the next entry.

istr >> ws;

string temp;

getline(istr, temp);

printStreamState(istr);

if (!temp.empty()) {

cout << "Citation " << count << ": " << temp << endl;

count++;

}

}

cout << "Found " << count << " citations" << endl;

cout << "readFile(): reading citations" << endl;

if (count != 0) {

// Allocate an array of strings to store the citations.

mCitations = new string[count];

mNumCitations = count;

// Seek back to the start of the citations.

istr.seekg(citationsStart);

// Read each citation and store it in the new array.

for (count = 0; count < mNumCitations; count++) {

string temp;

getline(istr, temp);

printStreamState(istr);

if (!temp.empty()) {

cout << temp << endl;

mCitations[count] = temp;

}

}

} else {

mNumCitations = -1;

}

cout << "readFile(): finished" << endl;

}

When you run your program this time, you find some interesting information:

Enter a file name ("STOP" to stop): paper1.txt

readFile(): counting number of citations

stream state is good

Citation 0: Gödel, "Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme, I", Monatshefte Math. Phys., 38 (1931), 173-198.

stream state is good

Citation 1: Alonzo Church. "An unsolvable problem of elementary number theory", American J. of Math., 58 (1936), 345-363.

stream state is good

Citation 2: Alonzo Church. "A note on the Entscheidungsproblem", J. of Symbolic Logic, 1 (1936), 40-41.

stream state is good

Citation 3: E.W. Hobson, "Theory of functions of a real variable (2nd ed., 1921)", 87-88.

stream state is fail

stream state is eof

Found 4 citations

readFile(): reading citations

stream state is fail

stream state is fail

stream state is fail

stream state is fail

readFile(): finished

Alan Turing, "On Computable Numbers, with an Application to the Entscheidungsproblem", Proceedings of the London Mathematical Society, Series 2, Vol.42 (1936-37), 230-265.

[ 4 empty lines omitted for brevity ]

Enter a file name ("STOP" to stop): STOP

It looks like the stream state is good until after the final citation is read for the first time. Then, the stream state is both fail and eof: istr >> ws runs into the end of the file, and the getline() call that follows has nothing left to read. That is expected. What is not expected is that the stream state remains fail after every attempt to read the citations a second time. That doesn’t appear to make sense at first, because the code uses seekg() to seek back to the beginning of the citations before reading them a second time.

However, Chapter 12 explains that streams maintain their error states until you clear them explicitly; seekg() doesn’t clear the fail state automatically. While a stream is in an error state, read operations on it fail, which explains why the stream state is still fail after each attempt to read the citations a second time. A closer look at your method reveals that it fails to call clear() on the istream after reaching the end of the file. If you modify the method by adding a call to clear(), it will read the citations properly.
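The effect can be reproduced in isolation. The following is a minimal sketch, assuming an in-memory stream in place of the citations file; rereadFirstLine() is a hypothetical helper that is not part of the ArticleCitations class.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper: consumes the whole stream, seeks back to the start,
// and returns the first line read on the second pass (empty if that read fails).
std::string rereadFirstLine(std::istream& in, bool clearFirst)
{
    std::string line;
    while (std::getline(in, line)) {
        // Drain the stream; the final failed read leaves eofbit and failbit set.
    }
    if (clearFirst) {
        in.clear();   // Reset the error flags so that later operations can succeed.
    }
    in.seekg(0);      // Has no effect while failbit is still set.
    std::string first;
    std::getline(in, first);
    return first;
}
```

Called on a std::istringstream that holds two lines, the helper returns an empty string when clearFirst is false and the first line when clearFirst is true, which mirrors the behavior seen in readFile().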

Here is the corrected readFile() method without the debugging cout statements:

void ArticleCitations::readFile(const string& fileName)
{
    // Code omitted for brevity

    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the stream state.
        istr.clear();
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        for (count = 0; count < mNumCitations; count++) {
            string temp;
            getline(istr, temp);
            if (!temp.empty()) {
                mCitations[count] = temp;
            }
        }
    } else {
        mNumCitations = -1;
    }
}

Running the same test again on paper1.txt now shows you the correct four citations.
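As an aside, the count-then-seek design is what made the stale stream state matter. A minimal sketch of a single-pass alternative follows, assuming a std::vector is acceptable in place of the raw array used in this chapter; readCitations() is a hypothetical free function, not the book's method.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: collects every non-empty line remaining in the stream.
std::vector<std::string> readCitations(std::istream& in)
{
    std::vector<std::string> citations;
    std::string line;
    // getline() returns the stream, which converts to false once a read fails,
    // so the loop ends at end-of-file without testing eof() up front.
    while (std::getline(in >> std::ws, line)) {
        if (!line.empty()) {
            citations.push_back(line);
        }
    }
    return citations;
}
```

Because the stream is read only once, there is no need to seek back or to clear the error flags, and the number of citations is simply the size of the returned vector.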

Using the GDB Debugger on Linux

Now that your ArticleCitations class seems to work well on one citations file, you decide to blaze ahead and test some special cases, starting with a file with no citations. The file, named paper2.txt, looks like this:

Author with no citations

When you try to run your program on this file, depending on your Linux distribution and your compiler version, you might get a crash that looks something like the following:

Enter a file name ("STOP" to stop): paper2.txt

terminate called after throwing an instance of 'std::bad_alloc'

what(): std::bad_alloc

Aborted (core dumped)

The message “core dumped” means that the program crashed. This time you decide to give the debugger a shot. The GNU Debugger (gdb) is widely available on Unix and Linux platforms. First, you must compile your program with debugging information (the -g option with g++). Then you can launch the program under gdb. Here’s an example session using the debugger to find the root cause of this problem. This example assumes your compiled executable is called buggyprogram. The text that you type follows the shell prompt (>), the (gdb) prompt, or the program’s own file-name prompt.

> gdb buggyprogram

[ Start-up messages omitted for brevity ]

Reading symbols from /home/marc/c++/gdb/buggyprogram...done.

(gdb) run

Starting program: buggyprogram

Enter a file name ("STOP" to stop): paper2.txt

terminate called after throwing an instance of 'std::bad_alloc'

what(): std::bad_alloc

Program received signal SIGABRT, Aborted.

0x00007ffff7535c39 in raise () from /lib64/libc.so.6

(gdb)

When the program crashes, the debugger breaks the execution, and allows you to poke around in the state of the program at that time. The backtrace or bt command shows the current stack trace. The last operation is at the top, with frame number zero, #0:

(gdb) bt

#0 0x00007ffff7535c39 in raise () from /lib64/libc.so.6

#1 0x00007ffff7537348 in abort () from /lib64/libc.so.6

#2 0x00007ffff7b35f85 in __gnu_cxx::__verbose_terminate_handler() () from /lib64/libstdc++.so.6

#3 0x00007ffff7b33ee6 in ?? () from /lib64/libstdc++.so.6

#4 0x00007ffff7b33f13 in std::terminate() () from /lib64/libstdc++.so.6

#5 0x00007ffff7b3413f in __cxa_throw () from /lib64/libstdc++.so.6

#6 0x00007ffff7b346cd in operator new(unsigned long) () from /lib64/libstdc++.so.6

#7 0x00007ffff7b34769 in operator new[](unsigned long) () from /lib64/libstdc++.so.6

#8 0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=...) at ArticleCitations.cpp:40

#9 0x00000000004015b5 in ArticleCitations::ArticleCitations (this=0x7fffffffe090, src=...)

at ArticleCitations.cpp:16

#10 0x0000000000401d0c in main () at ArticleCitationsTest.cpp:20

When you get a stack trace like this, you should try to find the first stack frame from the top that is in your own code. In this example, this is stack frame #8. From this frame you can see that there seems to be some sort of problem in the copy() method of ArticleCitations. This method is invoked because main() calls processCitations() and passes the argument by value, which triggers a call to the copy constructor, which in turn calls copy(). Of course, in production code you should pass a const reference, but pass-by-value is used for this example of a buggy program. You can tell the debugger to switch to stack frame #8 with the frame command, which requires the index of the frame to jump to:

(gdb) frame 8

#8 0x00000000004016ea in ArticleCitations::copy (this=0x7fffffffe090, src=...) at ArticleCitations.cpp:40

40 mCitations = new string[mNumCitations];

This output shows that the following line caused a problem:

mCitations = new string[mNumCitations];

Now, use the list command to show the code in the current stack frame around the offending line:

(gdb) list

35 // Copy the article name, author, etc.

36 mArticle = src.mArticle;

37 // Copy the number of citations

38 mNumCitations = src.mNumCitations;

39 // Allocate an array of the correct size

40 mCitations = new string[mNumCitations];

41 // Copy each element of the array

42 for (int i = 0; i < mNumCitations; i++) {

43 mCitations[i] = src.mCitations[i];

44 }

In gdb, you can print values available in the current scope with the print command. In order to find the root cause of the problem, you can try printing some of the variables. The error happens inside the copy() method, so checking the value of the src parameter is a good start:

(gdb) print src

$1 = (const ArticleCitations &) @0x7fffffffe060: {

_vptr.ArticleCitations = 0x401fb0 <vtable for ArticleCitations+16>,

mArticle = "Author with no citations", mCitations = 0x7fffffffe080, mNumCitations = -1}

Ah-ha! Here’s the problem. This article isn’t supposed to have any citations. Why is mNumCitations set to the strange value -1? Take another look at the code in readFile() for the case in which there are no citations. In that case, mNumCitations is erroneously set to -1. The fix is easy: initialize mNumCitations to 0 instead of setting it to -1 when there are no citations. There is another problem: readFile() can be called multiple times on the same ArticleCitations object, so you also need to free any previously allocated mCitations array. Here is the fixed code:

void ArticleCitations::readFile(const string& fileName)
{
    // Code omitted for brevity

    delete [] mCitations;  // Free previously allocated citations.
    mCitations = nullptr;
    mNumCitations = 0;
    if (count != 0) {
        // Allocate an array of strings to store the citations.
        mCitations = new string[count];
        mNumCitations = count;
        // Clear the previous eof.
        istr.clear();
        // Seek back to the start of the citations.
        istr.seekg(citationsStart);
        // Read each citation and store it in the new array.
        // Code omitted for brevity
    }
}
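The backtrace shows operator new[](unsigned long) throwing, which is consistent with the count of -1 being treated as a huge unsigned size. As a complement to the fix, the following is a minimal sketch of a guard that would have reported the bad count at its source; allocateCitations() is a hypothetical helper, not part of the ArticleCitations class from this chapter.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper: allocates storage for count citations, rejecting a
// negative count before it ever reaches operator new[].
std::string* allocateCitations(int count)
{
    if (count < 0) {
        throw std::invalid_argument("citation count must not be negative");
    }
    // No allocation is needed for an empty citation list.
    return (count == 0) ? nullptr : new std::string[count];
}
```

With a check like this, the failure would point directly at the invalid count instead of surfacing later as a std::bad_alloc inside the copy constructor. The caller remains responsible for releasing the array with delete [].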

As this example shows, bugs don’t always show up right away. It often takes a debugger and some persistence to figure them out.

NOTE If you attempt to replicate this debugging session on a different platform, you may find that, due to the vagaries of memory errors, the program crashes in a different place than this example shows.
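The stack trace in this section reached copy() only because processCitations() takes its argument by value. The following minimal sketch illustrates the difference between pass-by-value and pass-by-const-reference by counting copy-constructor calls; CopyCounter and the two functions are hypothetical names that do not appear in the chapter's code.

```cpp
#include <cassert>

// Hypothetical type that counts how often its copy constructor runs.
struct CopyCounter
{
    static int copies;
    CopyCounter() = default;
    CopyCounter(const CopyCounter&) { ++copies; }
};

int CopyCounter::copies = 0;

// Receives its own copy of the argument, so the copy constructor runs.
void takeByValue(CopyCounter) { }

// Binds directly to the caller's object, so no copy is made.
void takeByConstRef(const CopyCounter&) { }
```

Passing an existing CopyCounter object to takeByValue() increments the counter, whereas passing it to takeByConstRef() leaves the counter unchanged.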

Using the Visual C++ 2013 Debugger

This section explains the same debugging procedure as described in the previous section, but uses the Microsoft Visual C++ 2013 debugger instead of gdb.

First, you need to create a project. Start VC++ and click File > New > Project. In the project template tree on the left, select Visual C++ > Win32. Then select the Win32 Console Application template in the list in the middle of the window. At the bottom, you can specify a name for the project and a location in which to save it. Specify ArticleCitations as the name, choose a folder in which to save the project, and click OK. A wizard opens. Click Next, select Console application and Empty Project, and click Finish.

Once your new project is loaded, you can see a list of project files in the Solution Explorer. If this docking window is not visible, go to View > Solution Explorer. Right-click the ArticleCitations project in the Solution Explorer and click Add > Existing Item. Add all the files from the ArticleCitations\05_VisualStudio folder in the downloadable code archive to the project. Your Solution Explorer should look similar to Figure 26-1.


FIGURE 26-1

Now you can compile the program; click Build > Build Solution. Copy the paper1.txt and paper2.txt test files to your ArticleCitations project folder, which is the folder containing the ArticleCitations.vcxproj file.

Run the application with Debug > Start Debugging, and test the program by first specifying the paper1.txt file. It should properly read the file and output the result to the console. Then, type paper2.txt. A Microsoft Visual C++ Runtime Library message will be displayed with three buttons: Abort, Retry, and Ignore. Click Retry, which causes the VC++ debugger to break the execution. You will get a message saying “ArticleCitations.exe has triggered a breakpoint.” in which you need to click Break.

At this point, you should inspect the call stack by selecting Debug > Windows > Call Stack. In this call stack, you need to find the first line that contains code that you wrote. This is shown in Figure 26-2.


FIGURE 26-2

Just as with gdb, you see that the problem is in copy(). You can double-click that line in the call stack window to jump to the right place in the code. If you only see disassembly code, right-click anywhere on the disassembly and select Go To Source Code. Then click Debug > Windows > Autos to inspect variables. In the list of variables you can find src. Click the plus sign to expand the data members of the src variable. Figure 26-3 shows how it looks.


FIGURE 26-3

From this window, you see that mNumCitations is -1. The reason and the fix are exactly the same as earlier.

Lessons from the ArticleCitations Example

You might be inclined to disregard this example as too small to be representative of real debugging. Although the buggy code is not lengthy, many classes that you write will not be much bigger, even in large projects. Imagine if you had failed to test this example thoroughly before integrating it with the rest of the project. If these bugs showed up later, you and other engineers would have to spend more time narrowing down the problem before you could debug it as shown here. Additionally, the techniques shown in this example apply to all debugging, large or small scale.

SUMMARY

The most important concept in this chapter was the Fundamental Law of Debugging: avoid bugs when you’re coding, but plan for bugs in your code. The reality of programming is that bugs will appear. If you’ve prepared your program properly, with error logging, debug traces, assertions, and static assertions, then the actual debugging will be significantly easier.

In addition to these techniques, this chapter also presented specific approaches for debugging bugs. The most important rule when actually debugging is to reproduce the problem. Then, you can use a symbolic debugger or log-based debugging to track down the root cause. Memory errors present particular difficulties, and account for the majority of bugs in C++ code. This chapter described the various categories of memory bugs and their symptoms, and showed examples of debugging errors in a program.

Debugging techniques are a great way to end your journey through Professional C++. By thinking through your designs, experimenting with different approaches in object-oriented programming, selectively adding new techniques to your coding repertoire, and practicing debugging techniques, you’ll be able to take your C++ skills to the professional level.