Professional C++ (2014)

Part III: Coding the Professional Way

Chapter 10: C++ Quirks, Oddities, and Incidentals

WHAT’S IN THIS CHAPTER?

· What the different use-cases are for references

· Keyword confusion

· How to use typedefs and type aliases

· What scope resolution is

· Details of features that do not fit elsewhere in this book

WROX.COM DOWNLOADS FOR THIS CHAPTER

Please note that all the code examples for this chapter are available as a part of this chapter’s code download on the book’s website at www.wrox.com/go/proc++3e on the Download Code tab.

Many parts of the C++ language have tricky syntax or quirky semantics. As a C++ programmer, you grow accustomed to most of this idiosyncratic behavior; it starts to feel natural. However, some aspects of C++ are a source of perennial confusion. Either books never explain them thoroughly enough, or you forget how they work and continually look them up, or both. This chapter addresses this gap by providing clear explanations for some of C++’s most niggling quirks and oddities.

Many language idiosyncrasies are covered in various chapters throughout this book. This chapter tries not to repeat those topics by limiting itself to subjects that are not covered in detail elsewhere in the book. There is a bit of redundancy with other chapters, but the material is “sliced” in a different way in order to provide you with a new perspective.

The topics of this chapter include references, const, constexpr, static, extern, typedefs, type aliases, casts, scope resolution, uniform initialization, initializer lists, explicit conversion operators, attributes, user-defined literals, header files, variable-length argument lists, and preprocessor macros. Although this list might appear to be a hodgepodge of topics, it is a carefully selected collection of features and confusing aspects of the language.

REFERENCES

Professional C++ code, including much of the code in this book, uses references extensively. It is helpful to step back and think about what exactly references are, and how they behave.

A reference in C++ is an alias for another variable. All modifications to the reference change the value of the variable to which it refers. You can think of references as implicit pointers that save you the trouble of taking the address of variables and dereferencing the pointer. Alternatively, you can think of references as just another name for the original variable. You can create stand-alone reference variables, use reference data members in classes, accept references as parameters to functions and methods, and return references from functions and methods.

Reference Variables

Reference variables must be initialized as soon as they are created, like this:

int x = 3;

int& xRef = x;

After this initialization, xRef is another name for x. Any use of xRef uses the current value of x. Any assignment to xRef changes the value of x. For example, the following code sets x to 10 through xRef:

xRef = 10;

You cannot declare a reference variable outside of a class without initializing it:

int& emptyRef; // DOES NOT COMPILE!

WARNING You must always initialize a reference when it is created. Usually, references are created when they are declared, but reference data members need to be initialized in the constructor initializer for the containing class.

You cannot create a reference to an unnamed value, such as an integer literal, unless the reference is to a const value. In the following example, unnamedRef1 will not compile because it is a non-const reference to a constant. That would mean you could change the value of the constant, 5, which doesn’t make sense. unnamedRef2 works because it’s a const reference, so you cannot write “unnamedRef2 = 7”.

int& unnamedRef1 = 5; // DOES NOT COMPILE

const int& unnamedRef2 = 5; // Works

Modifying References

A reference always refers to the same variable to which it is initialized; references cannot be changed once they are created. This rule leads to some confusing syntax. If you “assign” a variable to a reference when the reference is declared, the reference refers to that variable. However, if you assign a variable to a reference after that, the variable to which the reference refers is changed to the value of the variable being assigned. The reference is not updated to refer to that variable. Here is a code example:

int x = 3, y = 4;

int& xRef = x;

xRef = y; // Changes value of x to 4. Doesn't make xRef refer to y.

You might try to circumvent this restriction by taking the address of y when you assign it:

int x = 3, y = 4;

int& xRef = x;

xRef = &y; // DOES NOT COMPILE!

This code does not compile. The address of y is a pointer, but xRef is declared as a reference to an int, not a reference to a pointer.

Some programmers go even further in attempts to circumvent the intended semantics of references. What if you assign a reference to a reference? Won’t that make the first reference refer to the variable to which the second reference refers? You might be tempted to try this code:

int x = 3, z = 5;

int& xRef = x;

int& zRef = z;

zRef = xRef; // Assigns values, not references

The final line does not change zRef. Instead, it sets the value of z to 3, because xRef refers to x, which is 3.

WARNING You cannot change the variable to which a reference refers after it is initialized; you can change only the value of that variable.
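The rules in this section can be checked with a small sketch. The helper sameObject() and the function referenceRebindingDemo() are hypothetical names introduced only for this illustration; each assert restates one claim from the text.

```cpp
#include <cassert>

// Hypothetical helper: true if both references name the same object.
bool sameObject(const int& a, const int& b)
{
    return &a == &b;
}

// Hypothetical demo function; every assert restates a claim from the text.
void referenceRebindingDemo()
{
    int x = 3, y = 4;
    int& xRef = x;                 // xRef is bound to x for its whole lifetime

    xRef = y;                      // copies the value of y into x
    assert(x == 4);                // x changed...
    assert(sameObject(xRef, x));   // ...and xRef still names x,
    assert(!sameObject(xRef, y));  // not y

    y = 99;                        // a later change to y does not affect x
    assert(xRef == 4);
}
```

Comparing addresses works here because taking the address of a reference yields the address of the variable it refers to.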

References to Pointers and Pointers to References

You can create references to any type, including pointer types. Here is an example of a reference to a pointer to int:

int* intP;

int*& ptrRef = intP;

ptrRef = new int;

*ptrRef = 5;

The syntax is a little strange: You might not be accustomed to seeing * and & right next to each other. However, the semantics are straightforward: ptrRef is a reference to intP, which is a pointer to int. Modifying ptrRef changes intP. References to pointers are rare, but can occasionally be useful, as discussed in the “Reference Parameters” section later in this chapter.

Note that taking the address of a reference gives the same result as taking the address of the variable to which the reference refers. For example:

int x = 3;

int& xRef = x;

int* xPtr = &xRef; // Address of a reference is pointer to value

*xPtr = 100;

This code sets xPtr to point to x by taking the address of a reference to x. Assigning 100 to *xPtr changes the value of x to 100. Writing a comparison “xPtr == xRef” will not compile because of a type mismatch; xPtr is a pointer to an int while xRef is a reference to an int. The comparisons “xPtr == &xRef” and “xPtr == &x” both compile without errors and are both true.

Finally, note that you cannot declare a reference to a reference, or a pointer to a reference.
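As a minimal sketch of the two ideas in this section, the following code modifies a pointer through a reference to it and takes the address of a reference. The names retarget() and pointerReferenceDemo() are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical helper: changes the caller's pointer through a reference to it.
void retarget(int*& p, int& target)
{
    p = &target;
}

void pointerReferenceDemo()
{
    int a = 1, b = 2;
    int* ptr = &a;

    int*& ptrRef = ptr;    // reference to a pointer to int
    ptrRef = &b;           // modifies ptr itself
    assert(ptr == &b);

    retarget(ptr, a);      // same effect through a reference parameter
    assert(ptr == &a);

    int& aRef = a;
    int* viaRef = &aRef;   // address of a reference is the address of its referent
    assert(viaRef == &a);
    *viaRef = 100;
    assert(a == 100);
}
```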

Reference Data Members

As Chapter 8 explains, data members of classes can be references. A reference cannot exist without referring to some other variable. Thus, you must initialize reference data members in the constructor initializer, not in the body of the constructor. The following is a quick example:

class MyClass

{

public:

MyClass(int& ref) : mRef(ref) {}

private:

int& mRef;

};

Consult Chapter 8 for details.
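To see a reference data member in use, consider the following sketch. The Incrementer class is hypothetical and is not taken from Chapter 8; it only illustrates that the member must be bound in the constructor initializer and that it then aliases the caller's variable.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class: refers to a counter owned by someone else.
class Incrementer
{
public:
    // The reference member must be bound here, in the initializer.
    explicit Incrementer(int& counter) : mCounter(counter) {}

    void tick() { ++mCounter; }   // modifies the caller's variable

private:
    int& mCounter;
};

// A reference member cannot be rebound, so the compiler-generated
// copy assignment operator is defined as deleted.
static_assert(!std::is_copy_assignable<Incrementer>::value,
              "classes with reference members are not copy-assignable by default");

void referenceMemberDemo()
{
    int hits = 0;
    Incrementer inc(hits);
    inc.tick();
    inc.tick();
    assert(hits == 2);
}
```

The static_assert points out a practical consequence: because a reference can never be reseated, objects of such a class cannot be assigned to each other unless you write the assignment operator yourself.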

Reference Parameters

C++ programmers do not often use stand-alone reference variables or reference data members. The most common use of references is for parameters to functions and methods. Recall that the default parameter-passing semantics are pass-by-value: Functions receive copies of their arguments. When those parameters are modified, the original arguments remain unchanged. References allow you to specify pass-by-reference semantics for arguments passed to the function. When you use reference parameters, the function receives references to the function arguments. If those references are modified, the changes are reflected in the original argument variables. For example, here is a simple swap function to swap the values of two ints:

void swap(int& first, int& second)

{

int temp = first;

first = second;

second = temp;

}

You can call it like this:

int x = 5, y = 6;

swap(x, y);

When the function swap() is called with the arguments x and y, the first parameter is initialized to refer to x, and the second parameter is initialized to refer to y. When swap() modifies first and second, x and y are actually changed.

Just as you can’t initialize normal reference variables with constants, you can’t pass constants as arguments to functions that employ pass-by-reference:

swap(3, 4); // DOES NOT COMPILE

NOTE Using rvalue references, it is possible to pass constants as arguments to functions that employ pass-by-rvalue-reference. Rvalue references are discussed later in this chapter.

References from Pointers

A common quandary arises when you have a pointer to something that you need to pass to a function or method that takes a reference. You can “convert” a pointer to a reference in this case by dereferencing the pointer. This action gives you the value to which the pointer points, which the compiler then uses to initialize the reference parameter. For example, you can call swap() like this:

int x = 5, y = 6;

int *xp = &x, *yp = &y;

swap(*xp, *yp);

Pass-by-Reference Versus Pass-by-Value

Pass-by-reference is required when you want to modify the parameter and see those changes reflected in the variable passed to the function or method. However, you should not limit your use of pass-by-reference to only those cases. Pass-by-reference avoids copying the arguments to the function, providing two additional benefits in some cases:

1. Efficiency: Large objects and structs could take a long time to copy. Pass-by-reference passes only a pointer to the object or struct into the function.

2. Correctness: Not all objects allow pass-by-value. Even those that do allow it might not support deep copying correctly. As Chapter 8 explains, objects with dynamically allocated memory must provide a custom copy constructor in order to support deep copying.

If you want to leverage these benefits, but do not want to allow the original objects to be modified, you can mark the parameters const, giving you pass-by-const-reference. This topic is covered in detail later in this chapter.

These benefits to pass-by-reference imply that you should use pass-by-value only for simple built-in types like int and double for which you don’t need to modify the arguments. Use pass-by-reference in all other cases.
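The guideline above might look as follows in practice. This is only a sketch; the functions square(), sum(), appendSuffix(), and parameterPassingDemo() are hypothetical examples, each chosen to show one parameter-passing style.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Pass-by-value: fine for a small built-in type that is not modified.
int square(int n) { return n * n; }

// Pass-by-const-reference: no copy is made, and the function
// cannot modify the caller's vector.
long long sum(const std::vector<int>& values)
{
    long long total = 0;
    for (int v : values) {
        total += v;
    }
    return total;
}

// Pass-by-reference: the function is meant to modify its first argument.
void appendSuffix(std::string& text, const std::string& suffix)
{
    text += suffix;
}

void parameterPassingDemo()
{
    std::vector<int> data = {1, 2, 3};
    assert(sum(data) == 6);
    assert(data.size() == 3);      // data is untouched

    std::string name = "report";
    appendSuffix(name, ".txt");    // ".txt" binds to the const reference
    assert(name == "report.txt");

    assert(square(4) == 16);
}
```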

Reference Return Values

You can also return a reference from a function or method. The main reason to do so is efficiency. Instead of returning a whole object, return a reference to the object to avoid copying it unnecessarily. Of course, you can only use this technique if the object in question will continue to exist following the function termination.

WARNING From a function or method, never return a reference to a variable that is locally scoped to that function or method, such as an automatically allocated variable on the stack that will be destroyed when the function ends.

If the type you want to return from your function supports move semantics, discussed later in this chapter, then returning it by value is almost as efficient as returning a reference.

A second reason to return a reference is if you want to be able to assign to the return value directly as an lvalue (the left-hand side of an assignment statement). Several overloaded operators commonly return references. Chapter 8 shows some examples, and you can read about more applications of this technique in Chapter 14.
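As a brief sketch of returning a reference so that the result can be assigned to, consider the hypothetical Scores class below. It is an illustration only, not one of the examples from Chapter 8 or Chapter 14. The returned references stay valid because they refer to elements owned by the object, not to local variables.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class that hands out references to its own elements.
class Scores
{
public:
    explicit Scores(std::size_t count) : mValues(count, 0) {}

    // Non-const version: the caller can assign through the result.
    int& at(std::size_t i) { return mValues.at(i); }

    // Const version: read-only access, still without copying.
    const int& at(std::size_t i) const { return mValues.at(i); }

private:
    std::vector<int> mValues;
};

void referenceReturnDemo()
{
    Scores scores(3);
    scores.at(1) = 42;             // the return value is used as an lvalue
    const Scores& readOnly = scores;
    assert(readOnly.at(1) == 42);
    assert(readOnly.at(0) == 0);
}
```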

Deciding between References and Pointers

References in C++ could be considered redundant: almost everything you can do with references, you can accomplish with pointers. For example, you could write the previously shown swap() function like this:

void swap(int* first, int* second)

{

int temp = *first;

*first = *second;

*second = temp;

}

However, this code is more cluttered than the version with references: References make your programs cleaner and easier to understand. References are also safer than pointers: A reference in a well-formed program is never null, and you don’t explicitly dereference references, so you can’t encounter the dereferencing errors associated with pointers. These safety arguments hold only as long as no pointers are involved, though. For example, take the following function that accepts a reference to an int:

void refcall(int& t) { ++t; }

You could declare a pointer and initialize it to point to some arbitrary place in memory. Then you could dereference this pointer and pass it as the reference argument to refcall(), as in the following code. This code compiles, but its behavior is undefined; in practice it will most likely crash on execution.

int* ptr = (int*)8;

refcall(*ptr);

Most of the time, you can use references instead of pointers. References to objects even support polymorphism in the same way as pointers to objects. One case in which you need to use a pointer is when you need to change the location to which it points. Recall that you cannot change the variable to which a reference refers. For example, when you dynamically allocate memory, you need to store the result in a pointer rather than a reference. A second use-case in which you need to use a pointer is for optional parameters. A pointer parameter can, for example, be defined as optional with a default value of nullptr, something that is not possible with a reference parameter.

Another way to distinguish between appropriate use of pointers and references in parameters and return types is to consider who owns the memory. If the code receiving the variable becomes the owner and thus becomes responsible for releasing the memory associated with an object, it must receive a pointer to the object, or better yet a smart pointer, which is the recommended way to transfer ownership. If the code receiving the variable should not free the memory, it should receive a reference.

NOTE Use references instead of pointers unless you need to change what is being referred to.
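The optional-parameter use-case mentioned above can be sketched as follows. The function divide() and its remainder parameter are hypothetical; the point is only that a pointer parameter can default to nullptr, which a reference parameter cannot.

```cpp
#include <cassert>

// Hypothetical function: returns the quotient and, only if the caller
// asks for it, stores the remainder through the optional pointer.
int divide(int dividend, int divisor, int* remainder = nullptr)
{
    if (remainder != nullptr) {
        *remainder = dividend % divisor;
    }
    return dividend / divisor;
}

void optionalParameterDemo()
{
    assert(divide(17, 5) == 3);        // remainder not requested

    int rest = 0;
    assert(divide(17, 5, &rest) == 3); // remainder requested
    assert(rest == 2);
}
```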

Consider a function that splits an array of ints into two arrays: one of even numbers and one of odd numbers. The function doesn’t know how many numbers in the source array will be even or odd, so it should dynamically allocate the memory for the destination arrays after examining the source array. It should also return the sizes of the two new arrays. Altogether, there are four items to return: pointers to the two new arrays and the sizes of the two new arrays. Obviously, you must use pass-by-reference. The canonical C way to write the function looks like this:

void separateOddsAndEvens(const int arr[], int size, int** odds,

int* numOdds, int** evens, int* numEvens)

{

// Count the number of odds and evens

*numOdds = *numEvens = 0;

for (int i = 0; i < size; ++i) {

if (arr[i] % 2 != 0) { // != 0 also classifies negative odd numbers correctly

++(*numOdds);

} else {

++(*numEvens);

}

}

// Allocate two new arrays of the appropriate size.

*odds = new int[*numOdds];

*evens = new int[*numEvens];

// Copy the odds and evens to the new arrays

int oddsPos = 0, evensPos = 0;

for (int i = 0; i < size; ++i) {

if (arr[i] % 2 != 0) {

(*odds)[oddsPos++] = arr[i];

} else {

(*evens)[evensPos++] = arr[i];

}

}

}

The final four parameters to the function are the “reference” parameters. In order to change the values to which they refer, separateOddsAndEvens() must dereference them, leading to some ugly syntax in the function body. Additionally, when you want to call separateOddsAndEvens(), you must pass the address of two pointers so that the function can change the actual pointers, and the address of two ints so that the function can change the actual ints:

int unSplit[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

int *oddNums, *evenNums;

int numOdds, numEvens;

separateOddsAndEvens(unSplit, 10, &oddNums, &numOdds, &evenNums, &numEvens);

If such syntax annoys you (which it should), you can write the same function by using references to obtain true pass-by-reference semantics:

void separateOddsAndEvens(const int arr[], int size, int*& odds,

int& numOdds, int*& evens, int& numEvens)

{

numOdds = numEvens = 0;

for (int i = 0; i < size; ++i) {

if (arr[i] % 2 != 0) { // != 0 also classifies negative odd numbers correctly

++numOdds;

} else {

++numEvens;

}

}

odds = new int[numOdds];

evens = new int[numEvens];

int oddsPos = 0, evensPos = 0;

for (int i = 0; i < size; ++i) {

if (arr[i] % 2 != 0) {

odds[oddsPos++] = arr[i];

} else {

evens[evensPos++] = arr[i];

}

}

}

In this case, the odds and evens parameters are references to int*s. separateOddsAndEvens() can modify the int*s that are used as arguments to the function (through the reference), without any explicit dereferencing. The same logic applies to numOdds and numEvens, which are references to ints. With this version of the function, you no longer need to pass the addresses of the pointers or ints. The reference parameters handle it for you automatically:

int unSplit[10] = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

int *oddNums, *evenNums;

int numOdds, numEvens;

separateOddsAndEvens(unSplit, 10, oddNums, numOdds, evenNums, numEvens);

It’s recommended to avoid dynamically allocated arrays as much as possible. For example, by using the STL vector container, the previous separateOddsAndEvens() can be rewritten to be safer and more elegant, because all memory allocation and deallocation happens automatically:

void separateOddsAndEvens(const vector<int>& arr,

vector<int>& odds, vector<int>& evens)

{

int numOdds = 0, numEvens = 0;

for (auto& i : arr) {

if (i % 2 != 0) { // != 0 also classifies negative odd numbers correctly

++numOdds;

} else {

++numEvens;

}

}

odds.reserve(numOdds);

evens.reserve(numEvens);

for (auto& i : arr) {

if (i % 2 != 0) {

odds.push_back(i);

} else {

evens.push_back(i);

}

}

}

This version can be used as follows:

vector<int> vecUnSplit = {1, 2, 3, 4, 5, 6, 7, 8, 9, 10};

vector<int> odds, evens;

separateOddsAndEvens(vecUnSplit, odds, evens);

The STL vector container is discussed in detail in Chapter 16.
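Because vector supports move semantics, returning the results by value is another option worth considering. The following is a sketch of such a variant, not the book's implementation; the name splitOddsAndEvens() and the choice of std::pair as return type are assumptions made for this illustration.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical variant: returns {odds, evens} instead of using
// reference parameters for the output.
std::pair<std::vector<int>, std::vector<int>>
splitOddsAndEvens(const std::vector<int>& arr)
{
    std::pair<std::vector<int>, std::vector<int>> result;
    for (int value : arr) {
        if (value % 2 != 0) {      // also true for negative odd numbers
            result.first.push_back(value);
        } else {
            result.second.push_back(value);
        }
    }
    return result;                 // returned by value; no reference parameters needed
}

void splitDemo()
{
    std::vector<int> input = {1, 2, 3, 4, -5};
    auto parts = splitOddsAndEvens(input);
    assert((parts.first == std::vector<int>{1, 3, -5}));
    assert((parts.second == std::vector<int>{2, 4}));
}
```

This version trades the single pass over preallocated storage for a simpler interface; whether that is preferable depends on the calling code.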

Rvalue References

In C++, an lvalue is something of which you can take an address; a named variable, for example. The name comes from the fact that they normally appear on the left-hand side of an assignment. An rvalue on the other hand is anything that is not an lvalue such as a constant value, or a temporary object or value. Typically an rvalue is on the right-hand side of an assignment operator.

An rvalue reference is a reference to an rvalue. In particular, it is a concept that is applied when the rvalue is a temporary object. The purpose of an rvalue reference is to make it possible for a particular function to be chosen when a temporary object is involved. The consequence of this is that certain operations that normally involve copying large values can be implemented by copying pointers to those values, knowing the temporary object will be destroyed.

A function can specify an rvalue reference parameter by using && as part of the parameter specification; e.g., type&& name. Normally, a temporary object will be seen as a const type&, but when there is a function overload that uses an rvalue reference, a temporary object can be resolved to that overload. The following example demonstrates this. The code first defines two incr() functions, one accepting an lvalue reference and one accepting an rvalue reference.

// Increment value using lvalue reference parameter.

void incr(int& value)

{

cout << "increment with lvalue reference" << endl;

++value;

}

// Increment value using rvalue reference parameter.

void incr(int&& value)

{

cout << "increment with rvalue reference" << endl;

++value;

}

You can call the incr() function with a named variable as argument. Because a is a named variable, the incr() function accepting an lvalue reference is called. After the call to incr(), the value of a will be 11.

int a = 10, b = 20;

incr(a); // Will call incr(int& value)

You can also call the incr() function with an expression as argument. The incr() function accepting an lvalue reference cannot be used, because the expression a + b results in a temporary, which is not an lvalue. In this case the rvalue reference version is called. Since the argument is a temporary, the incremented value is lost after the call to incr().

incr(a + b); // Will call incr(int&& value)

A literal can also be used as argument to the incr() call. This will also trigger a call to the rvalue reference version because a literal cannot be an lvalue.

incr(3); // Will call incr(int&& value)

If you remove the incr() function accepting an lvalue reference, calling incr() with a named variable like incr(b) will result in a compiler error because an rvalue reference parameter (int&& value) will never be bound to an lvalue (b). You can force the compiler to call the rvalue reference version of incr() by using std::move(), which converts an lvalue into an rvalue as follows. After the incr() call, the value of b will be 21.

incr(std::move(b)); // Will call incr(int&& value)

Rvalue references are not limited to parameters of functions. You can declare a variable of type rvalue reference, and assign to it, although this usage is uncommon. Consider the following code, which is illegal in C++:

int& i = 2; // Invalid: reference to a constant

int a = 2, b = 3;

int& j = a + b; // Invalid: reference to a temporary

Using rvalue references, the following is perfectly legal:

int&& i = 2;

int a = 2, b = 3;

int&& j = a + b;

Stand-alone rvalue references, as in the preceding example, are rarely used as such.
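The overload-selection rules from this section can be summarized in one sketch. The which() functions are hypothetical stand-ins for incr() that report which overload was chosen instead of printing. The last two checks show a related, well-known consequence: a named rvalue reference variable is itself an lvalue when used in an expression.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overloads that report which one was selected.
std::string which(int&)  { return "lvalue reference"; }
std::string which(int&&) { return "rvalue reference"; }

void overloadDemo()
{
    int a = 10, b = 20;

    assert(which(a) == "lvalue reference");             // named variable
    assert(which(a + b) == "rvalue reference");         // temporary
    assert(which(3) == "rvalue reference");             // literal
    assert(which(std::move(b)) == "rvalue reference");  // cast to rvalue

    int&& r = a + b;   // stand-alone rvalue reference
    assert(which(r) == "lvalue reference");             // r has a name
    assert(which(std::move(r)) == "rvalue reference");
}
```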

Move Semantics

Move semantics for objects requires a move constructor and a move assignment operator. The compiler uses them in places where the source object is a temporary object that will be destroyed after the copy or assignment. Both the move constructor and the move assignment operator copy/move the member variables from the source object to the new object and then reset the variables of the source object to null values. By doing this, they actually move ownership of the memory from one object to another. In essence, they do a shallow copy of the member variables and switch ownership of allocated memory to prevent dangling pointers or memory leaks.

Move semantics is implemented by using rvalue references. To add move semantics to a class, a move constructor and a move assignment operator need to be implemented. Move constructors and move assignment operators should be marked with the noexcept qualifier to tell the compiler that they don’t throw any exceptions. This is particularly important for compatibility with the standard library, as fully compliant implementations of the standard library will only move stored objects if, having move semantics implemented, they also guarantee not to throw. Following is the Spreadsheet class definition from Chapter 8, with a move constructor and move assignment operator added:

class Spreadsheet

{

public:

Spreadsheet(Spreadsheet&& src) noexcept; // Move constructor

Spreadsheet& operator=(Spreadsheet&& rhs) noexcept; // Move assignment

// Remaining code omitted for brevity

};

The implementation is as follows. Note that a helper method is introduced called freeMemory(), which deallocates the mCells array (not shown). This helper method is called from the destructor, the normal assignment operator, and the move assignment operator. Similarly, a helper method can be added to move the data from a source object to a destination object, which can then be used from the move constructor and the move assignment operator.

// Move constructor

Spreadsheet::Spreadsheet(Spreadsheet&& src) noexcept

{

// Shallow copy of data

mWidth = src.mWidth;

mHeight = src.mHeight;

mCells = src.mCells;

// Reset the source object, because ownership has been moved!

src.mWidth = 0;

src.mHeight = 0;

src.mCells = nullptr;

}

// Move assignment operator

Spreadsheet& Spreadsheet::operator=(Spreadsheet&& rhs) noexcept

{

// check for self-assignment

if (this == &rhs) {

return *this;

}

// free the old memory

freeMemory();

// Shallow copy of data

mWidth = rhs.mWidth;

mHeight = rhs.mHeight;

mCells = rhs.mCells;

// Reset the source object, because ownership has been moved!

rhs.mWidth = 0;

rhs.mHeight = 0;

rhs.mCells = nullptr;

return *this;

}

Both the move constructor and the move assignment operator are moving ownership of the memory for mCells from the source object to the new object. They reset the mCells pointer of the source object to a null pointer to prevent the destructor of the source object from deallocating that memory, because the new object is now the owner of that memory.
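Because the full Spreadsheet class is not shown here, the following self-contained sketch applies the same ownership-transfer pattern to a much simpler, hypothetical Buffer class. It is an assumption-laden illustration, not the Spreadsheet implementation: copying is disabled to keep the sketch short, and the move constructor uses a constructor initializer instead of assignments in the body.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

// Hypothetical class owning a dynamically allocated array.
class Buffer
{
public:
    explicit Buffer(std::size_t size) : mSize(size), mData(new int[size]()) {}
    ~Buffer() { delete[] mData; }   // deleting a null pointer is harmless

    // Copying is disabled in this sketch.
    Buffer(const Buffer&) = delete;
    Buffer& operator=(const Buffer&) = delete;

    // Move constructor: take over the array, then reset the source.
    Buffer(Buffer&& src) noexcept : mSize(src.mSize), mData(src.mData)
    {
        src.mSize = 0;
        src.mData = nullptr;
    }

    // Move assignment: free the old array, take over the new one.
    Buffer& operator=(Buffer&& rhs) noexcept
    {
        if (this != &rhs) {
            delete[] mData;
            mSize = rhs.mSize;
            mData = rhs.mData;
            rhs.mSize = 0;
            rhs.mData = nullptr;
        }
        return *this;
    }

    std::size_t size() const { return mSize; }
    const int* data() const { return mData; }

private:
    std::size_t mSize;
    int* mData;
};

void bufferMoveDemo()
{
    Buffer a(4);
    const int* original = a.data();

    Buffer b(std::move(a));            // move constructor
    assert(b.data() == original);      // same array, no new allocation
    assert(a.data() == nullptr && a.size() == 0);

    Buffer c(1);
    c = std::move(b);                  // move assignment
    assert(c.data() == original && c.size() == 4);
    assert(b.data() == nullptr);
}
```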

The preceding move constructor and move assignment operator can be tested with the following code:

Spreadsheet CreateObject()

{

return Spreadsheet(3, 2);

}

int main()

{

vector<Spreadsheet> vec;

for (int i = 0; i < 2; ++i) {

cout << "Iteration " << i << endl;

vec.push_back(Spreadsheet(100, 100));

cout << endl;

}

Spreadsheet s(2,3);

s = CreateObject();

Spreadsheet s2(5,6);

s2 = s;

return 0;

}

Chapter 1 introduces the vector. A vector grows dynamically in size to accommodate new objects. This is done by allocating a bigger chunk of memory and then copying or moving the objects from the old memory to the new and bigger memory. If the element type has a move constructor that is marked noexcept, the objects are moved instead of copied. Moving them avoids any deep copying, which makes it much more efficient.
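The effect of noexcept on a growing vector can be observed with the following sketch. The Tracked class template and fillAndGrow() are hypothetical names used only here. The counts in the comments reflect the behavior of the mainstream standard library implementations, which fall back to copying when a copyable element type has a move constructor that might throw.

```cpp
#include <cassert>
#include <vector>

// Hypothetical element type that counts its copies and moves.
// The template argument decides whether the move constructor is noexcept.
template <bool NoexceptMove>
struct Tracked
{
    static int copies;
    static int moves;

    Tracked() = default;
    Tracked(const Tracked&) { ++copies; }
    Tracked(Tracked&&) noexcept(NoexceptMove) { ++moves; }
};

template <bool NoexceptMove> int Tracked<NoexceptMove>::copies = 0;
template <bool NoexceptMove> int Tracked<NoexceptMove>::moves = 0;

// Stores one element, then forces the vector to reallocate.
template <bool NoexceptMove>
void fillAndGrow()
{
    std::vector<Tracked<NoexceptMove>> vec;
    vec.emplace_back();                 // constructed in place
    vec.reserve(vec.capacity() + 1);    // larger than capacity: must reallocate
}

void growthDemo()
{
    fillAndGrow<true>();    // noexcept move: the element is moved
    assert(Tracked<true>::moves == 1 && Tracked<true>::copies == 0);

    fillAndGrow<false>();   // move may throw: the element is copied
    assert(Tracked<false>::copies == 1 && Tracked<false>::moves == 0);
}
```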

When you add output statements to all constructors and assignment operators of the Spreadsheet class, the output of the preceding test program will be as follows. This output and the following discussion are based on Microsoft Visual C++ 2013. The C++ standard does not specify the initial capacity of a vector nor its growth strategy, so the output can be different on different compilers.

Iteration 0

Normal constructor (1)

Move constructor (2)

Iteration 1

Normal constructor (3)

Move constructor (4)

Move constructor (5)

Normal constructor (6)

Normal constructor (7)

Move assignment operator (8)

Normal constructor (9)

Assignment operator (10)

On the first iteration of the loop, the vector is still empty. Take the following line of code from the loop:

vec.push_back(Spreadsheet(100, 100));

With this line, a new Spreadsheet object is created invoking the normal constructor (1). The vector resizes itself to make space for the new object being pushed in. The created Spreadsheet object is then moved into the vector, invoking the move constructor (2).

On the second iteration of the loop, a second Spreadsheet object is created with the normal constructor (3). At this point, the vector can hold one element, so it’s again resized to make space for a second object. By resizing the vector, the previously added elements need to be moved from the old vector to the new and bigger vector, so this will trigger a call to the move constructor for each previously added element (4). Then, the new Spreadsheet object is moved into the vector with its move constructor (5).

Next, a Spreadsheet object s is created using the normal constructor (6). The CreateObject() function creates a temporary Spreadsheet object with its normal constructor (7), which is then returned from the function and move-assigned to the variable s (8). Because the temporary object will cease to exist after the assignment, the compiler invokes the move assignment operator instead of the normal copy assignment operator. Then, s2 is created using the normal constructor (9). The assignment s2 = s, on the other hand, invokes the copy assignment operator (10) because the right-hand side object is not a temporary object, but a named object.

If the Spreadsheet class did not include a move constructor and move assignment operator, the above output would look as follows:

Iteration 0

Normal constructor

Copy constructor

Iteration 1

Normal constructor

Copy constructor

Copy constructor

Normal constructor

Normal constructor

Assignment operator

Normal constructor

Assignment operator

As you can see, copy constructors are called instead of move constructors and copy assignment operators are called instead of move assignment operators. In the previous example, the Spreadsheet objects in the loop have 10,000 (100 x 100) elements. The implementations of the Spreadsheet move constructor and move assignment operator don’t require any memory allocation, while the copy constructor and copy assignment operator require 101 allocations. So, using move semantics can improve performance considerably in certain situations.

Move constructors and move assignment operators can also be explicitly deleted or defaulted, just like normal constructors and normal copy assignment operators, as explained in Chapter 7.
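As a small sketch of defaulting and deleting these operations, the hypothetical Label class below lets the compiler generate the move operations and explicitly deletes the copy operations. This works without hand-written code because its only data member, a std::string, already knows how to move itself.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical move-only class.
class Label
{
public:
    explicit Label(std::string text) : mText(std::move(text)) {}

    // Copy operations explicitly deleted.
    Label(const Label&) = delete;
    Label& operator=(const Label&) = delete;

    // Move operations explicitly defaulted: member-wise move.
    Label(Label&&) = default;
    Label& operator=(Label&&) = default;

    const std::string& text() const { return mText; }

private:
    std::string mText;
};

static_assert(std::is_move_constructible<Label>::value, "Label can be moved");
static_assert(!std::is_copy_constructible<Label>::value, "Label cannot be copied");

void defaultedMoveDemo()
{
    Label a("total");
    Label b(std::move(a));        // defaulted move constructor
    assert(b.text() == "total");

    Label c("draft");
    c = std::move(b);             // defaulted move assignment operator
    assert(c.text() == "total");
}
```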

As another example where move semantics increases performance, take a swap() function that swaps two objects. The following swapCopy() implementation does not use move semantics:

template <typename T>

void swapCopy(T& a, T& b)

{

T temp(a);

a = b;

b = temp;

}

This implementation first copies a to temp, then copies b to a, and then copies temp to b. If type T is expensive to copy, this swap implementation will hurt performance. With move semantics the swap() function can avoid all copying:

template <typename T>

void swapMove(T& a, T& b)

{

T temp(std::move(a));

a = std::move(b);

b = std::move(temp);

}

Obviously, move semantics is useful only when you know that the source object will be destroyed or that its current value is no longer needed.
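To make the difference between the two swap implementations measurable, the following sketch counts copy and move operations. The Counted class is a hypothetical instrumented type introduced for this illustration, and both swap functions are written as function templates so that the sketch compiles on its own.

```cpp
#include <cassert>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Counted
{
    static int copies;
    static int moves;

    Counted() = default;
    Counted(const Counted&) { ++copies; }
    Counted(Counted&&) noexcept { ++moves; }
    Counted& operator=(const Counted&) { ++copies; return *this; }
    Counted& operator=(Counted&&) noexcept { ++moves; return *this; }
};

int Counted::copies = 0;
int Counted::moves = 0;

template <typename T>
void swapCopy(T& a, T& b)
{
    T temp(a);      // copy construction
    a = b;          // copy assignment
    b = temp;       // copy assignment
}

template <typename T>
void swapMove(T& a, T& b)
{
    T temp(std::move(a));   // move construction
    a = std::move(b);       // move assignment
    b = std::move(temp);    // move assignment
}

void swapDemo()
{
    Counted a, b;

    swapCopy(a, b);
    assert(Counted::copies == 3 && Counted::moves == 0);

    Counted::copies = 0;
    swapMove(a, b);
    assert(Counted::copies == 0 && Counted::moves == 3);
}
```

Three operations take place either way; the gain comes from each move being cheap where each copy may be expensive.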

KEYWORD CONFUSION

Two keywords in C++ appear to cause more confusion than any others: const and static. Both of these keywords have several different meanings, and each of their uses presents subtleties that are important to understand.

The const Keyword

The keyword const is short for “constant” and specifies that something remains unchanged. The compiler will enforce this requirement by marking any attempt to change it as an error. Furthermore, when optimizations are enabled, the compiler can take advantage of this knowledge to produce better code. The keyword has two related roles. It can mark variables or parameters, and it can mark methods. This section provides a definitive discussion of these two meanings.

const Variables and Parameters

You can use const to “protect” variables by specifying that they cannot be modified. One important use is as a replacement for #define to define constants. This use of const is its most straightforward application. For example, you could declare the constant PI like this:

const double PI = 3.141592653589793238462;

You can mark any variable const, including global variables and class data members.

You can also use const to specify that parameters to functions or methods should remain unchanged. For example, the following function accepts a const parameter. In the body of the function, you cannot modify the param integer. If you do try to modify it, the compiler will generate an error.

void func(const int param)

{

// Not allowed to change param...

}

The following subsections discuss two special kinds of const variables or parameters in more detail: const pointers and const references.

const Pointers

When a variable contains one or more levels of indirection via a pointer, applying const becomes trickier. Consider the following lines of code:

int* ip;

ip = new int[10];

ip[4] = 5;

Suppose that you decide to apply const to ip. Set aside your doubts about the usefulness of doing so for a moment, and consider what it means. Do you want to prevent the ip variable itself from being changed, or do you want to prevent the values to which it points from being changed? That is, do you want to prevent the second line or the third line in the previous example?

In order to prevent the pointed-to values from being modified (as in the third line), you can add the keyword const to the declaration of ip like this:

const int* ip;

ip = new int[10];

ip[4] = 5; // DOES NOT COMPILE!

Now you cannot change the values to which ip points.

An alternative but semantically equivalent way to write this is as follows:

int const* ip;

ip = new int[10];

ip[4] = 5; // DOES NOT COMPILE!

Putting the const before or after the int makes no difference in its functionality.

If you want instead to mark ip itself const (not the values to which it points), you need to write this:

int* const ip = nullptr;

ip = new int[10]; // DOES NOT COMPILE!

ip[4] = 5; // Error: dereferencing a null pointer

Now that ip itself cannot be changed, the compiler requires you to initialize it when you declare it, either with nullptr as in the preceding code or with newly allocated memory as follows:

int* const ip = new int[10];

ip[4] = 5;

You can also mark both the pointer and the values to which it points const like this:

int const* const ip = nullptr;

An alternative but equivalent syntax is the following:

const int* const ip = nullptr;

Although this syntax might seem confusing, there is actually a very simple rule: the const keyword applies to whatever is directly to its left. Consider this line again:

int const* const ip = nullptr;

From left to right, the first const is directly to the right of the word int. Thus, it applies to the int to which ip points. Therefore, it specifies that you cannot change the values to which ip points. The second const is directly to the right of the *. Thus, it applies to the pointer to the int, which is the ip variable. Therefore, it specifies that you cannot change ip (the pointer) itself.

The reason this rule becomes confusing is an exception: The first const can go before the variable like this:

const int* const ip = nullptr;

This “exceptional” syntax is used much more commonly than the other syntax.

You can extend this rule to any number of levels of indirection. For example:

const int * const * const * const ip = nullptr;

NOTE Another easy-to-remember rule to figure out complicated variable declarations: read from right to left. Take for example “int* const ip.” Reading this from right to left gives us “ip is a const pointer to an int.” On the other hand, “int const* ip” will read as “ip is a pointer to a const int.”
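
As a rough illustration of these rules, the following sketch uses static_assert together with the standard <type_traits> helpers to check at compile time which part of each declaration is const. The alias names and the sum() function are hypothetical and exist only for this example.

```cpp
#include <type_traits>

// Hypothetical aliases, introduced only for this illustration.
using PtrToConstInt = const int*;             // the pointed-to int is const
using ConstPtrToInt = int* const;             // the pointer itself is const
using ConstPtrToConstInt = const int* const;  // both are const

// "const int*" and "int const*" name the same type.
static_assert(std::is_same<const int*, int const*>::value, "same type");

// is_const inspects the pointer itself; remove_pointer exposes the pointee.
static_assert(!std::is_const<PtrToConstInt>::value, "pointer can be reseated");
static_assert(std::is_const<std::remove_pointer<PtrToConstInt>::type>::value,
    "pointee is const");
static_assert(std::is_const<ConstPtrToInt>::value, "pointer is const");
static_assert(!std::is_const<std::remove_pointer<ConstPtrToInt>::type>::value,
    "pointee can be modified");
static_assert(std::is_const<ConstPtrToConstInt>::value, "pointer is const");

// Inside this function you can neither reseat values nor write through it.
int sum(const int* const values, int count)
{
    int total = 0;
    for (int i = 0; i < count; ++i) {
        total += values[i];
    }
    return total;
}
```

If any of the static_assert conditions were wrong, the file would fail to compile, which makes this a cheap way to test your reading of a declaration.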

const References

const applied to references is usually simpler than const applied to pointers for two reasons. First, references are const by default, in that you can’t change what they refer to. So, there is no need to mark them const explicitly. Second, you can’t create a reference to a reference, so there is usually only one level of indirection with references. The only way to get multiple levels of indirection is to create a reference to a pointer.

Thus, when C++ programmers refer to a “const reference,” they mean something like this:

int z;

const int& zRef = z;

zRef = 4; // DOES NOT COMPILE

By applying const to the int, you prevent assignment to zRef, as shown. Remember that const int& zRef is equivalent to int const& zRef. Note, however, that marking zRef const has no effect on z. You can still modify the value of z by changing it directly instead of through the reference.
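
The last point can be sketched as follows. The function name is hypothetical; it only wraps the z and zRef variables from the text so that the result can be inspected.

```cpp
// A const reference restricts access through the reference only;
// the underlying variable can still change.
int readThroughConstReference()
{
    int z = 1;
    const int& zRef = z;

    z = 4;          // OK: z itself is not const
    // zRef = 5;    // would not compile: zRef refers to a const int

    return zRef;    // observes the new value of z
}
```

Calling readThroughConstReference() returns 4, because zRef is just another name for z.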

const references are used most commonly as parameters, where they are quite useful. If you want to pass something by reference for efficiency, but don’t want it to be modifiable, make it a const reference. For example:

void doSomething(const BigClass& arg)

{

// Implementation here

}

WARNING Your default choice for passing objects as parameters should be const reference. You should only omit the const if you explicitly need to change the object.

const Methods

Chapter 8 explains that you can mark a class method const, which prevents the method from modifying any non-mutable data members of the class. Consult Chapter 8 for an example.
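
As a brief reminder of what that looks like, here is a minimal sketch. It is not the example from Chapter 8; the Temperature class and the readTemperature() function are hypothetical names chosen for this illustration.

```cpp
class Temperature
{
public:
    explicit Temperature(double celsius) : mCelsius(celsius), mReadCount(0) {}

    // const method: may not modify mCelsius, but may modify the
    // mutable member mReadCount.
    double getCelsius() const { ++mReadCount; return mCelsius; }

    // Non-const method: cannot be called on a const object.
    void setCelsius(double celsius) { mCelsius = celsius; }

    int getReadCount() const { return mReadCount; }

private:
    double mCelsius;
    mutable int mReadCount;
};

// Only const methods can be called through a const reference.
double readTemperature(const Temperature& t)
{
    // t.setCelsius(0.0);   // would not compile
    return t.getCelsius();
}
```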

The constexpr Keyword

C++ always had the notion of constant expressions and in some circumstances constant expressions are required. For example, when defining an array, the size of the array needs to be a constant expression. Because of this restriction, the following piece of code is not valid in C++.

const int getArraySize() { return 32; }

int main()

{

int myArray[getArraySize()]; // Invalid in C++

return 0;

}

Using the constexpr keyword, the getArraySize() function can be redefined to make it a constant expression. Constant expressions are evaluated at compile time.

constexpr int getArraySize() { return 32; }

int main()

{

int myArray[getArraySize()]; // OK

return 0;

}

You can even do something like this:

int myArray[getArraySize() + 1]; // OK

Declaring a function as constexpr imposes quite a lot of restrictions on what the function can do, because the compiler has to be able to evaluate the function at compile time and the function is not allowed to have any side effects. Here are some of the restrictions, as specified by C++11 (C++14 relaxes several of them, for example the single-return-statement rule):

· The function body shall be a single return statement that does not contain a goto statement or a try catch block, and does not throw any exceptions. It is allowed to call other constexpr functions.

· The return type of the function shall be a literal type. It cannot be void.

· If the constexpr function is a member of a class, the function cannot be virtual.

· All the function arguments shall be literal types.

· A constexpr function cannot be called until it’s defined in the translation unit because the compiler needs to know the complete definition.

· dynamic_cast is not allowed.

· new and delete are not allowed.
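
Within these restrictions you can still express non-trivial computations, because a constexpr function may call other constexpr functions, including itself. The following sketch assumes a hypothetical factorial() function written as a single return statement; the lookupTable array is likewise just an illustration.

```cpp
// A single return statement that calls itself recursively.
constexpr unsigned long long factorial(unsigned int n)
{
    return n <= 1 ? 1ULL : n * factorial(n - 1);
}

// static_assert requires a constant expression, so this line only
// compiles if factorial(5) can be evaluated at compile time.
static_assert(factorial(5) == 120, "factorial(5) should be 120");

// The result can be used wherever a constant expression is required,
// for example as an array size.
int lookupTable[factorial(3)];   // array of 6 ints
```

The same function can also be called with run-time arguments, in which case it behaves like an ordinary function.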

By defining a constexpr constructor you can create constant expression variables of user-defined types. A constexpr constructor should satisfy the following requirements.

· All the constructor arguments should be literal types.

· The constructor body cannot be a function-try-block.

· The constructor body should satisfy the same requirements as the body of a constexpr function.

· All data members should be initialized with constant expressions.

For example, the following Rect class defines a constexpr constructor satisfying the previous requirements and also defines a constexpr getArea() method that is performing some calculation.

class Rect

{

public:

constexpr Rect(int width, int height)

: mWidth(width), mHeight(height) {}

constexpr int getArea() const { return mWidth * mHeight; }

private:

int mWidth, mHeight;

};

Using this class to declare a constexpr object is straightforward:

constexpr Rect r(8, 2);

int myArray[r.getArea()]; // OK

The static Keyword

There are several uses of the keyword static in C++, all seemingly unrelated. Part of the motivation for “overloading” the keyword was attempting to avoid having to introduce new keywords into the language.

static Data Members and Methods

You can declare static data members and methods of classes. static data members, unlike non-static data members, are not part of each object. Instead, there is only one copy of the data member, which exists outside any objects of that class.

static methods are similarly at the class level instead of the object level. A static method does not execute in the context of a specific object.

Chapter 8 provides examples of both static data members and methods.

static Linkage

Before covering the use of the static keyword for linkage, you need to understand the concept of linkage in C++. C++ source files are each compiled independently, and the resulting object files are linked together. Each name in a C++ source file, including functions and global variables, has a linkage that is either internal or external. External linkage means that the name is available from other source files. Internal linkage (also called static linkage) means that it is not. By default, functions and global variables have external linkage. However, you can specify internal (or static) linkage by prefixing the declaration with the keyword static. For example, suppose you have two source files: FirstFile.cpp and AnotherFile.cpp. Here is FirstFile.cpp:

void f();

int main()

{

f();

return 0;

}

Note that this file provides a prototype for f(), but doesn’t show the definition. Here is AnotherFile.cpp:

#include <iostream>

using namespace std;

void f();

void f()

{

cout << "f\n";

}

This file provides both a prototype and a definition for f(). Note that it is legal to write prototypes for the same function in two different files. That’s precisely what the preprocessor does for you if you put the prototype in a header file that you #include in each of the source files. The reason to use header files is that it’s easier to maintain (and keep synchronized) one copy of the prototype. However, for this example I don’t use a header file.

Each of these source files compiles without error, and the program links fine: because f() has external linkage, main() can call it from a different file.

However, suppose you apply static to the f() prototype in AnotherFile.cpp. Note that you don’t need to repeat the static keyword in front of the definition of f(). As long as it precedes the first instance of the function name, there is no need to repeat it:

#include <iostream>

using namespace std;

static void f();

void f()

{

cout << "f\n";

}

Now each of the source files compiles without error, but the linker step fails because f() has internal (static) linkage, making it unavailable from FirstFile.cpp. Some compilers issue a warning when static functions are defined but not used in that source file (implying that they shouldn’t be static, because they’re probably used elsewhere).

An alternative to using static for internal linkage is to employ anonymous namespaces. Instead of marking a variable or function static, wrap it in an unnamed namespace like this:

#include <iostream>

using namespace std;

namespace {

void f();

void f()

{

cout << "f\n";

}

}

Entities in an anonymous namespace can be accessed anywhere following their declaration in the same source file, but cannot be accessed from other source files. These semantics are the same as those obtained with the static keyword.

The extern Keyword

A related keyword, extern, seems like it should be the opposite of static, specifying external linkage for the names it precedes. It can be used that way in certain cases. For example, const variables at namespace scope have internal linkage by default. You can use extern to give them external linkage. However, extern has some complications. When you specify a name as extern, the compiler treats it as a declaration, not a definition. For variables, this means the compiler doesn’t allocate space for the variable. You must provide a separate definition line for the variable without the extern keyword. For example:

extern int x;

int x = 3;

Alternatively, you can initialize x in the extern line, which then serves as the declaration and definition:

extern int x = 3;

The extern in this case is not very useful, because x has external linkage by default anyway. The real use of extern is when you want to use x from another source file, FirstFile.cpp:

#include <iostream>

using namespace std;

extern int x;

int main()

{

cout << x << endl;

}

Here FirstFile.cpp uses an extern declaration so that it can use x. The compiler needs a declaration of x in order to use it in main(). If you declared x without the extern keyword, the compiler would think it’s a definition and would allocate space for x, causing the linkage step to fail (because there are now two x variables in the global scope). With extern, you can make variables globally accessible from multiple source files.

However, it is not recommended to use global variables at all. They are confusing and error-prone, especially in large programs. For similar functionality, you could use static class data members and methods.

static Variables in Functions

The final use of the static keyword in C++ is to create local variables that retain their values between exits and entrances to their scope. A static variable inside a function is like a global variable that is only accessible from that function. One common use of static variables is to “remember” whether a particular initialization has been performed for a certain function. For example, code that employs this technique might look something like this:

void performTask()

{

static bool initialized = false;

if (!initialized) {

cout << "initializing\n";

// Perform initialization.

initialized = true;

}

// Perform the desired task.

}

However, static variables are confusing, and there are usually better ways to structure your code so that you can avoid them. In this case, you might want to write a class in which the constructor performs the required initialization.

NOTE Avoid using stand-alone static variables. Maintain state within an object instead.

NOTE The implementation of performTask() is not thread-safe; it contains a race condition. In a multithreaded environment, you need to use atomics or other mechanisms for synchronization of multiple threads. Multithreading is discussed in detail in Chapter 23.
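
One possible way to remove that particular race, sketched here under the assumption of a C++11 or later compiler, is to let the initialization of the static variable itself do the work. Since C++11 the language guarantees that a function-local static is initialized exactly once, even when several threads enter the function at the same time. The gInitializationCount counter is hypothetical and only makes the effect observable; the rest of the task would still need its own synchronization.

```cpp
// Hypothetical counter, used only to observe how often initialization runs.
int gInitializationCount = 0;

void performTask()
{
    // The lambda runs the first time control passes through this
    // declaration; later calls skip it.
    static const bool initialized = [] {
        ++gInitializationCount;   // Perform initialization.
        return true;
    }();
    (void)initialized;            // silence "unused variable" warnings

    // Perform the desired task.
}
```

No matter how often performTask() is called, the initialization code runs a single time.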

Order of Initialization of Nonlocal Variables

Before leaving the topic of static data members and global variables, consider the order of initialization of these variables. All global variables and static class data members in a program are initialized before main() begins. The variables in a given source file are initialized in the order they appear in the source file. For example, in the following file Demo::x is guaranteed to be initialized before y:

class Demo

{

public:

static int x;

};

int Demo::x = 3;

int y = 4;

However, C++ provides no specifications or guarantees about the initialization ordering of nonlocal variables in different source files. If you have a global variable x in one source file and a global variable y in another, you have no way of knowing which will be initialized first. Normally, this lack of specification isn’t cause for concern. However, it can be problematic if one global or static variable depends on another. Recall that initialization of objects implies running their constructors. The constructor of one global object might access another global object, assuming that it is already constructed. If these two global objects are declared in two different source files, you cannot count on one being constructed before the other, and you cannot control the order of initialization. This order may not be the same for different compilers or even different versions of the same compiler, and the order might even change by simply adding another file to your project.

WARNING Initialization order of nonlocal variables in different source files is undefined.
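
A common workaround, often called “construct on first use,” is to replace the nonlocal variable with a function that returns a reference to a static local variable. The sketch below is only an illustration: applicationName(), Registry, and registry() are hypothetical names, and the technique does not address the order of destruction.

```cpp
#include <string>

// Hypothetical accessor that replaces a global std::string.
const std::string& applicationName()
{
    static const std::string name = "demo";   // constructed on first call
    return name;
}

// Hypothetical class whose constructor depends on the other "global".
class Registry
{
public:
    Registry() : mLabel(applicationName() + "-registry") {}
    const std::string& getLabel() const { return mLabel; }

private:
    std::string mLabel;
};

// Hypothetical accessor that replaces a global Registry object.
Registry& registry()
{
    static Registry instance;   // constructed on first call
    return instance;
}
```

Because applicationName() constructs its string the first time it is called, the Registry constructor can rely on it regardless of which source file each function lives in.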

Order of Destruction of Nonlocal Variables

Nonlocal variables are destroyed in the reverse order of their initialization. Nonlocal variables in different source files are initialized in an undefined order, which means that the order of destruction is also undefined.

TYPES AND CASTS

The basic types in C++ are reviewed in Chapter 1, while Chapter 7 shows you how to write your own types with classes. This section explores some of the trickier aspects of types: typedefs, typedefs for function pointers, type aliases, and casts.

typedefs

A typedef provides a new name for an existing type declaration. You can think of a typedef as syntax for introducing a synonym for an existing type declaration without creating a new type. The following gives the new name IntPtr to the int* type declaration:

typedef int* IntPtr;

You can use the new type name and the definition it aliases interchangeably. For example, the following two lines are valid:

int* p1;

IntPtr p2;

Variables created with the new type name are completely compatible with those created with the original type declaration. So it is perfectly valid, given the above definitions, to write the following, because they are not just “compatible” types, they are the same type:

p1 = p2;

p2 = p1;

The most common use of typedefs is to provide manageable names when the real type declarations become too unwieldy. This situation commonly arises with templates. For example, Chapter 1 introduces the std::vector from the STL. To declare a vector of strings, you need to declare it as std::vector<std::string>. It’s a templated class, and thus requires you to specify the template parameters anytime you want to refer to the type of this vector. Templates are discussed in detail in Chapter 11. For declaring variables, specifying function parameters, and so on, you would have to write std::vector<std::string>:

void processVector(const std::vector<std::string>& vec) { /* omitted */ }

int main()

{

std::vector<std::string> myVector;

return 0;

}

With a typedef, you can create a shorter, more meaningful name:

typedef std::vector<std::string> StringVector;

void processVector(const StringVector& vec) { /* omitted */ }

int main()

{

StringVector myVector;

return 0;

}

typedefs can include the scope qualifiers. The preceding example shows this by including the scope std for StringVector.

The STL uses typedefs extensively to provide shorter names for types. For example, string is actually a typedef that looks like this:

typedef basic_string<char, char_traits<char>, allocator<char>> string;

typedefs for Function Pointers

The most convoluted use of typedefs is when defining function pointers. While function pointers in C++ are uncommon (being replaced by virtual methods), you still need them in certain cases. Perhaps the most common example of this is when obtaining a pointer to a function in a dynamic link library. The following example obtains a pointer to a function in a Microsoft Windows Dynamic Link Library (DLL). Details of Windows DLLs are outside the scope of this book on platform-independent C++, but the topic is so important to Windows programmers that it is worth discussing, and it is a good example to explain the details of function pointers in general.

Consider a DLL that has a function called MyFunc(). You would like to load this library only if you need to call MyFunc(). This run-time loading of the library is done with the Windows LoadLibrary() kernel call:

HMODULE lib = ::LoadLibrary(_T("library name"));

The result of this call is what is called a “library handle” and will be NULL if there is an error. Before you can load the function from the library, you need to know the prototype for the function. Suppose the following is the prototype for MyFunc(), which returns an integer and accepts three parameters: a Boolean, an integer, and a C-style string.

int __stdcall MyFunc(bool b, int n, const char* p);

The __stdcall is a Microsoft-specific directive to specify how parameters are passed to the function and how they are cleaned up.

You can now use a typedef to define a short name (MyFuncProc) for a pointer to a function with the preceding prototype.

typedef int (__stdcall *MyFuncProc)(bool b, int n, const char* p);

Note that the typedef name MyFuncProc is embedded in the middle of the syntax. It is clear from this example that these kinds of typedefs are rather convoluted. The next section shows you a cleaner solution to this problem called type aliases.

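Having successfully loaded the library and defined a short name for the function pointer, you can get a pointer to the function in the library as follows. GetProcAddress() returns a generic function pointer (FARPROC), so an explicit cast to MyFuncProc is required:

MyFuncProc MyProc = reinterpret_cast<MyFuncProc>(::GetProcAddress(lib, "MyFunc"));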

If this fails, MyProc will be NULL. If it succeeds, you can call the loaded function:

MyProc(true, 3, "Hello world");

A C programmer might think that you need to dereference the function pointer before calling it as follows:

(*MyProc)(true, 3, "Hello world");

This was true decades ago, but now, every C and C++ compiler is smart enough to know how to automatically dereference the function pointer before calling it.

Type Aliases

Type aliases are easier to understand than typedefs in certain situations. For example, take the following typedef:

typedef int MyInt;

This can be written using a type alias as follows:

using MyInt = int;

This type alias feature is especially useful in cases where the typedef becomes complicated, which is the case with typedefs for function pointers as seen in the previous section. For example, the following typedef defines a type for a pointer to a function, which returns an integer and accepts a char and a double as parameters:

typedef int (*FuncType)(char, double);

This typedef is a bit convoluted because the name FuncType is somewhere in the middle of it. Using a type alias, this can be written as follows:

using FuncType = int (*)(char, double);
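
The following sketch, which assumes hypothetical names throughout, checks that the typedef and the type alias really denote the same type and then uses the alias as a parameter type.

```cpp
#include <type_traits>

// The same pointer-to-function type, declared both ways.
typedef int (*FuncTypeTypedef)(char, double);
using FuncTypeAlias = int (*)(char, double);

static_assert(std::is_same<FuncTypeTypedef, FuncTypeAlias>::value,
    "typedef and type alias denote the same type");

// Hypothetical function matching the signature.
int scale(char c, double factor)
{
    return static_cast<int>(c * factor);
}

// The function pointer can be called directly, without dereferencing it.
int apply(FuncTypeAlias func, char c, double d)
{
    return func(c, d);
}
```

A call such as apply(scale, 'x', 2.0) passes the address of scale() implicitly; writing &scale is allowed but not required.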

By reading through this section, you might think that type aliases are nothing more than easier-to-read typedefs, but there is more. The problem with typedefs becomes apparent when you want to use them with templates, but that is covered in Chapter 11 because it requires more details about templates.

Casts

C++ provides four specific casts: static_cast, dynamic_cast, const_cast, and reinterpret_cast.

The old C-style casts with () still work in C++, and are still used extensively in existing code bases. C-style casts cover all four C++ casts, so they are more error-prone because it’s not always obvious what you are trying to achieve, and you might end up with unexpected results. I strongly recommend you only use the C++ style casts in new code because they are safer and stand out better syntactically in your code.

This section describes the purposes of each C++ cast and specifies when you would use each of them.

const_cast

The const_cast is the most straightforward. You can use it to cast away const-ness of a variable. It is the only cast of the four that is allowed to cast away const-ness. Theoretically, of course, there should be no need for a const cast. If a variable is const, it should stay const. In practice, however, you sometimes find yourself in a situation where a function is specified to take a const variable, which it must then pass to a function that takes a non-const variable. The “correct” solution would be to make const consistent in the program, but that is not always an option, especially if you are using third-party libraries. Thus, you sometimes need to cast away the const-ness of a variable, but you should only do this when you are sure the function you are calling will not modify the object. Otherwise, there is no other option than to restructure your program. Here is an example:

extern void ThirdPartyLibraryMethod(char* str);

void f(const char* str)

{

ThirdPartyLibraryMethod(const_cast<char*>(str));

}
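
To make the pattern concrete, here is a self-contained variation. The legacyLength() function is a hypothetical stand-in for a third-party function that takes a non-const pointer but is known not to modify the string; safeLength() is the const-correct wrapper.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for a third-party function with a
// non-const parameter. It only reads the string.
std::size_t legacyLength(char* str)
{
    return std::strlen(str);
}

// const-correct wrapper: casting away const is acceptable here only
// because legacyLength() is known not to write through the pointer.
std::size_t safeLength(const char* str)
{
    return legacyLength(const_cast<char*>(str));
}
```

If the called function did write through the pointer, passing it a string literal or any other object defined as const would result in undefined behavior.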

static_cast

You can use the static_cast to perform explicit conversions that are supported directly by the language. For example, if you write an arithmetic expression in which you need to convert an int to a double in order to avoid integer division, use a static_cast. In this example, it’s enough to only static_cast i, because that makes one of the two operands a double, making sure C++ performs floating point division.

int i = 3;

int j = 4;

double result = static_cast<double>(i) / j;
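
The difference is easy to see side by side. The two function names below are hypothetical and simply wrap the two variants of the division.

```cpp
// Integer division: the fractional part is discarded.
int integerRatio(int i, int j)
{
    return i / j;
}

// Casting one operand to double forces floating point division.
double floatingRatio(int i, int j)
{
    return static_cast<double>(i) / j;
}
```

With the values from the text, integerRatio(3, 4) yields 0 while floatingRatio(3, 4) yields 0.75.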

You can also use static_cast to perform explicit conversions that are allowed because of user-defined constructors or conversion routines. For example, if class A has a constructor that takes an object of class B, you can convert a B object to an A object with a static_cast. In most situations where you want this behavior, however, the compiler will perform the conversion automatically.

Another use for the static_cast is to perform downcasts in an inheritance hierarchy. For example:

class Base

{

public:

Base() {}

virtual ~Base() {}

};

class Derived : public Base

{

public:

Derived() {}

virtual ~Derived() {}

};

int main()

{

Base* b;

Derived* d = new Derived();

b = d; // Don't need a cast to go up the inheritance hierarchy

d = static_cast<Derived*>(b); // Need a cast to go down the hierarchy

Base base;

Derived derived;

Base& br = derived;

Derived& dr = static_cast<Derived&>(br);

return 0;

}

These casts work with both pointers and references. They do not work with objects themselves.

Note that these casts with static_cast do not perform run-time type checking. They allow you to convert any Base pointer to a Derived pointer or Base reference to a Derived reference, even if the Base really isn’t a Derived at run time. For example, the following code will compile and execute, but using the pointer d can result in potentially catastrophic failure, including memory overwrites outside the bounds of the object.

Base* b = new Base();

Derived* d = static_cast<Derived*>(b);

To perform the cast safely with run-time type checking, use the dynamic_cast explained in a following section.

static_casts are not all-powerful. You can’t static_cast pointers of one type to pointers of another unrelated type. You can’t static_cast objects of one type directly to objects of another type unless a suitable constructor or conversion routine exists. You can’t static_cast a const type to a non-const type. You can’t static_cast pointers to ints. Basically, you can’t do anything that doesn’t make sense according to the type rules of C++.

reinterpret_cast

The reinterpret_cast is a bit more powerful, and concomitantly less safe, than the static_cast. You can use it to perform some casts that are not technically allowed by C++ type rules, but which might make sense to the programmer in some circumstances. For example, you can cast a reference to one type to a reference to another type, even if the types are unrelated. Similarly, you can cast a pointer type to any other pointer type, even if they are unrelated by an inheritance hierarchy. This is commonly used to cast a pointer to a void* and back. A void* pointer is just a pointer to some location in memory. No type information is associated with a void* pointer. Here are some examples:

class X {};

class Y {};

int main()

{

X x;

Y y;

X* xp = &x;

Y* yp = &y;

// Need reinterpret cast for pointer conversion from unrelated classes

// static_cast doesn't work.

xp = reinterpret_cast<X*>(yp);

// No cast required for conversion from pointer to void*

void* p = xp;

// Need a cast for pointer conversion from void* (static_cast also works)

xp = reinterpret_cast<X*>(p);

// Need reinterpret cast for reference conversion from unrelated classes

// static_cast doesn't work.

X& xr = x;

Y& yr = reinterpret_cast<Y&>(x);

return 0;

}

You should be very careful with reinterpret_cast because it allows you to do conversions without performing any type checking.

WARNING In theory, you could also use reinterpret_cast to cast pointers to ints and ints to pointers, but this is considered erroneous programming, because on many platforms (especially 64-bit platforms) pointers and ints are of different sizes. For example, on a 64-bit platform, pointers are 64 bit, but integers could be 32 bit. Casting a 64-bit pointer to a 32-bit integer will result in losing 32 critical bits!
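
If you really do need to store a pointer in an integer, one option is the std::uintptr_t type from <cstdint>. It is an optional type in the standard, but where it exists it is an unsigned integer wide enough to hold a pointer. The survivesRoundTrip() function below is a hypothetical helper used only to demonstrate this.

```cpp
#include <cstdint>

// std::uintptr_t is at least as wide as an object pointer
// on platforms that provide it.
static_assert(sizeof(std::uintptr_t) >= sizeof(void*),
    "uintptr_t can hold a pointer");

// Converts a pointer to an integer and back again.
bool survivesRoundTrip(int* p)
{
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);
    int* restored = reinterpret_cast<int*>(bits);
    return restored == p;
}
```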

dynamic_cast

The dynamic_cast provides a run-time check on casts within an inheritance hierarchy. You can use it to cast pointers or references. dynamic_cast checks the run-time type information of the underlying object at run time. If the cast doesn’t make sense, dynamic_cast returns a null pointer (for the pointer version) or throws an std::bad_cast exception (for the reference version).

Note that the run-time type information is stored in the vtable of the object. Therefore, in order to use dynamic_cast, your classes must have at least one virtual method. If your classes don’t have a vtable, trying to use dynamic_cast will result in a compiler error, which can be a bit obscure. Microsoft VC++ for example will give the error:

error C2683: 'dynamic_cast' : 'MyClass' is not a polymorphic type.

Suppose you have the following class hierarchy:

class Base

{

public:

Base() {}

virtual ~Base() {}

};

class Derived : public Base

{

public:

Derived() {}

virtual ~Derived() {}

};

The following example shows a correct use of dynamic_cast.

Base* b;

Derived* d = new Derived();

b = d;

d = dynamic_cast<Derived*>(b);

The following dynamic_cast on a reference will cause an exception to be thrown.

Base base;

Derived derived;

Base& br = base;

try {

Derived& dr = dynamic_cast<Derived&>(br);

} catch (const bad_cast&) {

cout << "Bad cast!\n";

}

Note that you can perform the same casts down the inheritance hierarchy with a static_cast or reinterpret_cast. The difference with dynamic_cast is that it performs run-time (dynamic) type checking, while static_cast and reinterpret_cast will perform the casting even if they are erroneous.
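
The pointer version of a failing dynamic_cast is not shown above, so here is a minimal sketch of how both outcomes might be handled. The Base and Derived classes are redefined to keep the example self-contained, and getValue(), valueOrDefault(), and referenceCastFails() are hypothetical names.

```cpp
#include <typeinfo>   // std::bad_cast

class Base
{
public:
    virtual ~Base() {}
};

class Derived : public Base
{
public:
    int getValue() const { return 42; }
};

// Pointer version: a failed cast yields nullptr, so test the result.
int valueOrDefault(Base* b)
{
    Derived* d = dynamic_cast<Derived*>(b);
    return d ? d->getValue() : -1;
}

// Reference version: a failed cast throws std::bad_cast.
bool referenceCastFails(Base& b)
{
    try {
        Derived& d = dynamic_cast<Derived&>(b);
        (void)d;
        return false;
    } catch (const std::bad_cast&) {
        return true;
    }
}
```

Always check the result of the pointer version before dereferencing it; that check is the whole point of using dynamic_cast instead of static_cast.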

Summary of Casts

The following table summarizes the casts you should use for different situations.

SITUATION | CAST

Remove const-ness | const_cast

Explicit cast supported by language (e.g., int to double, int to bool) | static_cast

Explicit cast supported by user-defined constructors or conversions | static_cast

Object of one class to object of another (unrelated) class | Can’t be done

Pointer-to-object of one class to pointer-to-object of another class in the same inheritance hierarchy | dynamic_cast recommended, or static_cast

Reference-to-object of one class to reference-to-object of another class in the same inheritance hierarchy | dynamic_cast recommended, or static_cast

Pointer-to-type to unrelated pointer-to-type | reinterpret_cast

Reference-to-type to unrelated reference-to-type | reinterpret_cast

Pointer-to-function to pointer-to-function | reinterpret_cast

SCOPE RESOLUTION

As a C++ programmer, you need to familiarize yourself with the concept of scope. Every name in your program, including variable, function, and class names, is in a certain scope. You create scopes with namespaces, function definitions, blocks delimited by curly braces, and class definitions. When you try to access a variable, function, or class, the name is first looked up in the nearest enclosing scope, then the next scope, and so forth, up to the global scope. Any name not in a namespace, function, block delimited by curly braces, or class is assumed to be in the global scope. If it is not found in the global scope, at that point the compiler generates an undefined symbol error.

Sometimes names in scopes hide identical names in other scopes. Other times, the scope you want is not part of the default scope resolution from that particular line in the program. If you don’t want the default scope resolution for a name, you can qualify the name with a specific scope using the scope resolution operator ::. For example, to access a static method of a class, one way is to prefix the method name with the name of the class (its scope) and the scope resolution operator. A second way is to access the static method through an object of that class. The following example demonstrates these options. The example defines a class Demo with a static get() method, a get() function that is globally scoped, and a get() function that is in the NS namespace.

class Demo

{

public:

static int get() { return 5; }

};

int get() { return 10; }

namespace NS

{

int get() { return 20; }

}

The global scope is unnamed, but you can access it specifically by using the scope resolution operator by itself (with no name prefix). The different get() functions can be called as follows. In this example, the code itself is in the main() function, which is always in the global scope:

int main()

{

auto pd = std::make_unique<Demo>();

Demo d;

std::cout << pd->get() << std::endl; // prints 5

std::cout << d.get() << std::endl; // prints 5

std::cout << NS::get() << std::endl; // prints 20

std::cout << Demo::get() << std::endl; // prints 5

std::cout << ::get() << std::endl; // prints 10

std::cout << get() << std::endl; // prints 10

return 0;

}

Note that if the namespace called NS is given as an unnamed namespace, then the following line will give an error about ambiguous name resolution, because you would have a get() defined in the global scope and another get() defined in the unnamed namespace.

std::cout << get() << std::endl;

The same error occurs if you add the following using clause right before the main() function:

using namespace NS;
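
Name hiding across nested scopes can be sketched with variables as well. In the following example, which uses hypothetical names, the same identifier is declared in the global scope, in a namespace, and in a function; each unqualified use finds the declaration in the nearest enclosing scope, and the scope resolution operator reaches the hidden ones.

```cpp
int counter = 1;               // global scope

namespace NS
{
    int counter = 2;           // hides ::counter inside NS

    int fromFunction()
    {
        int counter = 3;       // hides NS::counter inside this function
        return counter;        // nearest enclosing scope: the local variable
    }

    int fromNamespace()
    {
        return counter;        // no local variable, so NS::counter is found
    }

    int fromGlobal()
    {
        return ::counter;      // explicitly asks for the global scope
    }
}
```

Here NS::fromFunction() returns 3, NS::fromNamespace() returns 2, and NS::fromGlobal() returns 1.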

C++11 / C++14

The C++11 and C++14 standards add a lot of new functionality to C++. This section describes new features of which detailed descriptions do not immediately fit elsewhere in this book.

Uniform Initialization

Before C++11, initialization of types was not always uniform. For example, take the following definition of a circle, once as a structure, once as a class:

struct CircleStruct

{

int x, y;

double radius;

};

class CircleClass

{

public:

CircleClass(int x, int y, double radius)

: mX(x), mY(y), mRadius(radius) {}

private:

int mX, mY;

double mRadius;

};

In pre-C++11, initialization of a variable of type CircleStruct and a variable of type CircleClass looks different:

CircleStruct myCircle1 = {10, 10, 2.5};

CircleClass myCircle2(10, 10, 2.5);

For the structure version you can use the {...} syntax. However, for the class version you need to call the constructor using function notation (...).

Since C++11, you can more uniformly use the {...} syntax to initialize types, as follows:

CircleStruct myCircle3 = {10, 10, 2.5};

CircleClass myCircle4 = {10, 10, 2.5};

The definition of myCircle4 automatically calls the constructor of CircleClass. The equal sign is optional, so the following definitions have the same effect:

CircleStruct myCircle5{10, 10, 2.5};

CircleClass myCircle6{10, 10, 2.5};

Uniform initialization is not limited to structures and classes. You can use it to initialize anything in C++. For example, the following code initializes all four variables with the value 3:

int a = 3;

int b(3);

int c = {3}; // Uniform initialization

int d{3}; // Uniform initialization

Uniform initialization can be used to perform zero-initialization of variables; you just specify an empty set of curly braces. For example:

int e{}; // Uniform initialization, e will be 0
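The same empty braces work for other kinds of types as well. The following is a minimal sketch; the Point structure and the emptyBracesDemo() function are hypothetical names used only for this illustration:

```cpp
#include <cassert>
#include <string>

// Hypothetical aggregate used only for this illustration.
struct Point
{
    int x;
    int y;
};

void emptyBracesDemo()
{
    int count{};         // 0
    double ratio{};      // 0.0
    int* ptr{};          // nullptr
    Point origin{};      // both members are 0
    std::string name{};  // default-constructed, empty string

    assert(count == 0);
    assert(ratio == 0.0);
    assert(ptr == nullptr);
    assert(origin.x == 0 && origin.y == 0);
    assert(name.empty());
}
```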

Uniform initialization also prevents narrowing. With the older initialization syntax, C++ performs narrowing conversions implicitly; for example:

void func(int i) { /* ... */ }

int main()

{

int x = 3.14;

func(3.14);

return 0;

}

C++ automatically truncates 3.14 to 3 in both cases, before initializing x and before calling func(). Some compilers issue a warning about this narrowing, and some don’t. However, you can use uniform initialization as follows:

void func(int i) { /* ... */ }

int main()

{

int x = {3.14}; // Error or warning because narrowing

func({3.14}); // Error or warning because narrowing

return 0;

}

Now both the initialization of x and the call to func() must generate a compiler error or warning if your compiler fully conforms to the C++11 standard.
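When the truncation is intentional, you can still use braces by making the conversion explicit with a cast. The following sketch assumes a hypothetical takeInt() function standing in for func():

```cpp
#include <cassert>

// Hypothetical stand-in for func(): returns its argument so it can be checked.
int takeInt(int i) { return i; }

void explicitNarrowingDemo()
{
    double pi = 3.14;

    // int bad{pi};                       // diagnosed as a narrowing conversion
    int truncated{static_cast<int>(pi)};  // accepted: the cast states the intent

    assert(truncated == 3);
    assert(takeInt(static_cast<int>(pi)) == 3);
}
```

The cast does not change the result of the conversion; it only documents that the loss of the fractional part is deliberate.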

Uniform initialization can also be used on STL containers, which are discussed in depth in Chapter 16. For example, initializing a vector of strings used to require code as follows:

std::vector<std::string> myVec;

myVec.push_back("String 1");

myVec.push_back("String 2");

myVec.push_back("String 3");

With uniform initialization, this can be rewritten:

std::vector<std::string> myVec = {"String 1", "String 2", "String 3"};
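One quirk to keep in mind with containers is that braces and parentheses can select different constructors. The following sketch, which is an illustration and not part of this chapter's code download, contrasts the two forms for a vector of integers:

```cpp
#include <cassert>
#include <vector>

void bracesVersusParenthesesDemo()
{
    std::vector<int> withParens(3, 7); // three elements, each with value 7
    std::vector<int> withBraces{3, 7}; // two elements: 3 and 7

    assert(withParens.size() == 3);
    assert(withParens[0] == 7 && withParens[2] == 7);
    assert(withBraces.size() == 2);
    assert(withBraces[0] == 3 && withBraces[1] == 7);
}
```

With braces, the values are treated as the elements of the container; with parentheses, they are treated as ordinary constructor arguments (here a count and a value).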

Uniform initialization can also be used to initialize dynamically allocated arrays. For example:

int* pArray = new int[4]{0, 1, 2, 3};

And last but not least, it can also be used in the constructor initializer to initialize arrays that are members of a class.

class MyClass

{

public:

MyClass() : mArray{0, 1, 2, 3} {}

private:

int mArray[4];

};

Initializer Lists

Initializer lists are defined in the <initializer_list> header file. They make it easy to write functions that can accept a variable number of arguments. The difference from the variable-length argument lists described later in this chapter is that all the elements in an initializer list must have the same type. The following example shows how to use an initializer list:

#include <initializer_list>

using namespace std;

int makeSum(initializer_list<int> lst)

{

int total = 0;

for (const auto& value : lst) {

total += value;

}

return total;

}

The function makeSum() accepts an initializer list of integers as argument. The body of the function uses a range-based for loop to accumulate the total sum. This function can be used as follows:

int a = makeSum({1,2,3});

int b = makeSum({10,20,30,40,50,60});

Initializer lists are type-safe and define which type is allowed to be in the list. For the above makeSum() function, you could try to call it with a double value as follows:

int c = makeSum({1,2,3.0});

The last element is a double, which will result in a compiler error or warning.
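Initializer lists are not limited to free functions; a constructor can accept one as well. The following is a minimal sketch using a hypothetical IntBag class that stores its values in a vector:

```cpp
#include <cassert>
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical class used only for this illustration.
class IntBag
{
public:
    IntBag(std::initializer_list<int> values) : mValues(values) {}

    std::size_t size() const { return mValues.size(); }

    int sum() const
    {
        int total = 0;
        for (int value : mValues) {
            total += value;
        }
        return total;
    }

private:
    std::vector<int> mValues;
};

void intBagDemo()
{
    IntBag bag = {1, 2, 3, 4};
    assert(bag.size() == 4);
    assert(bag.sum() == 10);
}
```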

Explicit Conversion Operators

Chapter 8 discusses the implicit conversion that can happen with single-argument constructors and how to prevent the compiler from using those implicit conversions with the explicit keyword. The C++ compiler will also perform implicit conversion with custom-written conversion operators.

Since C++11, you can apply the explicit keyword not only to constructors but also to conversion operators.

To explain explicit conversion operators, you need to understand implicit conversion first. Take the following example. It defines a class IntWrapper that just wraps an integer and implements an int() conversion operator, which the compiler can use to perform implicit conversion from an IntWrapper to type int.

class IntWrapper

{

public:

IntWrapper(int i) : mInt(i) {}

operator int() const { return mInt; }

private:

int mInt;

};

The following code demonstrates this implicit conversion; iC1 will contain the value 123:

IntWrapper c(123);

int iC1 = c;

If you want, you can still explicitly tell the compiler to call the int() conversion operator as follows. iC2 will also contain the value 123.

int iC2 = static_cast<int>(c);

Since C++11 you can use the explicit keyword to prevent the compiler from performing the implicit conversion. Below is the new class definition:

class IntWrapper

{

public:

IntWrapper(int i) : mInt(i) {}

explicit operator int() const { return mInt; }

private:

int mInt;

};

Trying to compile the following lines of code with this new class definition will result in a compiler error because the int() conversion operator is marked as explicit, so the compiler cannot use it anymore to perform implicit conversions.

IntWrapper c(123);

int iC1 = c; // Error, because of explicit int() operator

Once you have an explicit conversion operator, you have to explicitly call it if you want to use it. For example:

int iC2 = static_cast<int>(c);
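A common application is an explicit conversion to bool. Such an operator can still be used directly in a condition, because a condition asks for the conversion explicitly, but it cannot be used to silently turn the object into a number. The following sketch uses a hypothetical Handle class and describe() function:

```cpp
#include <cassert>
#include <string>

// Hypothetical wrapper around an identifier; negative values mean "invalid".
class Handle
{
public:
    explicit Handle(int id) : mId(id) {}

    // Usable in conditions, but not as a silent conversion to bool or int.
    explicit operator bool() const { return mId >= 0; }

private:
    int mId;
};

std::string describe(const Handle& h)
{
    if (h) { // allowed even though the operator is explicit
        return "valid";
    }
    return "invalid";
}

void handleDemo()
{
    assert(describe(Handle(3)) == "valid");
    assert(describe(Handle(-1)) == "invalid");
    // int n = Handle(3); // Error, because of explicit bool() operator
}
```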

Attributes

Attributes are a mechanism to add optional and/or vendor-specific information into source code. Before C++11, the vendor decided how to specify that information. Examples are __attribute__, __declspec, and so on. Since C++11, there is support for attributes by using the double square brackets syntax [[attribute]].

The C++11 standard defines only two standard attributes: [[noreturn]] and [[carries_dependency]]. C++14 adds the [[deprecated]] attribute.

[[noreturn]] means that a function never returns control to the call site. Typically, the function either causes some kind of termination (process termination or thread termination) or throws an exception. For example:

[[noreturn]] void func()

{

throw 1;

}

The second attribute, [[carries_dependency]], is a rather exotic attribute and is not discussed further.

[[deprecated]] can be used to mark something as deprecated, which means you can still use it, but its use is discouraged. This attribute accepts an optional argument that can be used to explain the reason for the deprecation, for example [[deprecated("Unsafe method, please use xyz")]].
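The following sketch shows both attributes applied to functions. The names fail(), checkedDivide(), and uncheckedDivide() are hypothetical and exist only for this illustration:

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Never returns normally: it always throws.
[[noreturn]] void fail(const std::string& message)
{
    throw std::runtime_error(message);
}

int checkedDivide(int numerator, int denominator)
{
    if (denominator == 0) {
        fail("division by zero");
    }
    return numerator / denominator;
}

// Still callable, but a C++14 compiler is encouraged to warn at each use.
[[deprecated("Use checkedDivide() instead")]]
int uncheckedDivide(int numerator, int denominator)
{
    return numerator / denominator;
}

void attributeDemo()
{
    assert(checkedDivide(10, 2) == 5);

    bool threw = false;
    try {
        checkedDivide(1, 0);
    } catch (const std::runtime_error&) {
        threw = true;
    }
    assert(threw);
}
```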

Most attributes will be vendor-specific extensions. Vendors are advised not to use attributes to change the meaning of the program, but to use them to help the compiler optimize code or detect errors in code. Since attributes of different vendors could clash, vendors are advised to qualify them with a vendor-specific prefix. For example:

[[clang::noduplicate]]

User-Defined Literals

C++ has a number of standard literals that you can use in your code. For example:

· 'a': character

· "character array": zero-terminated array of characters, C-style string

· 3.14f: float floating point value

· 0xabc: hexadecimal value

C++11 allows you to define your own literals. These user-defined literals should start with an underscore and are implemented by writing literal operators. A literal operator can work in raw or cooked mode. In raw mode, your literal operator receives a sequence of characters; while in cooked mode your literal operator receives a specific interpreted type. For example, take the C++ literal 123. Your raw literal operator receives this as a sequence of characters '1', '2', '3'. Your cooked literal operator receives this as the integer 123. Another example, take the C++ literal 0x23. The raw operator receives the characters '0', 'x', '2', '3', while the cooked operator receives the integer 35. One last example, take the C++ literal 3.14. Your raw operator receives this as '3', '.', '1', '4', while your cooked operator receives the floating point value 3.14.

A cooked-mode literal operator should have:

· one parameter of type unsigned long long, long double, char, wchar_t, char16_t, or char32_t to process numeric and character values,

· or two parameters where the first is a character array and the second is the length of the character array, to process strings. For example: const char* str, size_t len.

As an example, the following implements a cooked literal operator for the user-defined literal _i to define a complex number literal:

std::complex<double> operator"" _i(long double d)

{

return std::complex<double>(0, d);

}

This _i literal can be used as follows:

std::complex<double> c1 = 9.634_i;

auto c2 = 1.23_i; // c2 will have as type std::complex<double>

A second example implements a cooked operator for a user-defined literal _s to define std::string literals:

std::string operator"" _s(const char* str, size_t len)

{

return std::string(str, len);

}

This literal can be used as follows:

std::string str1 = "Hello World"_s;

auto str2 = "Hello World"_s; // str2 will have as type std::string

Without the _s literal, the auto type deduction would be const char*:

auto str3 = "Hello World"; // str3 will have as type const char*
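A cooked operator that takes an unsigned long long works in the same way for integer literals. The following is a minimal sketch with a hypothetical suffix _kib that converts a number of kibibytes to bytes:

```cpp
#include <cassert>

// Hypothetical suffix: converts a count of kibibytes to a count of bytes.
constexpr unsigned long long operator"" _kib(unsigned long long kibibytes)
{
    return kibibytes * 1024;
}

void cookedIntegerDemo()
{
    static_assert(4_kib == 4096, "evaluated at compile time");
    assert(4_kib == 4096);
    assert(0x10_kib == 16384); // the operator receives the integer 16
}
```

Because the operator is cooked, it never sees the characters '0', 'x', '1', '0'; it receives the already interpreted value.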

A raw-mode literal operator requires one parameter of type const char*, which is a zero-terminated C-style string. The following example defines the literal _i again, but this time using a raw literal operator:

std::complex<double> operator"" _i(const char* p)

{

// Implementation omitted; it requires parsing the C-style

// string and converting it to a complex number.

}

Using this raw mode operator is exactly the same as using the cooked version.
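To show what a complete raw-mode operator can look like, the following sketch defines a hypothetical suffix _bin that interprets the characters of the literal as binary digits. The suffix name and the choice to throw on invalid digits are assumptions made for this illustration:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical suffix: interprets the digits of the literal as a binary number.
unsigned long long operator"" _bin(const char* digits)
{
    unsigned long long value = 0;
    for (const char* p = digits; *p != '\0'; ++p) {
        if (*p != '0' && *p != '1') {
            throw std::invalid_argument("_bin accepts only the digits 0 and 1");
        }
        value = value * 2 + static_cast<unsigned long long>(*p - '0');
    }
    return value;
}

void rawLiteralDemo()
{
    assert(1011_bin == 11); // the operator receives '1', '0', '1', '1'
    assert(0_bin == 0);
    assert(11111111_bin == 255);
}
```

A cooked operator could not implement this suffix, because it would receive the decimal value 1011 instead of the individual characters.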

Standard User-Defined Literals

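C++14 defines the following standard user-defined literals. They are declared in namespaces nested inside std, so you need a using directive such as using namespace std::literals; before you can use them: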

· “s” for creating std::strings; for example:

auto myString = "Hello World"s;

· “h”, “min”, “s”, “ms”, “us”, “ns”, for creating std::chrono::duration time intervals, discussed in Chapter 19; for example:

auto myDuration = 42min;

· “i”, “il”, “if” for creating complex numbers complex<double>, complex<long double>, and complex<float> respectively; for example:

auto myComplexNumber = 1.3i;
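The following sketch puts the string and duration suffixes to work. It assumes a C++14 compiler; the function name standardLiteralsDemo() is made up for this illustration:

```cpp
#include <cassert>
#include <chrono>
#include <string>

using namespace std::literals; // makes the standard suffixes visible

void standardLiteralsDemo()
{
    auto text = "Hello World"s; // std::string
    auto wait = 42min;          // std::chrono::minutes

    assert(text.size() == 11);
    assert(std::chrono::duration_cast<std::chrono::seconds>(wait).count() == 2520);
}
```

Note that the suffix s is used for both strings and seconds; the compiler picks the right operator based on whether the literal is a string or a number.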

HEADER FILES

Header files are a mechanism for providing an abstract interface to a subsystem or piece of code. One of the trickier parts of using headers is avoiding circular references and multiple includes of the same header file. For example, perhaps you are responsible for writing the Logger class that performs all error message logging tasks. You may end up using another class, Preferences, that keeps track of user settings. The Preferences class may in turn use the Logger class indirectly, through yet another header.

As the following code shows, the #ifndef mechanism can be used to avoid circular and multiple includes. At the beginning of each header file, the #ifndef directive checks whether a certain key has not been defined. If the key has been defined, the preprocessor skips to the matching #endif, which is usually placed at the end of the file. If the key has not been defined, the file proceeds to define the key so that a subsequent include of the same file is skipped. This mechanism is also known as include guards.

#ifndef LOGGER_H

#define LOGGER_H

#include "Preferences.h"

class Logger

{

public:

static void setPreferences(const Preferences& prefs);

static void logError(const char* error);

};

#endif // LOGGER_H

If your compiler supports the #pragma once directive (like Microsoft Visual C++ or GCC), this can be rewritten as follows:

#pragma once

#include "Preferences.h"

class Logger

{

public:

static void setPreferences(const Preferences& prefs);

static void logError(const char* error);

};

These include guards or #pragma once directives also make sure that you don’t get duplicate definitions by including a header file multiple times. For example, suppose A.h includes Logger.h and B.h also includes Logger.h. If you have a source file called App.cpp, which includes both A.h and B.h, the compiler will not complain about a duplicate definition of the Logger class because the Logger.h header will be included only once, even though A.h and B.h both include it.
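To see the mechanism at work without juggling several files, the following sketch pastes the contents of a hypothetical Point.h twice into a single file, which is what the preprocessor effectively produces when a header is included twice. The Point structure and the sumOfCoordinates() function are assumptions made for this illustration:

```cpp
#include <cassert>

// First inclusion of a hypothetical Point.h: POINT_H is not defined yet,
// so the preprocessor keeps everything up to the matching #endif.
#ifndef POINT_H
#define POINT_H
struct Point
{
    int x;
    int y;
};
#endif // POINT_H

// Second inclusion of the same header: POINT_H is already defined, so the
// preprocessor skips the body and Point is not defined a second time.
#ifndef POINT_H
#define POINT_H
struct Point
{
    int x;
    int y;
};
#endif // POINT_H

int sumOfCoordinates(const Point& p) { return p.x + p.y; }

void includeGuardDemo()
{
    assert(sumOfCoordinates(Point{1, 2}) == 3);
}
```

Without the guards, the second definition of Point would be a redefinition error.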

Another tool for avoiding problems with headers is forward declarations. If you need to refer to a class but you cannot include its header file (for example, because it relies heavily on the class you are writing), you can tell the compiler that such a class exists without providing its definition through the #include mechanism. Of course, you cannot use the class in any way that requires its definition, such as creating objects or calling methods, because the compiler knows only that a class with that name exists. However, you can still make use of pointers or references to the class in your code. In the following code, the Logger class refers to the Preferences class without including its header file:

#ifndef LOGGER_H

#define LOGGER_H

class Preferences; // forward declaration

class Logger

{

public:

static void setPreferences(const Preferences& prefs);

static void logError(const char* error);

};

#endif // LOGGER_H

It’s recommended to use forward declarations as much as possible in your header files instead of including other headers. This can reduce your compilation and recompilation times, because it breaks dependencies of your header file on other headers. Of course, your implementation file needs to include the correct headers for types that you’ve forward-declared, otherwise it won’t compile.
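The following single-file sketch mimics that split. The first part plays the role of Logger.h and needs only the forward declaration; the second part plays the role of the implementation file, which sees the full definition of Preferences. The members getLogPrefix(), getPrefix(), and sPrefix are hypothetical and were added only to make the example checkable:

```cpp
#include <cassert>
#include <string>
#include <utility>

// --- Role of Logger.h: only a forward declaration is needed. ---
class Preferences; // forward declaration

class Logger
{
public:
    static void setPreferences(const Preferences& prefs);
    static const std::string& getPrefix(); // hypothetical accessor

private:
    static std::string sPrefix; // hypothetical data member
};

// --- Role of Logger.cpp: the full definition must be visible here. ---
class Preferences
{
public:
    explicit Preferences(std::string logPrefix)
        : mLogPrefix(std::move(logPrefix)) {}
    const std::string& getLogPrefix() const { return mLogPrefix; }

private:
    std::string mLogPrefix;
};

std::string Logger::sPrefix;

void Logger::setPreferences(const Preferences& prefs)
{
    sPrefix = prefs.getLogPrefix(); // requires the definition of Preferences
}

const std::string& Logger::getPrefix() { return sPrefix; }

void forwardDeclarationDemo()
{
    Logger::setPreferences(Preferences("[app] "));
    assert(Logger::getPrefix() == "[app] ");
}
```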

C UTILITIES

There are a few obscure C features that are also available in C++ and which can occasionally be useful. This section examines two of these features: variable-length argument lists and preprocessor macros.

Variable-Length Argument Lists

This section explains the old C-style variable-length argument lists. You need to know how these work because you might find them in older code. However, in new code you should use variadic templates for type-safe variable-length argument lists, described in Chapter 21.

Consider the C function printf() from <cstdio>. You can call it with any number of arguments:

printf("int %d\n", 5);

printf("String %s and int %d\n", "hello", 5);

printf("Many ints: %d, %d, %d, %d, %d\n", 1, 2, 3, 4, 5);

C/C++ provides the syntax and some utility macros for writing your own functions with a variable number of arguments. These functions usually look a lot like printf(). Although you shouldn’t need this feature very often, occasionally you run into situations in which it’s quite useful. For example, suppose you want to write a quick-and-dirty debug function that prints strings to stderr if a debug flag is set, but does nothing if the debug flag is not set. Just like printf(), this function should be able to print strings with arbitrary numbers of arguments and arbitrary types of arguments. A simple implementation looks like this:

#include <cstdio>

#include <cstdarg>

bool debug = false;

void debugOut(const char* str, ...)

{

va_list ap;

if (debug) {

va_start(ap, str);

vfprintf(stderr, str, ap);

va_end(ap);

}

}

First, note that the prototype for debugOut() contains one typed and named parameter str, followed by ... (an ellipsis). The ellipsis stands for any number and type of arguments. In order to access these arguments, you must use macros defined in <cstdarg>. You declare a variable of type va_list, and initialize it with a call to va_start(). The second parameter to va_start() must be the rightmost named variable in the parameter list. All functions with variable-length argument lists require at least one named parameter. The debugOut() function simply passes this list to vfprintf() (a standard function in <cstdio>). After the call to vfprintf() returns, debugOut() calls va_end() to terminate the access of the variable argument list. You must always call va_end() after calling va_start() to ensure that the function ends with the stack in a consistent state.

You can use the function in the following way:

debug = true;

debugOut("int %d\n", 5);

debugOut("String %s and int %d\n", "hello", 5);

debugOut("Many ints: %d, %d, %d, %d, %d\n", 1, 2, 3, 4, 5);

Accessing the Arguments

If you want to access the actual arguments yourself, you can use va_arg() to do so. Unfortunately, there is no way to know where the argument list ends unless you provide an explicit way of doing so. For example, you can make the first parameter a count of the number of parameters. Or, in the case where you have a set of pointers, you may require the last pointer to be nullptr. There are many ways, but they are all burdensome to the programmer.

The following example demonstrates the technique where the caller specifies in the first named parameter how many arguments are provided. The function accepts any number of ints and prints them out:

void printInts(int num, ...)

{

int temp;

va_list ap;

va_start(ap, num);

for (int i = 0; i < num; ++i) {

temp = va_arg(ap, int);

cout << temp << " ";

}

va_end(ap);

cout << endl;

}

You can call printInts() as follows. Note that the first parameter specifies how many integers will follow:

printInts(5, 5, 4, 3, 2, 1);
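The other technique, in which a null pointer marks the end of the list, can be sketched as follows. The joinStrings() function is a hypothetical example; it concatenates C-style strings until it encounters the terminating null pointer:

```cpp
#include <cassert>
#include <cstdarg>
#include <string>

// Hypothetical function: concatenates C-style strings. The caller must end
// the argument list with a null pointer of type const char*.
std::string joinStrings(const char* first, ...)
{
    std::string result;
    va_list ap;
    va_start(ap, first);
    for (const char* s = first; s != nullptr; s = va_arg(ap, const char*)) {
        result += s;
    }
    va_end(ap);
    return result;
}

void sentinelDemo()
{
    std::string joined =
        joinStrings("a", "b", "c", static_cast<const char*>(nullptr));
    assert(joined == "abc");
}
```

The cast on the terminating argument makes sure that a pointer of the type expected by va_arg() is passed. If the caller forgets the terminator, the function reads past the end of the argument list, which is exactly the kind of risk described in the next section.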

Why You Shouldn’t Use C-Style Variable-Length Argument Lists

Accessing C-style variable-length argument lists is not very safe. There are several risks, as you can see from the printInts() function:

· You don’t know the number of parameters. In the case of printInts(), you must trust the caller to pass the right number of arguments as the first argument. In the case of debugOut(), you must trust the caller to pass the same number of arguments after the character array as there are formatting codes in the character array.

· You don’t know the types of the arguments. va_arg() takes a type, which it uses to interpret the value in its current spot. However, you can tell va_arg() to interpret the value as any type. There is no way for it to verify the correct type.

WARNING Avoid using C-style variable-length argument lists. It is preferable to pass in a std::array or vector of values, to use initializer lists described earlier in this chapter, or to use variadic templates for type-safe variable-length argument lists described in Chapter 21.
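As one possible type-safe replacement for printInts(), the following sketch uses an initializer list. The function name formatInts() is hypothetical, and it returns a string instead of printing so that the result is easy to check:

```cpp
#include <cassert>
#include <initializer_list>
#include <sstream>
#include <string>

// Hypothetical replacement for printInts(): the list knows its own length,
// and every element is guaranteed to be an int.
std::string formatInts(std::initializer_list<int> values)
{
    std::ostringstream out;
    for (int value : values) {
        out << value << " ";
    }
    return out.str();
}

void formatIntsDemo()
{
    assert(formatInts({5, 4, 3, 2, 1}) == "5 4 3 2 1 ");
    assert(formatInts({}) == "");
}
```

The caller no longer has to pass a count, so the count can never disagree with the number of values that are actually supplied.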

Preprocessor Macros

You can use the C++ preprocessor to write macros, which are like little functions. Here is an example:

#define SQUARE(x) ((x) * (x)) // No semicolon after the macro definition!

int main()

{

cout << SQUARE(5) << endl;

return 0;

}

Macros are a remnant from C that are quite similar to inline functions, except that they are not type-checked, and the preprocessor dumbly replaces any calls to them with their expansions. The preprocessor does not apply true function-call semantics. This behavior can cause unexpected results. For example, consider what would happen if you called the SQUARE macro with 2 + 3 instead of 5, like this:

cout << SQUARE(2 + 3) << endl;

You expect SQUARE to calculate 25, which it does. However, what if you left off some parentheses on the macro definition, so that it looks like this?

#define SQUARE(x) (x * x)

Now, the call to SQUARE(2 + 3) generates 11, not 25! Remember that the macro is dumbly expanded without regard to function-call semantics. This means that any x in the macro body is replaced by 2 + 3, leading to this expansion:

cout << (2 + 3 * 2 + 3) << endl;

Following proper order of operations, this line performs the multiplication first, followed by the additions, generating 11 instead of 25!

Macros can also have a performance impact. Suppose you call the SQUARE macro as follows:

cout << SQUARE(veryExpensiveFunctionCallToComputeNumber()) << endl;

The preprocessor replaces this with:

cout << ((veryExpensiveFunctionCallToComputeNumber()) *

(veryExpensiveFunctionCallToComputeNumber())) << endl;

Now you are calling the expensive function twice. This is another reason to avoid macros.

Macros also cause problems for debugging because the code you write is not the code that the compiler sees, or that shows up in your debugger (because of the search-and-replace behavior of the preprocessor). For these reasons, you should avoid macros entirely in favor of inline functions. The details are shown here only because quite a bit of C++ code out there employs macros. You need to understand them in order to read and maintain that code.
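As a sketch of the alternative, the following function template does the job of the SQUARE macro with real function-call semantics. The names square(), expensiveValue(), and gCallCount are hypothetical; the counter is there only to show that the argument is evaluated once:

```cpp
#include <cassert>

// Function template instead of a macro: type-checked, and the argument
// is evaluated exactly once before the body runs.
template <typename T>
constexpr T square(T x)
{
    return x * x;
}

// Hypothetical stand-in for an expensive computation; counts its calls.
int gCallCount = 0;

int expensiveValue()
{
    ++gCallCount;
    return 5;
}

void squareDemo()
{
    assert(square(2 + 3) == 25); // no parenthesization pitfalls

    gCallCount = 0;
    int result = square(expensiveValue());
    assert(result == 25);
    assert(gCallCount == 1); // called once, not twice
}
```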

NOTE Some compilers can output the preprocessed source to a file. You can use that file to see how the preprocessor is preprocessing your file. For example, with Microsoft VC++ you need to use the /P switch. With GCC you can use the -E switch.

SUMMARY

This chapter explained some of the aspects of C++ that generate confusion. By reading this chapter, you learned a plethora of syntax details about C++. Some of the information, such as the details of references, const, scope resolution, the specifics of the C++-style casts, and the techniques for header files, you should use often in your programs. Other information, such as the uses of static and extern, how to write C-style variable-length argument lists, and how to write preprocessor macros, is important to understand, but not information that you should put into use in your programs on a day-to-day basis.

This chapter also discussed a number of C++11 and C++14 features that don’t really fit anywhere else in the book.

The next chapter starts a discussion of templates, which allow you to write generic code.