The C++ Programming Language (2013)

Part II: Basic Facilities

11. Select Operations

When someone says “I want a programming language in which I need only say what I wish done,” give him a lollipop.

– Alan Perlis

Etc. Operators

Logical Operators; Bitwise Logical Operators; Conditional Expressions; Increment and Decrement

Free Store

Memory Management; Arrays; Getting Memory Space; Overloading new

Lists

Implementation Model; Qualified Lists; Unqualified Lists

Lambda Expressions

Implementation Model; Alternatives to Lambdas; Capture; Call and Return; The Type of a Lambda

Explicit Type Conversion

Construction; Named Casts; C-Style Cast; Function-Style Cast

Advice

11.1. Etc. Operators

This section examines a mixed bag of simple operators: logical operators (&&, ||, and !), bitwise logical operators (&, |, ^, ~, <<, and >>), conditional expressions (?:), and increment and decrement operators (++ and --). They have little in common beyond their details not fitting elsewhere in the discussions of operators.

11.1.1. Logical Operators

The logical operators && (and), || (or), and ! (not) take operands of arithmetic and pointer types, convert them to bool, and return a bool result. The && and || operators evaluate their second argument only if necessary, so they can be used to control evaluation order (§10.3.2). For example:

while (p && !whitespace(*p)) ++p;

Here, p is not dereferenced if it is the nullptr.
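As a minimal sketch of this short-circuit behavior, consider the following. The names is_space(), skip_to_space(), and skip_demo() are hypothetical; is_space() stands in for whitespace(), and the extra *p test is an assumption added so that the loop also stops at a terminating zero.

```cpp
#include <cassert>

// Hypothetical stand-in for whitespace(): true for a space, tab, or newline.
bool is_space(char c)
{
    return c==' ' || c=='\t' || c=='\n';
}

// Advance p to the first whitespace character or to the terminating zero.
// && evaluates left to right and stops at the first false operand,
// so *p is never evaluated when p is the nullptr.
const char* skip_to_space(const char* p)
{
    while (p && *p && !is_space(*p)) ++p;
    return p;
}

void skip_demo()
{
    const char* s = "ab cd";
    assert(skip_to_space(s)==s+2);              // stops at the space
    assert(skip_to_space(nullptr)==nullptr);    // nothing is dereferenced
}
```

Swapping the operands of the first && would dereference p before testing it, which is exactly the error the ordering guards against.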

11.1.2. Bitwise Logical Operators

The bitwise logical operators & (and), | (or), ^ (exclusive or, xor), ~ (complement), >> (right shift), and << (left shift) are applied to objects of integral types – that is, char, short, int, long, long long and their unsigned counterparts, and bool, wchar_t, char16_t, and char32_t. A plain enum (but not an enum class) can be implicitly converted to an integer type and used as an operand to bitwise logical operations. The usual arithmetic conversions (§10.5.3) determine the type of the result.

A typical use of bitwise logical operators is to implement the notion of a small set (a bit vector). In this case, each bit of an unsigned integer represents one member of the set, and the number of bits limits the number of members. The binary operator & is interpreted as intersection, | as union, ^ as symmetric difference, and ~ as complement. An enumeration can be used to name the members of such a set. Here is a small example borrowed from an implementation of ostream:

enum ios_base::iostate {
goodbit=0, eofbit=1, failbit=2, badbit=4
};

The implementation of a stream can set and test its state like this:

state = goodbit;
// ...
if (state&(badbit|failbit)) // stream not good

The extra parentheses are necessary because & has higher precedence than | (§10.3).

A function that reaches the end-of-input might report it like this:

state |= eofbit;

The |= operator is used to add to the state. A simple assignment, state=eofbit, would have cleared all other bits.

These stream state flags are observable from outside the stream implementation. For example, we could see how the states of two streams differ like this:

int old = cin.rdstate(); // rdstate() returns the state
// ... use cin ...
if (cin.rdstate()^old) { // has anything changed?
// ...
}

Computing differences of stream states is not common. For other similar types, computing differences is essential. For example, consider comparing a bit vector that represents the set of interrupts being handled with another that represents the set of interrupts waiting to be handled.

Please note that this bit fiddling is taken from the implementation of iostreams rather than from the user interface. Convenient bit manipulation can be very important, but for reliability, maintainability, portability, etc., it should be kept at low levels of a system. For more general notions of a set, see the standard-library set (§31.4.3) and bitset (§34.2.2).
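The set interpretation of the bitwise operators can be sketched with the interrupt example mentioned above. The enumerator names and the function set_demo() are hypothetical, chosen only for illustration; each enumerator is assumed to occupy one bit.

```cpp
#include <cassert>

// Hypothetical interrupt flags; each enumerator occupies one bit of the set.
enum Interrupt : unsigned {
    timer_irq=1, keyboard_irq=2, disk_irq=4, network_irq=8
};

void set_demo()
{
    unsigned handled = timer_irq|disk_irq;      // union: {timer, disk}
    unsigned pending = disk_irq|network_irq;    // union: {disk, network}

    assert((handled&pending)==disk_irq);                    // intersection
    assert((handled^pending)==(timer_irq|network_irq));     // symmetric difference
    assert((~handled&pending)==network_irq);                // pending but not handled

    handled |= keyboard_irq;    // add a member
    handled &= ~timer_irq;      // remove a member
    assert(handled==(keyboard_irq|disk_irq));
}
```

The parentheses around each bitwise expression matter here: == binds more tightly than &, ^, and |.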

Bitwise logical operations can be used to extract bit-fields from a word. For example, one could extract the middle 16 bits of a 32-bit int like this:

constexpr unsigned short middle(int a)
{
static_assert(sizeof(int)==4,"unexpected int size");
static_assert(sizeof(short)==2,"unexpected short size");
return (a>>8)&0xFFFF;
}


int x = 0xFF00FF00; // assume sizeof(int)==4
short y = middle(x); // y = 0x00FF

Using fields (§8.2.7) is a convenient shorthand for such shifting and masking.

Do not confuse the bitwise logical operators with the logical operators: &&, ||, and !. The latter return true or false, and they are primarily useful for writing the test in an if-, while-, or for-statement (§9.4, §9.5). For example, !0 (not zero) is the value true, which converts to 1, whereas ~0 (complement of zero) is the bit pattern all-ones, which in two’s complement representation is the value -1.
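The difference can be checked directly. This is a small sketch assuming a two's complement machine; the function name not_vs_complement() and the variable names are hypothetical.

```cpp
#include <cassert>

void not_vs_complement()
{
    int a = !0;     // logical not: true, which converts to 1
    int b = ~0;     // bitwise complement: all bits set
    assert(a==1);
    assert(b==-1);  // assumes two's complement representation

    int flags = 6;  // binary 110: nonzero, but the low bit is clear
    int mask = 1;
    assert((flags&&mask)==true);    // logical: both operands are nonzero
    assert((flags&mask)==0);        // bitwise: the low bit of flags is not set
}
```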

11.1.3. Conditional Expressions

Some if-statements can conveniently be replaced by conditional-expressions. For example:

if (a <= b)
    max = b;
else
    max = a;

This is more directly expressed like this:

max = (a<=b) ? b : a;

The parentheses around the condition are not necessary, but I find the code easier to read when they are used.

Conditional expressions are important in that they can be used in constant expressions (§10.4).
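A minimal sketch of that point, assuming C++11 rules, where the body of a constexpr function is essentially a single return-statement, so ?: takes the place of an if-statement. The names max_of, table, and max_demo are hypothetical.

```cpp
#include <cassert>

constexpr int max_of(int a, int b)
{
    return (a<=b) ? b : a;
}

static_assert(max_of(3,7)==7,"checked at compile time");

int table[max_of(4,8)];     // a constant expression can be used as an array bound

void max_demo(int x, int y)
{
    int m = max_of(x,y);    // the same function also accepts run-time values
    assert(m>=x && m>=y);
}
```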

A pair of expressions e1 and e2 can be used as alternatives in a conditional expression, c?e1:e2, if they are of the same type or if there is a common type T, to which they can both be implicitly converted. For arithmetic types, the usual arithmetic conversions (§10.5.3) are used to find that common type. For other types, either e1 must be implicitly convertible to e2’s type or vice versa. In addition, one branch may be a throw-expression (§13.5.1). For example:

void fct(int* p)
{
    int i = (p) ? *p : throw std::runtime_error{"unexpected nullptr"};
    // ...
}

11.1.4. Increment and Decrement

The ++ operator is used to express incrementing directly, rather than expressing it indirectly using a combination of an addition and an assignment. Provided lvalue has no side effects, ++lvalue means lvalue+=1, which again means lvalue=lvalue+1. The expression denoting the object to be incremented is evaluated once (only). Decrementing is similarly expressed by the -- operator.

The operators ++ and -- can be used as both prefix and postfix operators. The value of ++x is the new (that is, incremented) value of x. For example, y=++x is equivalent to y=(x=x+1). The value of x++, however, is the old value of x. For example, y=x++ is equivalent to y=(t=x,x=x+1,t), where t is a variable of the same type as x.

Like adding an int to a pointer, or subtracting it, ++ and -- on a pointer operate in terms of elements of the array into which the pointer points; p++ makes p point to the next element (§7.4.1).
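The prefix and postfix rules for both integers and pointers can be checked with a few assertions. This is only an illustrative sketch; increment_demo() and its variables are hypothetical names.

```cpp
#include <cassert>

void increment_demo()
{
    int x = 5;
    int y = ++x;    // x is incremented first; y gets the new value
    assert(x==6 && y==6);

    int z = x++;    // z gets the old value; then x is incremented
    assert(x==7 && z==6);

    int a[] = {10,20,30};
    int* p = a;
    assert(*p++==10);   // read a[0], then advance p by one element
    assert(*p==20);     // p now points to a[1]
    assert(*++p==30);   // advance p, then read a[2]
    assert(p-a==2);     // pointer arithmetic counts elements, not bytes
}
```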

The ++ and -- operators are particularly useful for incrementing and decrementing variables in loops. For example, one can copy a zero-terminated C-style string like this:

void cpy(char* p, const char* q)
{
    while (*p++ = *q++) ;
}

Like C, C++ is both loved and hated for enabling such terse, expression-oriented coding. Consider:

while (*p++ = *q++) ;

This is more than a little obscure to non-C programmers, but because the style of coding is not uncommon, it is worth examining more closely. Consider first a more traditional way of copying an array of characters:

int length = strlen(q);
for (int i = 0; i<=length; i++)
p[i] = q[i];

This is wasteful. The length of a zero-terminated string is found by reading the string looking for the terminating zero. Thus, we read the string twice: once to find its length and once to copy it. So we try this instead:

int i;
for (i = 0; q[i]!=0 ; i++)
    p[i] = q[i];
p[i] = 0; // terminating zero

The variable i used for indexing can be eliminated because p and q are pointers:

while (*q != 0) {
    *p = *q;
    p++; // point to next character
    q++; // point to next character
}
*p = 0; // terminating zero

Because the post-increment operation allows us first to use the value and then to increment it, we can rewrite the loop like this:

while (*q!=0) {
*p++ = *q++;
}

*p = 0; // terminating zero

The value of *p++ = *q++ is *q. We can therefore rewrite the example like this:

while ((*p++ = *q++) != 0) { }

In this case, we don’t notice that *q is zero until we already have copied it into *p and incremented p. Consequently, we can eliminate the final assignment of the terminating zero. Finally, we can reduce the example further by observing that we don’t need the empty block and that the !=0 is redundant because the result of an integral condition is always compared to zero anyway. Thus, we get the version we set out to discover:

while (*p++ = *q++) ;

Is this version less readable than the previous versions? Not to an experienced C or C++ programmer. Is this version more efficient in time or space than the previous versions? Except for the first version that called strlen(), not really; the performance will be equivalent and often identical code will be generated.

The most efficient way of copying a zero-terminated character string is typically the standard C-style string copy function:

char* strcpy(char*, const char*); // from <string.h>

For more general copying, the standard copy algorithm (§4.5, §32.5) can be used. Whenever possible, use standard-library facilities in preference to fiddling with pointers and bytes. Standard-library functions may be inlined (§12.1.3) or even implemented using specialized machine instructions. Therefore, you should measure carefully before believing that some piece of handcrafted code outperforms library functions. Even if it does, the advantage may not exist on some other hardware+compiler combination, and your alternative may give a maintainer a headache.
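The three alternatives can be put side by side. This is a sketch for comparison only; copy_demo() and the buffer names are hypothetical, and the buffers are assumed to be large enough for the source string including its terminating zero.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>

// Hand-written loop in the style derived above.
void cpy(char* p, const char* q)
{
    while ((*p++ = *q++) != 0) { }
}

void copy_demo()
{
    const char src[] = "free store";
    char a[sizeof(src)];
    char b[sizeof(src)];
    char c[sizeof(src)];

    cpy(a,src);                         // pointer fiddling
    std::strcpy(b,src);                 // C-style string copy from <cstring>
    std::copy(src,src+sizeof(src),c);   // general algorithm; the range includes the zero

    assert(std::strcmp(a,src)==0);
    assert(std::strcmp(b,src)==0);
    assert(std::strcmp(c,src)==0);
}
```

All three produce the same result; which is fastest on a given machine is a question for measurement, not for inspection.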

11.2. Free Store

A named object has its lifetime determined by its scope (§6.3.4). However, it is often useful to create an object that exists independently of the scope in which it was created. For example, it is common to create objects that can be used after returning from the function in which they were created. The operator new creates such objects, and the operator delete can be used to destroy them. Objects allocated by new are said to be “on the free store” (also, “on the heap” or “in dynamic memory”).

Consider how we might write a compiler in the style used for the desk calculator (§10.2). The syntax analysis functions might build a tree of the expressions for use by the code generator:

struct Enode {
    Token_value oper;
    Enode* left;
    Enode* right;
    // ...
};

Enode* expr(bool get)
{
    Enode* left = term(get);

    for (;;) {
        switch (ts.current().kind) {
        case Kind::plus:
        case Kind::minus:
            left = new Enode {ts.current().kind,left,term(true)};
            break;
        default:
            return left; // return node
        }
    }
}

In cases Kind::plus and Kind::minus, a new Enode is created on the free store and initialized by the value {ts.current().kind,left,term(true)}. The resulting pointer is assigned to left and eventually returned from expr().

I used the {}-list notation for specifying arguments. Alternatively, I could have used the old-style ()-list notation to specify an initializer. However, trying the = notation for initializing an object created using new results in an error:

int* p = new int = 7; // error

If a type has a default constructor, we can leave out the initializer, but built-in types are by default uninitialized. For example:

auto pc = new complex<double>; // the complex is initialized to {0,0}
auto pi = new int; // the int is uninitialized

This can be confusing. To be sure to get default initialization, use {}. For example:

auto pc = new complex<double>{}; // the complex is initialized to {0,0}
auto pi = new int{}; // the int is initialized to 0

A code generator could use the Enodes created by expr() and delete them:

void generate(Enode* n)
{
    switch (n->oper) {
    case Kind::plus:
        // use n
        delete n; // delete an Enode from the free store
    }
}

An object created by new exists until it is explicitly destroyed by delete. Then, the space it occupied can be reused by new. A C++ implementation does not guarantee the presence of a “garbage collector” that looks out for unreferenced objects and makes them available to new for reuse. Consequently, I will assume that objects created by new are manually freed using delete.

The delete operator may be applied only to a pointer returned by new or to the nullptr. Applying delete to the nullptr has no effect.

If the deleted object is of a class with a destructor (§3.2.1.2, §17.2), that destructor is called by delete before the object’s memory is released for reuse.
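The lifetime rules above can be observed with a type that counts its own constructions and destructions. This is a minimal sketch; Tracked, tracked_alive, make(), and lifetime_demo() are hypothetical names introduced for illustration.

```cpp
#include <cassert>

int tracked_alive = 0;      // number of Tracked objects currently alive

struct Tracked {            // hypothetical type with an observable lifetime
    int value;
    Tracked(int v) :value{v} { ++tracked_alive; }
    ~Tracked() { --tracked_alive; }
};

Tracked* make(int v)
{
    return new Tracked{v};  // the object outlives the call of make()
}

void lifetime_demo()
{
    Tracked* p = make(7);
    assert(tracked_alive==1 && p->value==7);

    delete p;               // runs ~Tracked(), then releases the memory
    assert(tracked_alive==0);

    p = nullptr;
    delete p;               // applying delete to the nullptr has no effect
    assert(tracked_alive==0);

    int* pi = new int{};    // {} guarantees that the int is initialized to 0
    assert(*pi==0);
    delete pi;
}
```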

11.2.1. Memory Management

The main problems with free store are:

Leaked objects: People use new and then forget to delete the allocated object.

Premature deletion: People delete an object that they have some other pointer to and later use that other pointer.

Double deletion: An object is deleted twice, invoking its destructor (if any) twice.

Leaked objects are potentially a bad problem because they can cause a program to run out of space. Premature deletion is almost always a nasty problem because the pointer to the “deleted object” no longer points to a valid object (so reading it may give bad results) and may indeed point to memory that has been reused for another object (so writing to it may corrupt an unrelated object). Consider this example of very bad code:

int* p1 = new int{99};
int* p2 = p1; // potential trouble
delete p1; // now p2 doesn't point to a valid object
p1 = nullptr; // gives a false sense of safety
char* p3 = new char{'x'}; // p3 may now point to the memory pointed to by p2
*p2 = 999; // this may cause trouble
cout << *p3 << '\n'; // may not print x

Double deletion is a problem because resource managers typically cannot track what code owns a resource. Consider:

void sloppy() // very bad code
{
    int* p = new int[1000]; // acquire memory
    // ... use *p ...
    delete[] p; // release memory

    // ... wait a while ...

    delete[] p; // but sloppy() does not own *p
}

By the second delete[], the memory pointed to by p may have been reallocated for some other use and the allocator may get corrupted. Replace int with string in that example, and we’ll see string’s destructor trying to read memory that has been reallocated and maybe overwritten by other code, and using what it read to try to delete memory. In general, a double deletion is undefined behavior and the results are unpredictable and usually disastrous.

The reason people make these mistakes is typically not maliciousness and often not even simple sloppiness; it is genuinely hard to consistently deallocate every allocated object in a large program (once and at exactly the right point in a computation). For starters, analysis of a localized part of a program will not detect these problems because an error usually involves several separate parts.

As alternatives to using “naked” news and deletes, I can recommend two general approaches to resource management that avoid such problems:

[1] Don’t put objects on the free store if you don’t have to; prefer scoped variables.

[2] When you construct an object on the free store, place its pointer into a manager object (sometimes called a handle) with a destructor that will destroy it. Examples are string, vector and all the other standard-library containers, unique_ptr (§5.2.1, §34.3.1), and shared_ptr (§5.2.1, §34.3.2). Wherever possible, have that manager object be a scoped variable. Many classical uses of free store can be eliminated by using move semantics (§3.3, §17.5.2) to return large objects represented as manager objects from functions.

This rule [2] is often referred to as RAII (“Resource Acquisition Is Initialization”; §5.2, §13.3) and is the basic technique for avoiding resource leaks and making error handling using exceptions simple and safe.
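A hand-written manager object might look roughly like the following. This is a minimal sketch, not a replacement for vector; Int_buffer and sum_squares() are hypothetical names, and copying is simply disabled to keep exactly one owner.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal handle: owns an array of int on the free store.
class Int_buffer {
public:
    explicit Int_buffer(std::size_t n) :p{new int[n]{}}, sz{n} { }  // acquire
    ~Int_buffer() { delete[] p; }                                   // release

    Int_buffer(const Int_buffer&) = delete;             // no copying: one owner only
    Int_buffer& operator=(const Int_buffer&) = delete;

    int& operator[](std::size_t i) { return p[i]; }
    std::size_t size() const { return sz; }
private:
    int* p;
    std::size_t sz;
};

int sum_squares(int n)
{
    assert(n>=0);
    Int_buffer buf(n);      // scoped variable; the array itself is on the free store
    for (int i=0; i<n; ++i)
        buf[i] = i*i;
    int s = 0;
    for (int i=0; i<n; ++i)
        s += buf[i];
    return s;               // ~Int_buffer() frees the array on every exit path
}
```

The only new and the only delete[] are inside the handle, so the code that uses it has nothing to forget.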

The standard-library vector is an example of these techniques:

void f(const string& s)
{
    vector<char> v;
    for (auto c : s)
        v.push_back(c);
    // ...
}

The vector keeps its elements on the free store, but it handles all allocations and deallocations itself. In this example, push_back() does news to acquire space for its elements and deletes to free space that it no longer needs. However, the users of vector need not know about those implementation details and will just rely on vector not leaking.

The Token_stream from the calculator example is an even simpler example (§10.2.2). There, a user can use new and hand the resulting pointer to a Token_stream to manage:

Token_stream ts{new istringstream{some_string}};

We do not need to use the free store just to get a large object out of a function. For example:

string reverse(const string& s)
{
    string ss;
    for (int i=s.size()-1; 0<=i; --i)
        ss.push_back(s[i]);
    return ss;
}

Like vector, a string is really a handle to its elements. So, we simply move the ss out of reverse() rather than copying any elements (§3.3.2).

The resource management “smart pointers” (e.g., unique_ptr and shared_ptr) are a further example of these ideas (§5.2.1, §34.3.1). For example:

void f(int n)
{
    int* p1 = new int[n]; // potential trouble
    unique_ptr<int[]> p2 {new int[n]};
    // ...
    if (n%2) throw runtime_error("odd");
    delete[] p1; // we may never get here
}

For f(3) the memory pointed to by p1 is leaked, but the memory pointed to by p2 is correctly and implicitly deallocated.
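That the unique_ptr really does clean up on both exit paths can be checked with a counting type. This is a sketch under stated assumptions; Handle, handles_alive, may_throw(), and unique_ptr_demo() are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <stdexcept>

int handles_alive = 0;      // counts live Handle objects (for illustration only)

struct Handle {             // hypothetical resource with an observable lifetime
    Handle() { ++handles_alive; }
    ~Handle() { --handles_alive; }
};

void may_throw(int n)
{
    std::unique_ptr<Handle> p {new Handle};     // owned by a scoped manager object
    if (n%2) throw std::runtime_error("odd");
    // no delete is needed on either exit path
}

void unique_ptr_demo()
{
    may_throw(2);                   // normal return
    assert(handles_alive==0);

    try {
        may_throw(3);               // exit by exception
    }
    catch (const std::runtime_error&) {
    }
    assert(handles_alive==0);       // the Handle was still destroyed
}
```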

My rule of thumb for the use of new and delete is “no naked news”; that is, new belongs in constructors and similar operations, delete belongs in destructors, and together they provide a coherent memory management strategy. In addition, new is often used in arguments to resource handles.

If everything else fails (e.g., if someone has a lot of old code with lots of undisciplined use of new), C++ offers a standard interface to a garbage collector (§34.5).

11.2.2. Arrays

Arrays of objects can also be created using new. For example:

char* save_string(const char* p)
{
    char* s = new char[strlen(p)+1];
    strcpy(s,p); // copy from p to s
    return s;
}

int main(int argc, char* argv[])
{
    if (argc < 2) exit(1);
    char* p = save_string(argv[1]);
    // ...
    delete[] p;
}

The “plain” operator delete is used to delete individual objects; delete[] is used to delete arrays.

Unless you really must use a char* directly, the standard-library string can be used to simplify the save_string():

string save_string(const char* p)
{
    return string{p};
}

int main(int argc, char* argv[])
{
    if (argc < 2) exit(1);
    string s = save_string(argv[1]);
    // ...
}

In particular, the new[] and the delete[] vanished.

To deallocate space allocated by new, delete and delete[] must be able to determine the size of the object allocated. This implies that an object allocated using the standard implementation of new will occupy slightly more space than a static object. At a minimum, space is needed to hold the object’s size. Usually two or more words per allocation are used for free-store management. Most modern machines use 8-byte words. This overhead is not significant when we allocate many objects or large objects, but it can matter if we allocate lots of small objects (e.g., ints or Points) on the free store.

Note that a vector (§4.4.1, §31.4) is a proper object and can therefore be allocated and deallocated using plain new and delete. For example:

void f(int n)
{
    vector<int>* p = new vector<int>(n); // individual object
    int* q = new int[n]; // array
    // ...
    delete p;
    delete[] q;
}

The delete[] operator may be applied only to a pointer to an array returned by new of an array or to the null pointer (§7.2.2). Applying delete[] to the null pointer has no effect.

However, do not use new to create local objects. For example:

void f1()
{
    X* p = new X;
    // ... use *p ...
    delete p;
}

That’s verbose, inefficient, and error-prone (§13.3). In particular, a return or an exception thrown before the delete will cause a memory leak (unless even more code is added). Instead, use a local variable:

void f2()
{
    X x;
    // ... use x ...
}

The local variable x is implicitly destroyed upon exit from f2.

11.2.3. Getting Memory Space

The free-store operators new, delete, new[], and delete[] are implemented using functions presented in the <new> header:

void* operator new(size_t); // allocate space for individual object
void operator delete(void* p); // if (p) deallocate space allocated using operator new()
void* operator new[](size_t); // allocate space for array
void operator delete[](void* p); // if (p) deallocate space allocated using operator new[]()

When operator new needs to allocate space for an object, it calls operator new() to allocate a suitable number of bytes. Similarly, when operator new needs to allocate space for an array, it calls operator new[]().

The standard implementations of operator new() and operator new[]() do not initialize the memory returned.

The allocation and deallocation functions deal in untyped and uninitialized memory (often called “raw memory”), as opposed to typed objects. Consequently, they take arguments or return values of type void*. The operators new and delete handle the mapping between this untyped-memory layer and the typed-object layer.

What happens when new can find no store to allocate? By default, the allocator throws a standard-library bad_alloc exception (for an alternative, see §11.2.4.1). For example:

void f()
{
    vector<char*> v;
    try {
        for (;;) {
            char* p = new char[10000]; // acquire some memory
            v.push_back(p); // make sure the new memory is referenced
            p[0] = 'x'; // use the new memory
        }
    }
    catch(bad_alloc) {
        cerr << "Memory exhausted!\n";
    }
}

However much memory we have available, this will eventually invoke the bad_alloc handler. Please be careful: the new operator is not guaranteed to throw when you run out of physical main memory. So, on a system with virtual memory, this program can consume a lot of disk space and take a long time doing so before the exception is thrown.

We can specify what new should do upon memory exhaustion; see §30.4.1.3.

In addition to the functions defined in <new>, a user can define operator new(), etc., for a specific class (§19.2.5). Class members operator new(), etc., are found and used in preference to the ones from <new> according to the usual scope rules.
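A class-specific operator new() can be sketched as follows. Counted, its static members, and member_new_demo() are hypothetical names; the member functions merely count and then delegate to the global functions from <new>.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

struct Counted {    // hypothetical class with its own allocation functions
    int data[4];

    static int live;                // allocations not yet released
    static std::size_t last_size;   // size passed to the most recent operator new()

    static void* operator new(std::size_t sz)
    {
        ++live;
        last_size = sz;
        return ::operator new(sz);  // delegate to the global function
    }
    static void operator delete(void* p)
    {
        --live;
        ::operator delete(p);
    }
};

int Counted::live = 0;
std::size_t Counted::last_size = 0;

void member_new_demo()
{
    Counted* p = new Counted;       // calls Counted::operator new(sizeof(Counted))
    assert(Counted::live==1);
    assert(Counted::last_size==sizeof(Counted));
    delete p;                       // calls Counted::operator delete(p)
    assert(Counted::live==0);

    Counted* q = ::new Counted;     // :: bypasses the member and uses the global function
    assert(Counted::live==0);
    ::delete q;
}
```

Note that the size argument is supplied implicitly by the new operator; the user never writes it.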

11.2.4. Overloading new

By default, operator new creates its object on the free store. What if we wanted the object allocated elsewhere? Consider a simple class:

class X {
public:
    X(int);
    // ...
};

We can place objects anywhere by providing an allocator function (§11.2.3) with extra arguments and then supplying such extra arguments when using new:

void* operator new(size_t, void* p) { return p; } // explicit placement operator

void* buf = reinterpret_cast<void*>(0xF00F); // significant address
X* p2 = new(buf) X; // construct an X at buf;
                    // invokes: operator new(sizeof(X),buf)

Because of this usage, the new(buf) X syntax for supplying extra arguments to operator new() is known as the placement syntax. Note that every operator new() takes a size as its first argument and that the size of the object allocated is implicitly supplied (§19.2.5). The operator new() used by the new operator is chosen by the usual argument matching rules (§12.3); every operator new() has a size_t as its first argument.

The “placement” operator new() is the simplest such allocator. It is defined in the standard header <new>:

void* operator new (size_t sz, void* p) noexcept; // place object of size sz at p
void* operator new[](size_t sz, void* p) noexcept; // place object of size sz at p

void operator delete (void* p, void*) noexcept; // if (p) make *p invalid
void operator delete[](void* p, void*) noexcept; // if (p) make *p invalid

The “placement delete” operators do nothing except possibly inform a garbage collector that the deleted pointer is no longer safely derived (§34.5).

The placement new construct can also be used to allocate memory from a specific arena:

class Arena {
public:
    virtual void* alloc(size_t) =0;
    virtual void free(void*) =0;
    // ...
};

void* operator new(size_t sz, Arena* a)
{
    return a->alloc(sz);
}

Now objects of arbitrary types can be allocated from different Arenas as needed. For example:

extern Arena* Persistent;
extern Arena* Shared;

void g(int i)
{
    X* p = new(Persistent) X(i); // X in persistent storage
    X* q = new(Shared) X(i); // X in shared memory
    // ...
}

Placing an object in an area that is not (directly) controlled by the standard free-store manager implies that some care is required when destroying the object. The basic mechanism for that is an explicit call of a destructor:

void destroy(X* p, Arena* a)
{
    p->~X(); // call destructor
    a->free(p); // free memory
}

Note that explicit calls of destructors should be avoided except in the implementation of resource management classes. Even most resource handles can be written using new and delete. However, it would be hard to implement an efficient general container along the lines of the standard-library vector (§4.4.1, §31.3.3) without using explicit destructor calls. A novice should think thrice before calling a destructor explicitly and also should ask a more experienced colleague before doing so.
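To make the Arena example concrete, here is one possible complete version. Everything beyond the Arena interface, the placement operator new(), and destroy() is an assumption made for illustration: Bump_arena is a hypothetical arena that hands out slices of a fixed buffer, the virtual destructor and the matching placement operator delete() are additions, and X is given a counter so that its lifetime can be observed.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

class Arena {
public:
    virtual void* alloc(std::size_t) =0;
    virtual void free(void*) =0;
    virtual ~Arena() { }            // added so that derived arenas are destroyed properly
};

// Hypothetical arena: hands out slices of a fixed buffer and never reuses them.
class Bump_arena : public Arena {
public:
    void* alloc(std::size_t n) override
    {
        const std::size_t a = alignof(std::max_align_t);
        n = (n+a-1)/a*a;            // round up so that every slice is suitably aligned
        if (sizeof(buf)-used < n) throw std::bad_alloc{};
        void* p = buf+used;
        used += n;
        return p;
    }
    void free(void*) override { }   // the memory goes away with the arena
    std::size_t in_use() const { return used; }
private:
    alignas(std::max_align_t) unsigned char buf[1024];
    std::size_t used = 0;
};

void* operator new(std::size_t sz, Arena* a)
{
    return a->alloc(sz);
}

// Matching placement delete; invoked implicitly only if a constructor throws.
void operator delete(void* p, Arena* a)
{
    a->free(p);
}

struct X {
    static int live;
    int v;
    X(int i) :v{i} { ++live; }
    ~X() { --live; }
};
int X::live = 0;

void destroy(X* p, Arena* a)
{
    p->~X();        // call destructor
    a->free(p);     // free memory
}

void arena_demo()
{
    Bump_arena arena;
    X* p = new(&arena) X{42};       // invokes operator new(sizeof(X),&arena)
    assert(p->v==42 && X::live==1);
    assert(arena.in_use()>=sizeof(X));
    destroy(p,&arena);
    assert(X::live==0);
}
```

Writing delete p here would be a serious error: the memory did not come from the standard free-store manager.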

See §13.6.1 for an example of how placement new can interact with exception handling.

There is no special syntax for placement of arrays. Nor need there be, since arbitrary types can be allocated by placement new. However, an operator delete() can be defined for arrays (§11.2.3).

11.2.4.1. nothrow new

In programs where exceptions must be avoided (§13.1.5), we can use nothrow versions of new and delete. For example:

void f(int n)
{
    int* p = new(nothrow) int[n]; // allocate n ints on the free store
    if (p==nullptr) { // no memory available
        // ... handle allocation error ...
    }
    // ...
    operator delete[](p,nothrow); // deallocate *p
}

That nothrow is the name of an object of the standard-library type nothrow_t that is used for disambiguation; nothrow and nothrow_t are declared in <new>.

The functions implementing this are found in <new>:

void* operator new(size_t sz, const nothrow_t&) noexcept; // allocate sz bytes;
// return nullptr if allocation failed
void operator delete(void* p, const nothrow_t&) noexcept; // deallocate space allocated by new

void* operator new[](size_t sz, const nothrow_t&) noexcept; // allocate sz bytes;
// return nullptr if allocation failed
void operator delete[](void* p, const nothrow_t&) noexcept; // deallocate space allocated by new

These operator new functions return nullptr, rather than throwing bad_alloc, if there is not sufficient memory to allocate.
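In use, the nothrow form reduces to a test against nullptr. This is a minimal sketch; can_allocate() and nothrow_demo() are hypothetical names, and the example deliberately does not try to provoke memory exhaustion, because what happens for very large requests depends on the system.

```cpp
#include <cassert>
#include <cstddef>
#include <new>

// Returns true if space for n ints could be obtained from the free store.
bool can_allocate(std::size_t n)
{
    int* p = new(std::nothrow) int[n];  // yields nullptr rather than throwing bad_alloc
    if (p==nullptr)
        return false;                   // handle the allocation error
    delete[] p;                         // memory from new(nothrow) is released by plain delete[]
    return true;
}

void nothrow_demo()
{
    assert(can_allocate(100));          // a small request is assumed to succeed
}
```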

11.3. Lists

In addition to their use for initializing named variables (§6.3.5.2), {}-lists can be used as expressions in many (but not all) places. They can appear in two forms:

[1] Qualified by a type, T{...}, meaning “create an object of type T initialized by T{...}”; §11.3.2

[2] Unqualified {...}, for which the type must be determined from the context of use; §11.3.3

For example:

struct S { int a, b; };
struct SS { double a, b; };

void f(S); // f() takes an S

void g(S);
void g(SS); // g() is overloaded

void h()
{
    f({1,2}); // OK: call f(S{1,2})

    g({1,2}); // error: ambiguous
    g(S{1,2}); // OK: call g(S)
    g(SS{1,2}); // OK: call g(SS)
}

As in their use for initializing named variables (§6.3.5), lists can have zero, one, or more elements. A {}-list is used to construct an object of some type, so the number of elements and their types must be what is required to construct an object of that type.

11.3.1. Implementation Model

The implementation model for {}-lists comes in three parts:

• If the {}-list is used as constructor arguments, the implementation is just as if you had used a ()-list. List elements are not copied except as by-value constructor arguments.

• If the {}-list is used to initialize the elements of an aggregate (an array or a class without a constructor), each list element initializes an element of the aggregate. List elements are not copied except as by-value arguments to aggregate element constructors.

• If the {}-list is used to construct an initializer_list object each list element is used to initialize an element of the underlying array of the initializer_list. Elements are typically copied from the initializer_list to wherever we use them.

Note that this is the general model that we can use to understand the semantics of a {}-list; a compiler may apply clever optimizations as long as the meaning is preserved.

Consider:

vector<double> v = {1, 2, 3.14};

The standard-library vector has an initializer-list constructor (§17.3.4), so the initializer list {1,2,3.14} is interpreted as a temporary constructed and used like this:

const double temp[] = {double{1}, double{2}, 3.14 } ;
const initializer_list<double> tmp(temp,sizeof(temp)/sizeof(double));
vector<double> v(tmp);

That is, the compiler constructs an array containing the initializers converted to the desired type (here, double). This array is passed to vector’s initializer-list constructor as an initializer_list. The initializer-list constructor then copies the values from the array into its own data structure for elements. Note that an initializer_list is a small object (probably two words), so passing it by value makes sense.
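The model can be exercised with a function that takes an initializer_list by value. This is an illustrative sketch; sum() and model_demo() are hypothetical names.

```cpp
#include <cassert>
#include <initializer_list>
#include <type_traits>
#include <vector>

double sum(std::initializer_list<double> lst)   // by value: only the small handle is copied
{
    double s = 0;
    for (const double* p = lst.begin(); p!=lst.end(); ++p)
        s += *p;
    return s;
}

// The iterator type is a pointer to const: elements can be read but not modified.
static_assert(std::is_same<std::initializer_list<double>::iterator,
                           const double*>::value,
              "elements of an initializer_list are const");

void model_demo()
{
    assert(sum({1,2,3.5})==6.5);        // the ints are converted to double

    std::vector<double> v = {1,2,3.14}; // initializer-list constructor copies three elements
    assert(v.size()==3 && v[2]==3.14);

    std::vector<double> w(2,3.14);      // ()-list: ordinary constructor, two elements
    assert(w.size()==2 && w[0]==3.14);
}
```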

The underlying array is immutable, so there is no way (within the standard’s rules) that the meaning of a {}-list can change between two uses. Consider:

void f()
{
    initializer_list<int> lst {1,2,3};
    cout << *lst.begin() << '\n';
    *lst.begin() = 2; // error: lst is immutable
    cout << *lst.begin() << '\n';
}

In particular, having a {}-list be immutable implies that a container taking elements from it must use a copy operation, rather than a move operation.

The lifetime of a {}-list (and its underlying array) is determined by the scope in which it is used (§6.4.2). When used to initialize a variable of type initializer_list<T>, the list lives as long as the variable. When used in an expression (including as an initializer to a variable of some other type, such as vector<T>), the list is destroyed at the end of its full expression.

11.3.2. Qualified Lists

The basic idea of initializer lists as expressions is that if you can initialize a variable x using the notation

T x {v};

then you can create an object with the same value as an expression using T{v} or new T{v}. Using new places the object on the free store and returns a pointer to it, whereas “plain T{v}” makes a temporary object in the local scope (§6.4.2). For example:

struct S { int a, b; };

void f()
{

S v {7,8}; // direct initialization of a variable
v = S{7,8}; // assign using qualified list
S* p = new S{7,8}; // construct on free store using qualified list
}

The rules constructing an object using a qualified list are those of direct initialization (§16.2.6).

One way of looking at a qualified initializer list with one element is as a conversion from one type to another. For example:

template<class T>
T square(T x)
{
return x*x;
}

void f(int i)
{
double d = square(double{i});
complex<double> z = square(complex<double>{i});
}

That idea is explored further in §11.5.1.

11.3.3. Unqualified Lists

An unqualified list is used where an expected type is unambiguously known. It can be used as an expression only as:

• A function argument

• A return value

• The right-hand operand of an assignment operator (=, +=, *=, etc.)

• A subscript

For example:

int f(double d, Matrix& m)
{
    int v {7}; // initializer (direct initialization)
    int v2 = {7}; // initializer (copy initialization)
    int v3 = m[{2,3}]; // assume m takes value pairs as subscripts

    v = {8}; // right-hand operand of assignment
    v += {88}; // right-hand operand of assignment
    {v} = 9; // error: not left-hand operand of assignment
    v = 7+{10}; // error: not an operand of a non-assignment operator
    f({10.0}); // function argument
    return {11}; // return value
}

The reason that an unqualified list is not allowed on the left-hand side of assignments is primarily that the C++ grammar allows { in that position for compound statements (blocks), so that readability would be a problem for humans and ambiguity resolution would be tricky for compilers. This is not an insurmountable problem, but it was decided not to extend C++ in that direction.

When used as the initializer for a named object without the use of a = (as for v above), an unqualified {}-list performs direct initialization (§16.2.6). In all other cases, it performs copy initialization (§16.2.6). In particular, the otherwise redundant = in an initializer restricts the set of initializations that can be performed with a given {}-list.
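The effect of the extra = can be made concrete with a small sketch. The type Meters and its explicit constructor are hypothetical, introduced only for illustration; the standard type traits are used here to ask the compiler which forms of initialization it accepts.

```cpp
#include <type_traits>

// Hypothetical type (not from the text) with an explicit constructor.
struct Meters {
    explicit Meters(int v) : value{v} {}
    int value;
};

// Direct initialization, as in "Meters m {5};", may use the explicit constructor.
static_assert(std::is_constructible<Meters,int>::value,
              "direct initialization is allowed");

// is_convertible models copy initialization from an int, as in "Meters m = 5;".
// An explicit constructor is not considered there.
static_assert(!std::is_convertible<int,Meters>::value,
              "copy initialization is rejected");

int direct_init_value()
{
    Meters m {5};        // OK: direct initialization
    // Meters m2 = {5};  // error: copy initialization cannot use an explicit constructor
    return m.value;
}
```

Removing explicit from the constructor would make both forms acceptable, which is why the = only matters for types that distinguish the two.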

The standard-library type initializer_list<T> is used to handle variable-length {}-lists (§12.2.3). Its most obvious use is to allow initializer lists for user-defined containers (§3.2.1.3), but it can also be used directly; for example:

int high_value(initializer_list<int> val)
{
int high = numeric_limits<int>::lowest();
if (val.size()==0) return high;

for (auto x : val)
if (x>high) high = x;

return high;
}

int v1 = high_value({1,2,3,4,5,6,7});
int v2 = high_value({-1,2,v1,4,-9,20,v1});

A {}-list is the simplest way of dealing with homogeneous lists of varying lengths. However, beware that zero elements can be a special case. If so, that case should be handled by a default constructor (§17.3.3).
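As a minimal sketch of that advice, the hypothetical container below (Int_bag is a made-up name, not a library type) gives the zero-element case to a default constructor and everything else to an initializer_list constructor.

```cpp
#include <cstddef>
#include <initializer_list>
#include <vector>

// Hypothetical container, kept minimal for illustration.
class Int_bag {
public:
    Int_bag() {}                                            // handles the zero-element case
    Int_bag(std::initializer_list<int> lst) : elem(lst) {}  // handles {a,b,c,...}

    std::size_t size() const { return elem.size(); }

    int sum() const
    {
        int s = 0;
        for (int x : elem) s += x;
        return s;
    }
private:
    std::vector<int> elem;
};
```

With these two constructors, Int_bag{} uses the default constructor and Int_bag{1,2,3} uses the initializer-list constructor.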

The type of a {}-list can be deduced (only) if all elements are of the same type. For example:

auto x0 = {}; // error (no element type)
auto x1 = {1}; // initializer_list<int>
auto x2 = {1,2}; // initializer_list<int>
auto x3 = {1,2,3}; // initializer_list<int>
auto x4 = {1,2.0}; // error: nonhomogeneous list

Unfortunately, we do not deduce the type of an unqualified list for a plain template argument. For example:

template<typename T>
void f(T);


f({}); // error: type of initializer is unknown
f({1}); // error: an unqualified list does not match "plain T"
f({1,2}); // error: an unqualified list does not match "plain T"
f({1,2,3}); // error: an unqualified list does not match "plain T"

I say “unfortunately” because this is a language restriction, rather than a fundamental rule. It would be technically possible to deduce the type of those {}-lists as initializer_list<int>, just like we do for auto initializers.

Similarly, we do not deduce the element type of a container represented as a template. For example:

template<class T>
void f2(const vector<T>&);


f2({1,2,3}); // error: cannot deduce T
f2({"Kona","Sidney"}); // error: cannot deduce T

This too is unfortunate, but it is a bit more understandable from a language-technical point of view: nowhere in those calls does it say vector. To deduce T the compiler would first have to decide that the user really wanted a vector and then look into the definition of vector to see if it has a constructor that accepts {1,2,3}. In general, that would require an instantiation of vector (§26.2). It would be possible to handle that, but it could be costly in compile time, and the opportunities for ambiguities and confusion if there were many overloaded versions of f2() are reasons for caution. To call f2(), be more specific:

f2(vector<int>{1,2,3}); // OK
f2(vector<string>{"Kona","Sidney"}); // OK
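By contrast, deduction does succeed when the parameter itself is declared as initializer_list<T>, because the element type can then be read directly off the list. The sketch below assumes two made-up helper names, count_elems and count_vec, purely for illustration.

```cpp
#include <cstddef>
#include <initializer_list>
#include <string>
#include <vector>

// Hypothetical helper: T is deduced from the elements of the {}-list.
template<class T>
std::size_t count_elems(std::initializer_list<T> lst)
{
    return lst.size();
}

// Like f2() in the text: the caller must say which vector is wanted.
template<class T>
std::size_t count_vec(const std::vector<T>& v)
{
    return v.size();
}

std::size_t deduce_demo()
{
    std::size_t n = count_elems({1,2,3});                        // OK: T deduced as int
    n += count_vec(std::vector<std::string>{"Kona","Sidney"});   // OK: type stated explicitly
    return n;
}
```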

11.4. Lambda Expressions

A lambda expression, sometimes also referred to as a lambda function or (strictly speaking incorrectly, but colloquially) as a lambda, is a simplified notation for defining and using an anonymous function object. Instead of defining a named class with an operator(), later making an object of that class, and finally invoking it, we can use a shorthand. This is particularly useful when we want to pass an operation as an argument to an algorithm. In the context of graphical user interfaces (and elsewhere), such operations are often referred to as callbacks. This section focuses on technical aspects of lambdas; examples and techniques for the use of lambdas can be found elsewhere (§3.4.3, §32.4, §33.5.2).

A lambda expression consists of a sequence of parts:

• A possibly empty capture list, specifying what names from the definition environment can be used in the lambda expression’s body, and whether those are copied or accessed by reference. The capture list is delimited by [] (§11.4.3).

• An optional parameter list, specifying what arguments the lambda expression requires. The parameter list is delimited by () (§11.4.4).

• An optional mutable specifier, indicating that the lambda expression’s body may modify the state of the lambda (i.e., change the lambda’s copies of variables captured by value) (§11.4.3.4).

• An optional noexcept specifier.

• An optional return type declaration of the form -> type (§11.4.4).

• A body, specifying the code to be executed. The body is delimited by {} (§11.4.3).

The details of passing arguments, returning results, and specifying the body are those of functions and are presented in Chapter 12. The notion of “capture” of local variables is not provided for functions. This implies that a lambda can act as a local function even though a function cannot.

11.4.1. Implementation Model

Lambda expressions can be implemented in a variety of ways, and there are some rather effective ways of optimizing them. However, I find it useful to understand the semantics of a lambda by considering it a shorthand for defining and using a function object. Consider a relatively simple example:

void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
for_each(begin(v),end(v),

[&os,m](int x) { if (x%m==0) os << x << '\n'; }
);
}

To see what this means, we can define the equivalent function object:

class Modulo_print {
ostream& os; // members to hold the capture list
int m;
public:

Modulo_print(ostream& s, int mm) :os(s), m(mm) {} // capture
void operator()(int x) const
{ if (x%m==0) os << x << '\n'; }
};

The capture list, [&os,m], becomes two member variables and a constructor to initialize them. The & before os means that we should store a reference, and the absence of a & for m means that we should store a copy. This use of & mirrors its use in function argument declarations.

The body of the lambda simply becomes the body of the operator()(). Since the lambda doesn’t return a value, the operator()() is void. By default, operator()() is const, so that the lambda body doesn’t modify the captured variables. That’s by far the most common case. Should you want to modify the state of a lambda from its body, the lambda can be declared mutable (§11.4.3.4). This corresponds to an operator()() not being declared const.

An object of a class generated from a lambda is called a closure object (or simply a closure). We can now write the original function like this:

void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
for_each(begin(v),end(v),Modulo_print{os,m});
}

If a lambda potentially captures every local variable by reference (using the capture list [&]), the closure may be optimized to simply contain a pointer to the enclosing stack frame.
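The claimed equivalence between the lambda and the hand-written function object can be checked by running both on the same input. In the sketch below, the helper names with_lambda and with_function_object are hypothetical, and an ostringstream stands in for the output stream so that the results can be compared.

```cpp
#include <algorithm>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

class Modulo_print {                 // the function object from the text
    std::ostream& os;                // members to hold the capture list
    int m;
public:
    Modulo_print(std::ostream& s, int mm) : os(s), m(mm) {}
    void operator()(int x) const { if (x%m==0) os << x << '\n'; }
};

// Hypothetical helper: returns what the lambda version prints.
std::string with_lambda(const std::vector<int>& v, int m)
{
    std::ostringstream os;
    std::for_each(v.begin(), v.end(),
        [&os,m](int x) { if (x%m==0) os << x << '\n'; });
    return os.str();
}

// Hypothetical helper: returns what the function-object version prints.
std::string with_function_object(const std::vector<int>& v, int m)
{
    std::ostringstream os;
    std::for_each(v.begin(), v.end(), Modulo_print{os,m});
    return os.str();
}
```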

11.4.2. Alternatives to Lambdas

That final version of print_modulo() is actually quite attractive, and naming nontrivial operations is generally a good idea. A separately defined class also leaves more room for comments than does a lambda embedded in some argument list.

However, many lambdas are small and used only once. For such uses, the realistic equivalent involves a local class defined immediately before its (only) use. For example:

void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
class Modulo_print
{
ostream& os; // members to hold the capture list
int m;
public:

Modulo_print (ostream& s, int mm) :os(s), m(mm) {} // capture
void operator()(int x) const
{ if (x%m==0) os << x << '\n'; }
};


for_each(begin(v),end(v),Modulo_print{os,m});
}

Compared to that, the version using the lambda is a clear winner. If we really want a name, we can just name the lambda:

void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
auto Modulo_print = [&os,m] (int x) { if (x%m==0) os << x << '\n'; };

for_each(begin(v),end(v),Modulo_print);
}

Naming the lambda is often a good idea. Doing so forces us to consider the design of the operation a bit more carefully. It also simplifies code layout and allows for recursion (§11.4.5).

Writing a for-loop is an alternative to using a lambda with a for_each(). Consider:

void print_modulo(const vector<int>& v, ostream& os, int m)
// output v[i] to os if v[i]%m==0
{
for (auto x : v)
if (x%m==0) os << x << '\n';
}

Many would find this version much clearer than any of the lambda versions. However, for_each is a rather special algorithm, and vector<int> is a very specific container. Consider generalizing print_modulo() to handle arbitrary containers:

template<class C>
void print_modulo(const C& v, ostream& os, int m)

// output v[i] to os if v[i]%m==0
{
for (auto x : v)
if (x%m==0) os << x << '\n';
}

This version works nicely for a map. The C++ range-for-statement specifically caters to the special case of traversing a sequence from its beginning to its end. The STL containers make such traversals easy and general. For example, using a for-statement to traverse a map gives a depth-first traversal. How would we do a breadth-first traversal? The for-loop version of print_modulo() is not amenable to change, so we have to rewrite it to an algorithm. For example:

template<class C>
void print_modulo(const C& v, ostream& os, int m)

// output v[i] to os if v[i]%m==0
{
breadth_first(begin(v),end(v),

[&os,m](int x) { if (x%m==0) os << x << '\n'; }
);
}

Thus, a lambda can be used as “the body” for a generalized loop/traversal construct represented as an algorithm. Using for_each rather than breadth_first would give depth-first traversal.

The performance of a lambda as an argument to a traversal algorithm is equivalent (typically identical) to that of the equivalent loop. I have found that to be quite consistent across implementations and platforms. The implication is that we have to base our choice between “algorithm plus lambda” and “for-statement with body” on stylistic grounds and on estimates of extensibility and maintainability.

11.4.3. Capture

The main use of lambdas is for specifying code to be passed as arguments. Lambdas allow that to be done “inline” without having to name a function (or function object) and use it elsewhere. Some lambdas require no access to their local environment. Such lambdas are defined with the empty lambda introducer []. For example:

void algo(vector<int>& v)
{
sort(v.begin(),v.end()); // sort values
// ...
sort(v.begin(),v.end(),[](int x, int y) { return abs(x)<abs(y); }); // sort absolute values
// ...
}

If we want to access local names, we have to say so or get an error:

void f(vector<int>& v)
{
bool sensitive = true;
// ...
sort(v.begin(),v.end(),
[](int x, int y) { return sensitive ? x<y : abs(x)<abs(y); } // error: can't access sensitive
);
}

I used the lambda introducer []. This is the simplest lambda introducer and does not allow the lambda to refer to names in the calling environment. The first character of a lambda expression is always [. A lambda introducer can take various forms:

[]: an empty capture list. This implies that no local names from the surrounding context can be used in the lambda body. For such lambda expressions, data is obtained from arguments or from nonlocal variables.

[&]: implicitly capture by reference. All local names can be used. All local variables are accessed by reference.

[=]: implicitly capture by value. All local names can be used. All names refer to copies of the local variables taken at the point of call of the lambda expression.

[capture-list]: explicit capture; the capture-list is the list of names of local variables to be captured (i.e., stored in the object) by reference or by value. Variables with names preceded by & are captured by reference. Other variables are captured by value. A capture list can also contain this and names followed by ... as elements.

[&, capture-list]: implicitly capture by reference all local variables with names not mentioned in the list. The capture list can contain this. Listed names cannot be preceded by &. Variables named in the capture list are captured by value.

[=, capture-list]: implicitly capture by value all local variables with names not mentioned in the list. The capture list cannot contain this. The listed names must be preceded by &. Variables named in the capture list are captured by reference.

Note that a local name preceded by & is always captured by reference and a local name not preceded by & is always captured by value. Only capture by reference allows modification of variables in the calling environment.
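The difference between the two kinds of capture shows up when the local variable changes after the lambda has been created. The following is a minimal sketch; the function name capture_demo is made up for illustration.

```cpp
#include <utility>

// Returns {value seen through the by-value capture,
//          value seen through the by-reference capture}.
std::pair<int,int> capture_demo()
{
    int n = 1;
    auto by_value = [n] { return n; };    // the closure stores a copy of n
    auto by_ref   = [&n] { return n; };   // the closure refers to n itself

    n = 2;                                // change n after both lambdas were created

    return {by_value(), by_ref()};
}
```

The copy held by by_value was made when the lambda expression was evaluated, so it still holds 1, whereas by_ref sees the updated value 2.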

The capture-list cases are used for fine-grained control over what names from the call environment are used and how. For example:

void f(vector<int>& v)
{
bool sensitive = true;
// ...
sort(v.begin(),v.end(),
[sensitive](int x, int y) { return sensitive ? x<y : abs(x)<abs(y); }
);
}

By mentioning sensitive in the capture list, we make it accessible from within the lambda. By not specifying otherwise, we ensure that the capture of sensitive is done “by value”; just as for argument passing, passing a copy is the default. Had we wanted to capture sensitive “by reference,” we could have said so by adding a & before sensitive in the capture list: [&sensitive].

The choice between capturing by value and by reference is basically the same as the choice for function arguments (§12.2). We use a reference if we need to write to the captured object or if it is large. However, for lambdas, there is the added concern that a lambda might outlive its caller (§11.4.3.1). When passing a lambda to another thread, capturing by value ([=]) is typically best: accessing another thread’s stack through a reference or a pointer can be most disruptive (to performance or correctness), and trying to access the stack of a terminated thread can lead to extremely difficult-to-find errors.

If you need to capture a variadic template (§28.6) argument, use .... For example:

template<typename ... Var>
void algo(int s, Var... v)
{
auto helper = [&s,&v...] { return s*(h1(v...)+h2(v...)); };
// ...
}

Beware that it is easy to get too clever about capture. Often, there is a choice between capture and argument passing. When that’s the case, capture is usually the least typing but has the greatest potential for confusion.

11.4.3.1. Lambda and Lifetime

A lambda might outlive its caller. This can happen if we pass a lambda to a different thread or if the callee stores away the lambda for later use. For example:

void setup(Menu& m)
{
// ...
Point p1, p2, p3;
// compute positions of p1, p2, and p3
m.add("draw triangle",[&]{ m.draw(p1,p2,p3); }); // probable disaster
// ...
}

Assuming that add() is an operation that adds a (name,action) pair to a menu and that the draw() operation makes sense, we are left with a time bomb: the setup() completes and later – maybe minutes later – a user presses the draw triangle button and the lambda tries to access the long-gone local variables. A lambda that wrote to a variable caught by reference would be even worse in that situation.

If a lambda might outlive its caller, we must make sure that all local information (if any) is copied into the closure object and that values are returned through the return mechanism (§12.1.4) or through suitable arguments. For the setup() example, that is easily done:

m.add("draw triangle",[=]{m.draw(p1,p2,p3);});

Think of the capture list as the initializer list for the closure object and [=] and [&] as short-hand notation (§11.4.1).
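The same rule applies whenever a closure is returned from the function that created it. A minimal sketch, assuming a hypothetical factory function make_adder:

```cpp
#include <functional>

// Hypothetical factory: the returned closure outlives the call of make_adder().
std::function<int(int)> make_adder(int delta)
{
    // [=] copies delta into the closure, so nothing refers to the finished call.
    return [=](int x) { return x + delta; };

    // With [&], the closure would refer to delta after make_adder() had returned,
    // and calling it later would be undefined behavior.
}
```

A caller can keep the result, for example auto add5 = make_adder(5);, and call it long after make_adder() has returned.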

11.4.3.2. Namespace Names

We don’t need to “capture” namespace variables (including global variables) because they are always accessible (provided they are in scope). For example:

template<typename U, typename V>
ostream& operator<<(ostream& os, const pair<U,V>& p)
{
return os << '{' << p.first << ',' << p.second << '}';
}


void print_all(const map<string,int>& m, const string& label)
{
cout << label << ":\n{\n";
for_each(m.begin(),m.end(),

[](const pair<string,int>& p) { cout << p << '\n'; }
);
cout << "}\n";
}

Here, we don’t need to capture cout or the output operator for pair.

11.4.3.3. Lambda and this

How do we access members of a class object from a lambda used in a member function? We can include class members in the set of names potentially captured by adding this to the capture list. This is used when we want to use a lambda in the implementation of a member function. For example, we might have a class for building up requests and retrieving results:

class Request {
function<map<string,string>(const map<string,string>&)> oper; // operation
map<string,string> values; // arguments
map<string,string> results; // targets
public:
Request(const string& s); // parse and store request
void execute()
{

[this]() { results=oper(values); }(); // do oper to values yielding results
}
};

Members are always captured by reference. That is, [this] implies that members are accessed through this rather than copied into the lambda. Unfortunately, [this] and [=] are incompatible. This implies that incautious use can lead to race conditions in multi-threaded programs (§42.4.6).
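That members are reached through this, rather than copied, can be observed directly. The class Counter below is hypothetical and exists only to illustrate the point.

```cpp
#include <functional>

// Hypothetical class, for illustration only.
class Counter {
public:
    // The closure stores the this pointer, not a copy of count.
    std::function<int()> reader() { return [this] { return count; }; }
    void increment() { ++count; }
private:
    int count = 0;
};

int this_demo()
{
    Counter c;
    auto read = c.reader();   // closure created while count is 0
    c.increment();
    c.increment();
    return read();            // reads the member through this: sees the current value
}
```

The flip side is the lifetime concern of §11.4.3.1: the closure returned by reader() must not be called after the Counter it refers to has been destroyed.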

11.4.3.4. mutable Lambdas

Usually, we don’t want to modify the state of the function object (the closure), so by default we can’t. That is, the operator()() for the generated function object (§11.4.1) is a const member function. In the unlikely event that we want to modify the state (as opposed to modifying the state of some variable captured by reference; §11.4.3), we can declare the lambda mutable. For example:

void algo(vector<int>& v)
{
int count = v.size();
std::generate(v.begin(),v.end(),

[count]()mutable{ return --count; }
);
}

The --count decrements the copy of v’s size stored in the closure.
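That only the closure’s own copy changes can be seen in a smaller sketch (the function name mutable_demo is made up for illustration):

```cpp
#include <utility>

// Returns {last value produced by the lambda, value of the local variable afterward}.
std::pair<int,int> mutable_demo()
{
    int count = 3;
    auto next = [count]() mutable { return --count; };  // modifies the closure's own copy

    next();                // closure's copy becomes 2
    int last = next();     // closure's copy becomes 1

    return {last, count};  // the local variable count is still 3
}
```

Without mutable, the --count in the body would be rejected by the compiler because the generated operator()() is const.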

11.4.4. Call and Return

The rules for passing arguments to a lambda are the same as for a function (§12.2), and so are the rules for returning results (§12.1.4). In fact, with the exception of the rules for capture (§11.4.3) most rules for lambdas are borrowed from the rules for functions and classes. However, two irregularities should be noted:

[1] If a lambda expression does not take any arguments, the argument list can be omitted. Thus, the minimal lambda expression is []{}.

[2] A lambda expression’s return type can be deduced from its body. Unfortunately, that is not also done for a function.

If a lambda body does not have a return-statement, the lambda’s return type is void. If a lambda body consists of just a single return-statement, the lambda’s return type is the type of the return’s expression. If neither is the case, we have to explicitly supply a return type. For example:

void g(double y)
{

[&]{ f(y); }; // return type is void
auto z1 = [=](int x){ return x+y; }; // return type is double
auto z2 = [y]{ if (y) return 1; else return 2; }; // error: body too complicated
// for return type deduction
auto z3 = [y]() { return y ? 1 : 2; }; // return type is int
auto z4 = [y]()->int { if (y) return 1; else return 2; }; // OK: explicit return type
}

When the suffix return type notation is used, we cannot omit the argument list.
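The deduced types can be confirmed with decltype and static_assert. The following sketch follows the C++11 rules described above; the function name return_type_demo and the value chosen for y are assumptions made for illustration.

```cpp
#include <type_traits>

bool return_type_demo()
{
    double y = 0.5;

    auto z1 = [=](int x) { return x + y; };   // single return-statement: deduced as double
    auto z3 = [y]() { return y ? 1 : 2; };    // single return-statement: deduced as int
    auto z4 = [y]() -> int {                  // two return-statements: type stated explicitly
        if (y) return 1; else return 2;
    };

    static_assert(std::is_same<decltype(z1(1)),double>::value, "z1 returns double");
    static_assert(std::is_same<decltype(z3()),int>::value, "z3 returns int");
    static_assert(std::is_same<decltype(z4()),int>::value, "z4 returns int");

    // y is nonzero, so both conditional forms select 1.
    return z1(1) == 1.5 && z3() == 1 && z4() == 1;
}
```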

11.4.5. The Type of a Lambda

To allow for optimized versions of lambda expressions, the type of a lambda expression is not defined. However, it is defined to be the type of a function object in the style presented in §11.4.1. This type, called the closure type, is unique to the lambda, so no two lambdas have the same type. Had two lambdas had the same type, the template instantiation mechanism might have gotten confused. A lambda is of a local class type with a constructor and a const member function operator()(). In addition to using a lambda as an argument, we can use it to initialize a variable declared auto or std::function<R(AL)> where R is the lambda’s return type and AL is its argument list of types (§33.5.3).

For example, I might try to write a lambda to reverse the characters in a C-style string:

auto rev = [&rev](char* b, char* e)
{ if (1<e-b) { swap(*b,*--e); rev(++b,e); } }; // error

However, that’s not possible because I cannot use an auto variable before its type has been deduced. Instead, I can introduce a name and then use it:

void f(string& s1, string& s2)
{
function<void(char* b, char* e)> rev =
[&](char* b, char* e) { if (1<e-b) { swap(*b,*--e); rev(++b,e); } };

rev(&s1[0],&s1[0]+s1.size());
rev(&s2[0],&s2[0]+s2.size());
}

Now, the type of rev is specified before it is used.

If we just want to name a lambda, rather than using it recursively, auto can simplify things:

void g(string& s1, string& s2)
{
auto rev = [&](char* b, char* e) { while (1<e-b) swap(*b++,*--e); };

rev(&s1[0],&s1[0]+s1.size());
rev(&s2[0],&s2[0]+s2.size());
}

A lambda that captures nothing can be assigned to a pointer to function of an appropriate type. For example:

double (*p1)(double) = [](double a) { return sqrt(a); };
double (*p2)(double) = [&](double a) { return sqrt(a); }; // error: the lambda captures
double (*p3)(int) = [](int a) { return sqrt(a); }; // error: argument types do not match
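One practical use of this conversion is passing a lambda to a C-style interface that expects a plain pointer to function. The sketch below uses the C library’s qsort(); the wrapper name sort_descending is hypothetical.

```cpp
#include <cstdlib>

// Sorts a[0..n) in descending order using qsort(), which expects
// a pointer to function as its comparison argument.
void sort_descending(int* a, std::size_t n)
{
    std::qsort(a, n, sizeof(int),
        // Captures nothing, so it converts to int(*)(const void*, const void*).
        [](const void* x, const void* y) -> int {
            int lhs = *static_cast<const int*>(x);
            int rhs = *static_cast<const int*>(y);
            if (lhs < rhs) return 1;     // larger values sort first
            if (rhs < lhs) return -1;
            return 0;
        });
}
```

Had the lambda captured anything, the call would not compile, because a closure with state cannot be represented by a plain pointer to function.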

11.5. Explicit Type Conversion

Sometimes, we have to convert a value of one type into a value of another. Many (arguably too many) such conversions are done implicitly according to the language rules (§2.2.2, §10.5). For example:

double d = 1234567890; // integer to floating-point
int i = d; // floating-point to integer

In other cases, we have to be explicit.

For logical and historical reasons, C++ offers explicit type conversion operations of varying convenience and safety:

• Construction, using the {} notation, providing type-safe construction of new values (§11.5.1)

• Named conversions, providing conversions of various degrees of nastiness:

const_cast for getting write access to something declared const (§7.5)

static_cast for reversing a well-defined implicit conversion (§11.5.2)

reinterpret_cast for changing the meaning of bit patterns (§11.5.2)

dynamic_cast for dynamically checked class hierarchy navigation (§22.2.1)

• C-style casts, providing any of the named conversions and some combinations of those (§11.5.3)

• Functional notation, providing a different notation for C-style casts (§11.5.4)

I have ordered these conversions in my order of preference and safety of use.

Except for the {} construction notation, I can’t say I like any of those, but at least dynamic_cast is run-time checked. For conversion between two scalar numeric types, I tend to use a homemade explicit conversion function, narrow_cast, where a value might be narrowed:

template<class Target, class Source>
Target narrow_cast(Source v)
{
auto r = static_cast<Target>(v); // convert the value to the target type
if (static_cast<Source>(r)!=v)
throw runtime_error("narrow_cast<>() failed");
return r;
}

That is, if I can convert a value to the target type, convert the result back to the source type, and get back the original value, I’m happy with the result. That is a generalization of the rule the language applies to values in {} initialization (§6.3.5.2). For example:

void test(double d, int i, char* p)
{
auto c1 = narrow_cast<char>(64);
auto c2 = narrow_cast<char>(-64); // will throw if chars are unsigned
auto c3 = narrow_cast<char>(264); // will throw if chars are 8-bit

auto d1 = narrow_cast<double>(1/3.0F); // OK
auto f1 = narrow_cast<float>(1/3.0); // will probably throw

auto c4 = narrow_cast<char>(i); // may throw
auto f2 = narrow_cast<float>(d); // may throw

auto p1 = narrow_cast<char* >(i); // compile-time error
auto i1 = narrow_cast<int>(p); // compile-time error

auto d2 = narrow_cast<double>(i); // may throw (but probably will not)
auto i2 = narrow_cast<int>(d); // may throw
}
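To experiment with which conversions throw, narrow_cast can be wrapped in a helper that reports success or failure. In the sketch below, narrow_cast is repeated from the text so that the example is self-contained, and fits is a hypothetical name introduced for illustration.

```cpp
#include <stdexcept>

template<class Target, class Source>
Target narrow_cast(Source v)                 // as defined in the text
{
    auto r = static_cast<Target>(v);         // convert the value to the target type
    if (static_cast<Source>(r)!=v)
        throw std::runtime_error("narrow_cast<>() failed");
    return r;
}

// Hypothetical helper: reports whether v survives the round trip to Target and back.
template<class Target, class Source>
bool fits(Source v)
{
    try {
        narrow_cast<Target>(v);
        return true;
    }
    catch (const std::runtime_error&) {
        return false;
    }
}
```

For example, fits<int>(7.0) is true, whereas fits<int>(7.5) is false because the fractional part is lost.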

Depending on your use of floating-point numbers, it may be worthwhile to use a range test for floating-point conversions, rather than !=. That is easily done using specializations (§25.3.4.1) or type traits (§35.4.1).

11.5.1. Construction

The construction of a value of type T from a value e can be expressed by the notation T{e} (§iso.8.5.4). For example:

auto d1 = double{2}; // d1==2.0
double d2 {double{2}/4}; // d2==0.5

Part of the attraction of the T{v} notation is that it will perform only “well-behaved” conversions. For example:

void f(int);
void f(double);

void g(int i, double d)
{
f(i); // call f(int)
f(double{i}); // error: {} doesn't do int to floating conversion

f(d); // call f(double)
f(int{d}); // error: {} doesn't truncate
f(static_cast<int>(d)); // call f(int) with a truncated value

f(round(d)); // call f(double) with a rounded value
f(static_cast<int>(lround(d))); // call f(int) with a rounded value
// if d overflows the int, this still truncates
}

I don’t consider truncation of floating-point numbers (e.g., 7.9 to 7) “well behaved,” so having to be explicit when you want it is a good thing. If rounding is desirable, we can use the standard-library function round(); it performs “conventional 4/5 rounding,” such as 7.9 to 8 and 7.4 to 7.

It sometimes comes as a surprise that {}-construction doesn’t allow int to double conversion, but if (as is not uncommon) the size of an int is the same as the size of a double, then some such conversions must lose information. Consider:

static_assert(sizeof(int)==sizeof(double),"unexpected sizes");

int x = numeric_limits<int>::max(); // largest possible integer
double d = x;
int y = d;

We will not get x==y. However, we can still initialize a double with an integer literal that can be represented exactly. For example:

double d { 1234 }; // fine
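The same loss is easy to observe on platforms where int is smaller than double by using a 64-bit integer type instead. The sketch below assumes an IEEE 754 double with a 53-bit significand and states that assumption as a static_assert; the function name survives_double is made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Assumption for this sketch: double has a 53-bit significand (IEEE 754 binary64).
static_assert(std::numeric_limits<double>::digits == 53,
              "sketch assumes IEEE 754 double");

// Returns true if v survives conversion to double and back.
// v is assumed to have a magnitude well below 2^63, so that converting back is in range.
bool survives_double(std::int64_t v)
{
    double d = static_cast<double>(v);
    return static_cast<std::int64_t>(d) == v;
}
```

Under that assumption, 1234 and 9007199254740992 (2 to the power 53) survive, but 9007199254740993 does not, because no double lies between 2 to the power 53 and the next even integer.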

Explicit qualification with the desired type does not enable ill-behaved conversions. For example:

void g2(char* p)
{
int x = int{p}; // error: no char* to int conversion
using Pint = int*;
int* p2 = Pint{p}; // error: no char* to int* conversion
// ...
}

For T{v}, “reasonably well behaved” is defined as having a “non-narrowing” (§10.5) conversion from v to T or having an appropriate constructor for T (§17.3).

The constructor notation T{} is used to express the default value of type T. For example:

template<class T> void f(const T&);

void g3()
{
f(int{}); // default int value
f(complex<double>{}); // default complex value
// ...
}

The value of an explicit use of the constructor for a built-in type is 0 converted to that type (§6.3.5). Thus, int{} is another way of writing 0. For a user-defined type T, T{} is defined by the default constructor (§3.2.1.1, §17.6), if any, otherwise by default construction, MT{}, of each member.
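Because T{} has a meaning for built-in and user-defined types alike, it is convenient in generic code. A minimal sketch, assuming a hypothetical helper named default_value:

```cpp
#include <complex>
#include <string>

// Hypothetical generic helper: returns the default value of T.
template<class T>
T default_value()
{
    return T{};   // 0 converted to T for built-in types; the default constructor otherwise
}
```

For instance, default_value<int>() yields 0, default_value<int*>() yields the null pointer, and default_value<std::string>() yields an empty string.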

Explicitly constructed unnamed objects are temporary objects, and (unless bound to a reference) their lifetime is limited to the full expression in which they are used (§6.4.2). In this, they differ from unnamed objects created using new (§11.2).

11.5.2. Named Casts

Some type conversions are not well behaved or easy to type check; they are not simple constructions of values from a well-defined set of argument values. For example:

IO_device* d1 = reinterpret_cast<IO_device* >(0Xff00); // device at 0Xff00

There is no way a compiler can know whether the integer 0Xff00 is a valid address (of an I/O device register). Consequently, the correctness of the conversions is completely in the hands of the programmer. Explicit type conversion, often called casting, is occasionally essential. However, traditionally it is seriously overused and a major source of errors.

Another classical example of the need for explicit type conversion is dealing with “raw memory,” that is, memory that holds or will hold objects of a type not known to the compiler. For example, a memory allocator (such as operator new(); §11.2.3) may return a void* pointing to newly allocated memory:

void* my_allocator(size_t);

void f()
{
int* p = static_cast<int* >(my_allocator(100)); // new allocation used as ints
// ...
}

A compiler does not know the type of the object pointed to by the void*.

The fundamental idea behind the named casts is to make type conversion more visible and to allow the programmer to express the intent of a cast:

static_cast converts between related types such as one pointer type to another in the same class hierarchy, an integral type to an enumeration, or a floating-point type to an integral type. It also does conversions defined by constructors (§16.2.6, §18.3.3, §iso.5.2.9) and conversion operators (§18.4).

reinterpret_cast handles conversions between unrelated types such as an integer to a pointer or a pointer to an unrelated pointer type (§iso.5.2.10).

const_cast converts between types that differ only in const and volatile qualifiers (§iso.5.2.11).

dynamic_cast does run-time checked conversion of pointers and references into a class hierarchy (§22.2.1, §iso.5.2.7).

These distinctions among the named casts allow the compiler to apply some minimal type checking and make it easier for a programmer to find the more dangerous conversions represented as reinterpret_casts. Some static_casts are portable, but few reinterpret_casts are. Hardly any guarantees are made for reinterpret_cast, but generally it produces a value of a new type that has the same bit pattern as its argument. If the target has at least as many bits as the original value, we can reinterpret_cast the result back to its original type and use it. The result of a reinterpret_cast is guaranteed to be usable only if its result is converted back to the exact original type. Note that reinterpret_cast is the kind of conversion that must be used for pointers to functions (§12.5). Consider:

char x = 'a';
int* p1 = &x; // error: no implicit char* to int* conversion
int* p2 = static_cast<int* >(&x); // error: no implicit char* to int* conversion
int* p3 = reinterpret_cast<int* >(&x); // OK: on your head be it

struct B { /* ... */ };
struct D : B { /* ... */ }; // see §3.2.2 and §20.5.2

B* pb = new D; // OK: implicit conversion from D* to B*
D* pd = pb; // error: no implicit conversion from B* to D*
D* pd = static_cast<D*>(pb); // OK

Conversions among class pointers and among class reference types are discussed in §22.2.
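Two of the points above can be exercised in a small sketch: a reinterpret_cast round trip that uses only the converted-back pointer, and a static_cast from base to derived that is valid only because the object really is of the derived type. The struct members and function names are hypothetical, and the sketch assumes the implementation provides std::uintptr_t (an optional type).

```cpp
#include <cstdint>

struct B { int b = 1; };
struct D : B { int d = 2; };

// Round trip through an integer: only the converted-back pointer is used.
bool round_trip_ok(int* p)
{
    std::uintptr_t bits = reinterpret_cast<std::uintptr_t>(p);  // pointer to integer
    int* q = reinterpret_cast<int*>(bits);                      // back to the exact original type
    return q == p;
}

// Downcast that is valid only because pb is known to point to a D.
int downcast_demo()
{
    D obj;
    B* pb = &obj;                  // OK: implicit conversion from D* to B*
    D* pd = static_cast<D*>(pb);   // explicit conversion from B* to D*; not checked
    return pd->d;
}
```

If pb had pointed to a plain B, the static_cast would still compile, but using the result would be undefined; dynamic_cast (§22.2.1) is the checked alternative for polymorphic types.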

If you feel tempted to use an explicit type conversion, take the time to consider if it is really necessary. In C++, explicit type conversion is unnecessary in most cases when C needs it (§1.3.3) and also in many cases in which earlier versions of C++ needed it (§1.3.2, §44.2.3). In many programs, explicit type conversion can be completely avoided; in others, its use can be localized to a few routines.

11.5.3. C-Style Cast

From C, C++ inherited the notation (T)e, which performs any conversion that can be expressed as a combination of static_casts, reinterpret_casts, and const_casts to make a value of type T from the expression e (§44.2.3). Unfortunately, the C-style cast can also cast from a pointer to a class to a pointer to a private base of that class. Never do that, and hope for a warning from the compiler if you do it by mistake. This C-style cast is far more dangerous than the named conversion operators because the notation is harder to spot in a large program and the kind of conversion intended by the programmer is not explicit. That is, (T)e might be doing a portable conversion between related types, a nonportable conversion between unrelated types, or removing the const modifier from a pointer type. Without knowing the exact types of T and e, you cannot tell.
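The following sketch contrasts the two notations for one such case, removing const from a pointer. The function name c_style_demo is hypothetical, and the writes are well defined only because the underlying object is not itself const.

```cpp
int c_style_demo()
{
    int n = 42;                          // the object itself is not const
    const int* cp = &n;

    int* p1 = (int*)cp;                  // C-style cast: silently removes const
    int* p2 = const_cast<int*>(cp);      // named cast: the intent is visible
    // int* p3 = static_cast<int*>(cp);  // error: static_cast cannot remove const

    *p1 += 1;                            // OK only because n is not const
    *p2 += 1;
    return n;
}
```

Both casts do the same thing here, but only the const_cast says so, and a search for const_cast will find it.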

11.5.4. Function-Style Cast

The construction of a value of type T from a value e can be expressed by the functional notation T(e). For example:

void f(double d)
{
int i = int(d); // truncate d
complex<double> z = complex<double>(d); // make a complex from d
// ...
}

The T(e) construct is sometimes referred to as a function-style cast. Unfortunately, for a built-in type T, T(e) is equivalent to (T)e (§11.5.3). This implies that for many built-in types T(e) is not safe.

void f(double d, char* p)
{
int a = int(d); // truncates
int b = int(p); // not portable
// ...
}

Even explicit conversion of a longer integer type to a shorter (such as long to char) can result in nonportable implementation-defined behavior.

Prefer T{v} conversions for well-behaved construction and the named casts (e.g., static_cast) for other conversions.

11.6. Advice

[1] Prefer prefix ++ over suffix ++; §11.1.4.

[2] Use resource handles to avoid leaks, premature deletion, and double deletion; §11.2.1.

[3] Don’t put objects on the free store if you don’t have to; prefer scoped variables; §11.2.1.

[4] Avoid “naked new” and “naked delete”; §11.2.1.

[5] Use RAII; §11.2.1.

[6] Prefer a named function object to a lambda if the operation requires comments; §11.4.2.

[7] Prefer a named function object to a lambda if the operation is generally useful; §11.4.2.

[8] Keep lambdas short; §11.4.2.

[9] For maintainability and correctness, be careful about capture by reference; §11.4.3.1.

[10] Let the compiler deduce the return type of a lambda; §11.4.4.

[11] Use the T{e} notation for construction; §11.5.1.

[12] Avoid explicit type conversion (casts); §11.5.

[13] When explicit type conversion is necessary, prefer a named cast; §11.5.

[14] Consider using a run-time checked cast, such as narrow_cast<>(), for conversion between numeric types; §11.5.