The C++ Programming Language (2013)

Part II: Basic Facilities

12. Functions

Death to all fanatics!

– Paradox

Function Declarations

Why Functions?; Parts of a Function Declaration; Function Definitions; Returning Values; inline Functions; constexpr Functions; [[noreturn]] Functions; Local Variables

Argument Passing

Reference Arguments; Array Arguments; List Arguments; Unspecified Number of Arguments; Default Arguments

Overloaded Functions

Automatic Overload Resolution; Overloading and Return Type; Overloading and Scope; Resolution for Multiple Arguments; Manual Overload Resolution

Pre- and Postconditions

Pointer to Function

Macros

Conditional Compilation; Predefined Macros; Pragmas

Advice

12.1. Function Declarations

The main way of getting something done in a C++ program is to call a function to do it. Defining a function is the way you specify how an operation is to be done. A function cannot be called unless it has been previously declared.

A function declaration gives the name of the function, the type of the value returned (if any), and the number and types of the arguments that must be supplied in a call. For example:

Elem* next_elem(); // no argument; return an Elem*
void exit(int); // int argument; return nothing
double sqrt(double); // double argument; return a double

The semantics of argument passing are identical to the semantics of copy initialization (§16.2.6). Argument types are checked and implicit argument type conversion takes place when necessary. For example:

double s2 = sqrt(2); // call sqrt() with the argument double{2}
double s3 = sqrt("three"); // error: sqrt() requires an argument of type double

The value of such checking and type conversion should not be underestimated.

A function declaration may contain argument names. This can be a help to the reader of a program, but unless the declaration is also a function definition, the compiler simply ignores such names. As a return type, void means that the function does not return a value (§6.2.7).

The type of a function consists of the return type and the argument types. For class member functions (§2.3.2, §16.2), the name of the class is also part of the function type. For example:

double f(int i, const Info&); // type: double(int,const Info&)
char& String::operator[](int); // type: char& String::(int)

12.1.1. Why Functions?

There is a long and disreputable tradition of writing very long functions – hundreds of lines long. I once encountered a single (handwritten) function with more than 32,768 lines of code. Writers of such functions seem to fail to appreciate one of the primary purposes of functions: to break up complicated computations into meaningful chunks and name them. We want our code to be comprehensible, because that is the first step on the way to maintainability. The first step to comprehensibility is to break computational tasks into comprehensible chunks (represented as functions and classes) and name those. Such functions then provide the basic vocabulary of computation, just as the types (built-in and user-defined) provide the basic vocabulary of data. The C++ standard algorithms (e.g., find, sort, and iota) provide a good start (Chapter 32). Next, we can compose functions representing common or specialized tasks into larger computations.

The number of errors in code correlates strongly with the amount of code and the complexity of the code. Both problems can be addressed by using more and shorter functions. Using a function to do a specific task often saves us from writing a specific piece of code in the middle of other code; making it a function forces us to name the activity and document its dependencies. Also, function call and return saves us from using error-prone control structures, such as gotos (§9.6) and continues (§9.5.5). Unless they are very regular in structure, nested loops are an avoidable source of errors (e.g., use a dot product to express a matrix algorithm rather than nesting loops; §40.6).
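As a rough illustration of the dot-product suggestion, the sketch below writes a matrix-vector product with a single visible loop by giving the inner loop a name. The names Row, dot, and multiply are hypothetical, and std::inner_product merely stands in for whatever dot-product facility is at hand; this is not the matrix code of §40.6.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

using Row = std::vector<double>;

// The inner loop, named and documented: requires a.size() == b.size().
inline double dot(const Row& a, const Row& b)
{
    assert(a.size() == b.size());
    return std::inner_product(a.begin(), a.end(), b.begin(), 0.0);
}

// The outer computation now reads as "one dot product per row".
inline Row multiply(const std::vector<Row>& m, const Row& v)
{
    Row result;
    for (const Row& row : m)
        result.push_back(dot(row, v));
    return result;
}
```

Naming the inner operation leaves one loop to read and one precondition to check, rather than two nested index ranges.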

The most basic advice is to keep a function of a size so that you can look at it in total on a screen. Bugs tend to creep in when we can view only part of an algorithm at a time. For many programmers that puts a limit of about 40 lines on a function. My ideal is a much smaller size still, maybe an average of 7 lines.

In essentially all cases, the cost of a function call is not a significant factor. Where that cost could be significant (e.g., for frequently used access functions, such as vector subscripting) inlining can eliminate it (§12.1.5). Use functions as a structuring mechanism.

12.1.2. Parts of a Function Declaration

In addition to specifying a name, a set of arguments, and a return type, a function declaration can contain a variety of specifiers and modifiers. In all we can have:

• The name of the function; required

• The argument list, which may be empty (); required

• The return type, which may be void and which may be prefix or suffix (using auto); required

• inline, indicating a desire to have function calls implemented by inlining the function body (§12.1.5)

• constexpr, indicating that it should be possible to evaluate the function at compile time if given constant expressions as arguments (§12.1.6)

• noexcept, indicating that the function may not throw an exception (§13.5.1.1)

• A linkage specification, for example, static (§15.2)

• [[noreturn]], indicating that the function will not return using the normal call/return mechanism (§12.1.4)

In addition, a member function may be specified as:

• virtual, indicating that it can be overridden in a derived class (§20.3.2)

• override, indicating that it must be overriding a virtual function from a base class (§20.3.4.1)

• final, indicating that it cannot be overridden in a derived class (§20.3.4.2)

• static, indicating that it is not associated with a particular object (§16.2.12)

• const, indicating that it may not modify its object (§3.2.1.1, §16.2.9.1)

If you feel inclined to give readers a headache, you may write something like:

struct S {
    [[noreturn]] virtual inline auto f(const unsigned long int *const) const noexcept -> void;
};

12.1.3. Function Definitions

Every function that is called must be defined somewhere (once only; §15.2.3). A function definition is a function declaration in which the body of the function is presented. For example:

void swap(int*, int*); // a declaration

void swap(int* p, int* q) // a definition
{
    int t = *p;
    *p = *q;
    *q = t;
}

The definition and all declarations for a function must specify the same type. Unfortunately, to preserve C compatibility, a const is ignored at the highest level of an argument type. For example, these are two declarations of the same function:

void f(int); // type is void(int)
void f(const int); // type is void(int)

That function, f(), could be defined as:

void f(int x) { /*we can modify x here */ }

Alternatively, we could define f() as:

void f(const int x) { /*we cannot modify x here */ }

In either case, the argument that f() can or cannot modify is a copy of what a caller provided, so there is no danger of an obscure modification of the calling context.
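A minimal sketch of this rule, using the standard std::is_same trait to let the compiler confirm that the two spellings denote one function type. The function name twice is hypothetical.

```cpp
#include <type_traits>

// const at the highest level of a by-value argument is not part of the function type.
static_assert(std::is_same<void(int), void(const int)>::value,
              "void(int) and void(const int) are the same type");

// const below the highest level is part of the type.
static_assert(!std::is_same<void(int*), void(const int*)>::value,
              "pointer-to-const is a different argument type");

int twice(int);                            // declaration without const
int twice(const int x) { return x + x; }   // definition of the same function; x is read-only in the body
```

The const in the definition is therefore a promise made to the implementer of the function, not to its callers.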

Function argument names are not part of the function type and need not be identical in different declarations. For example:

int& max(int& a, int& b, int& c); // return a reference to the larger of a, b, and c

int& max(int& x1, int& x2, int& x3)
{
    return (x1>x2)? ((x1>x3)?x1:x3) : ((x2>x3)?x2:x3);
}

Naming arguments in declarations that are not definitions is optional and commonly used to simplify documentation. Conversely, we can indicate that an argument is unused in a function definition by not naming it. For example:

void search(table* t, const char* key, const char*)
{
    // no use of the third argument
}

Typically, unnamed arguments arise from the simplification of code or from planning ahead for extensions. In both cases, leaving the argument in place, although unused, ensures that callers are not affected by the change.

In addition to functions, there are a few other things that we can call; these follow most rules defined for functions, such as the rules for argument passing (§12.2):

• Constructors (§2.3.2, §16.2.5) are technically not functions; in particular, they don’t return a value, can initialize bases and members (§17.4), and can’t have their address taken.

• Destructors (§3.2.1.2, §17.2) can’t be overloaded and can’t have their address taken.

• Function objects (§3.4.3, §19.2.2) are not functions (they are objects) and can’t be overloaded, but their operator()s are functions.

• Lambda expressions (§3.4.3, §11.4) are basically a shorthand for defining function objects.

12.1.4. Returning Values

Every function declaration contains a specification of the function’s return type (except for constructors and type conversion functions). Traditionally, in C and C++, the return type comes first in a function declaration (before the name of the function). However, a function declaration can also be written using a syntax that places the return type after the argument list. For example, the following two declarations are equivalent:

string to_string(int a); // prefix return type
auto to_string(int a) -> string; // suffix return type

That is, a prefix auto indicates that the return type is placed after the argument list. The suffix return type is preceded by ->.

The essential use for a suffix return type comes in function template declarations in which the return type depends on the arguments. For example:

template<class T, class U>
auto product(const vector<T>& x, const vector<U>& y) -> decltype(x*y);

However, the suffix return syntax can be used for any function. There is an obvious similarity between the suffix return syntax for a function and the lambda expression syntax (§3.4.3, §11.4); it is a pity those two constructs are not identical.
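A small sketch of a return type that depends on the arguments, reduced to scalars so that it can be checked at compile time. The name mul is hypothetical; the point is only that x and y are in scope where the suffix return type is written.

```cpp
#include <type_traits>

// The return type is whatever x*y yields for the deduced T and U.
template<class T, class U>
constexpr auto mul(T x, U y) -> decltype(x*y)
{
    return x*y;
}

static_assert(std::is_same<decltype(mul(2, 3)), int>::value, "int*int is int");
static_assert(std::is_same<decltype(mul(2, 1.5)), double>::value, "int*double is double");
static_assert(mul(2, 1.5) == 3.0, "usable in a constant expression");
```

With a prefix return type, the names x and y would not yet be declared at the point where the type has to be stated.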

A function that does not return a value has a “return type” of void.

A value must be returned from a function that is not declared void (however, main() is special; see §2.2.1). Conversely, a value cannot be returned from a void function. For example:

int f1() { } // error: no value returned
void f2() { } // OK

int f3() { return 1; } // OK
void f4() { return 1; } // error: return value in void function

int f5() { return; } // error: return value missing
void f6() { return; } // OK

A return value is specified by a return-statement. For example:

int fac(int n)
{
    return (n>1) ? n*fac(n-1) : 1;
}

A function that calls itself is said to be recursive.

There can be more than one return-statement in a function:

int fac2(int n)
{
    if (n > 1)
        return n*fac2(n-1);
    return 1;
}

Like the semantics of argument passing, the semantics of function value return are identical to the semantics of copy initialization (§16.2.6). A return-statement initializes a variable of the returned type. The type of a return expression is checked against the type of the returned type, and all standard and user-defined type conversions are performed. For example:

double f() { return 1; } // 1 is implicitly converted to double{1}

Each time a function is called, a new copy of its arguments and local (automatic) variables is created. The store is reused after the function returns, so a pointer to a local non-static variable should never be returned. The contents of the location pointed to will change unpredictably:

int* fp()
{
    int local = 1;
    // ...
    return &local; // bad
}

An equivalent error can occur when using references:

int& fr()
{
    int local = 1;
    // ...
    return local; // bad
}

Fortunately, a compiler can easily warn about returning references to local variables (and most do).

There are no void values. However, a call of a void function may be used as the return value of a void function. For example:

void g(int* p);

void h(int* p)
{
    // ...
    return g(p); // OK: equivalent to "g(p); return;"
}

This form of return is useful to avoid special cases when writing template functions where the return type is a template parameter.
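A sketch of such a template, under the assumption that we want one wrapper that works both for functions that return a value and for functions that return void. The names invoke_with, twice, and reset are hypothetical.

```cpp
#include <utility>

// Call f(arg) and pass back whatever it returns, including nothing at all.
template<class F, class A>
auto invoke_with(F f, A&& arg) -> decltype(f(std::forward<A>(arg)))
{
    return f(std::forward<A>(arg));   // also valid when f returns void
}

inline int  twice(int x)  { return 2*x; }
inline void reset(int& x) { x = 0; }
```

Without the rule for void, invoke_with would need a separate specialization for functions that return nothing.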

A return-statement is one of five ways of exiting a function:

• Executing a return-statement.

• “Falling off the end” of a function; that is, simply reaching the end of the function body. This is allowed only in functions that are not declared to return a value (i.e., void functions) and in main(), where falling off the end indicates successful completion (§12.1.4).

• Throwing an exception that isn’t caught locally (§13.5).

• Terminating because an exception was thrown and not caught locally in a noexcept function (§13.5.1.1).

• Directly or indirectly invoking a system function that doesn’t return (e.g., exit(); §15.4).

A function that does not return normally (i.e., through a return or “falling off the end”) can be marked [[noreturn]] (§12.1.7).

12.1.5. inline Functions

A function can be defined to be inline. For example:

inline int fac(int n)
{
    return (n<2) ? 1 : n*fac(n-1);
}

The inline specifier is a hint to the compiler that it should attempt to generate code for a call of fac() inline rather than laying down the code for the function once and then calling through the usual function call mechanism. A clever compiler can generate the constant 720 for a call fac(6). The possibility of mutually recursive inline functions, inline functions that recurse or not depending on input, etc., makes it impossible to guarantee that every call of an inline function is actually inlined. The degree of cleverness of a compiler cannot be legislated, so one compiler might generate 720, another 6*fac(5), and yet another an un-inlined call fac(6). If you want a guarantee that a value is computed at compile time, declare it constexpr and make sure that all functions used in its evaluation are constexpr (§12.1.6).

To make inlining possible in the absence of unusually clever compilation and linking facilities, the definition – and not just the declaration – of an inline function must be in scope (§15.2). An inline specifier does not affect the semantics of a function. In particular, an inline function still has a unique address, and so do static variables (§12.1.8) of an inline function.

If an inline function is defined in more than one translation unit (e.g., typically because it was defined in a header; §15.2.2), its definition in the different translation units must be identical (§15.2.3).

12.1.6. constexpr Functions

In general, a function cannot be evaluated at compile time and therefore cannot be called in a constant expression (§2.2.3, §10.4). By specifying a function constexpr, we indicate that we want it to be usable in constant expressions if given constant expressions as arguments. For example:

constexpr int fac(int n)
{
    return (n>1) ? n*fac(n-1) : 1;
}


constexpr int f9 = fac(9); // must be evaluated at compile time

When constexpr is used in a function definition, it means “should be usable in a constant expression when given constant expressions as arguments.” When used in an object definition, it means “evaluate the initializer at compile time.” For example:

void f(int n)
{
    int f5 = fac(5); // may be evaluated at compile time
    int fn = fac(n); // evaluated at run time (n is a variable)

    constexpr int f6 = fac(6); // must be evaluated at compile time
    constexpr int fnn = fac(n); // error: can't guarantee compile-time evaluation (n is a variable)

    char a[fac(4)]; // OK: array bounds must be constants and fac() is constexpr
    char a2[fac(n)]; // error: array bounds must be constants and n is a variable

    // ...
}

To be evaluated at compile time, a function must be suitably simple: a constexpr function must consist of a single return-statement; no loops and no local variables are allowed. Also, a constexpr function may not have side effects. That is, a constexpr function is a pure function. For example:

int glob;

constexpr void bad1(int a) // error: constexpr function cannot be void
{
    glob = a; // error: side effect in constexpr function
}

constexpr int bad2(int a)
{
    if (a>=0) return a; else return -a; // error: if-statement in constexpr function
}

constexpr int bad3(int a)
{
    int sum = 0; // error: local variable in constexpr function
    for (int i=0; i<a; ++i) sum += fac(i); // error: loop in constexpr function
    return sum;
}

The rules for a constexpr constructor are suitably different (§10.4.3); there, only simple initialization of members is allowed.

A constexpr function allows recursion and conditional expressions. This implies that you can express just about anything as a constexpr function if you really want to. However, you’ll find the debugging gets unnecessarily difficult and compile times longer than you would like unless you restrict the use of constexpr functions to the relatively simple tasks for which they are intended.
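As a sketch of how far recursion and conditional expressions go within the single-return-statement rule, here are two small functions checked at compile time. The names gcd, ipow, and buffer are hypothetical.

```cpp
// Each function is a single return-statement built from ?: and recursion.
constexpr int gcd(int a, int b)
{
    return (b == 0) ? a : gcd(b, a % b);
}

constexpr int ipow(int base, unsigned exp)
{
    return (exp == 0) ? 1 : base * ipow(base, exp - 1);
}

static_assert(gcd(48, 18) == 6, "gcd evaluated at compile time");
static_assert(ipow(2, 10) == 1024, "ipow evaluated at compile time");

char buffer[ipow(2, 4)];   // array bound computed by a constexpr function
```

The loop one would normally write becomes a recursion, and the if-statement becomes a conditional expression.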

By using literal types (§10.4.3), constexpr functions can be defined to use user-defined types.

Like inline functions, constexpr functions obey the ODR (“one-definition rule”), so that definitions in the different translation units must be identical (§15.2.3). You can think of constexpr functions as a restricted form of inline functions (§12.1.5).

12.1.6.1. constexpr and References

A constexpr function cannot have side effects, so writing to nonlocal objects is not possible. However, a constexpr function can refer to nonlocal objects as long as it does not write to them.

constexpr int ftbl[] { 1, 2, 3, 5, 8, 13 };

constexpr int fib(int n)
{
    return (n<sizeof(ftbl)/sizeof(*ftbl)) ? ftbl[n] : fib(n-2)+fib(n-1);
}

A constexpr function can take reference arguments. Of course, it cannot write through such references, but const reference parameters are as useful as ever. For example, in the standard library (§40.4) we find:

template<> class complex<float> {
public:
    // ...
    explicit constexpr complex(const complex<double>&);
    // ...
};

This allows us to write:

constexpr complex<float> z {2.0};

The temporary variable that is logically constructed to hold the const reference argument simply becomes a value internal to the compiler.

It is possible for a constexpr function to return a reference or a pointer. For example:

constexpr const int* addr(const int& r) { return &r; } // OK

However, doing so brings us away from the fundamental role of constexpr functions as parts of constant expression evaluation. In particular, it can be quite tricky to determine whether the result of such a function is a constant expression. Consider:

static const int x = 5;
constexpr const int* p1 = addr(x); // OK
constexpr int xx = *p1; // OK

static int y;
constexpr const int* p2 = addr(y); // OK
constexpr int yy = *p2; // error: attempt to read a variable

constexpr const int* tp = addr(5); // error: address of temporary

12.1.6.2. Conditional Evaluation

A branch of a conditional expression that is not taken in a constexpr function is not evaluated. This implies that a branch not taken can require run-time evaluation. For example:

constexpr int low = 0;
constexpr int high = 99;

constexpr int check(int i)
{
    return (low<=i && i<high) ? i : throw out_of_range("check(): value out of range");
}

// ...

constexpr int val = check(f(x,y,z));

You might imagine low and high to be configuration parameters that are known at compile time, but not at design time, and that f(x,y,z) computes some implementation-dependent value.
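A self-contained sketch of the same idea with a concrete argument in place of f(x,y,z). The helper rejects() is hypothetical and exists only to show the run-time behavior.

```cpp
#include <stdexcept>

constexpr int low = 0;
constexpr int high = 99;

// For an in-range value the throw is in the branch not taken,
// so it does not prevent compile-time evaluation of that call.
constexpr int check(int i)
{
    return (low <= i && i < high) ? i : throw std::out_of_range("check(): value out of range");
}

constexpr int ok = check(42);        // in range: evaluated at compile time
// constexpr int bad = check(200);   // error: the throw would have to be evaluated at compile time

// At run time an out-of-range value throws as usual.
inline bool rejects(int i)
{
    try { check(i); }
    catch (const std::out_of_range&) { return true; }
    return false;
}
```

The same function thus acts as a compile-time check for constants and a run-time check for variables.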

12.1.7. [[noreturn]] Functions

A construct [[...]] is called an attribute and can be placed just about anywhere in the C++ syntax. In general, an attribute specifies some implementation-dependent property about the syntactic entity that precedes it. In addition, an attribute can be placed in front of a declaration. There are only two standard attributes (§iso.7.6), and [[noreturn]] is one of them. The other is [[carries_dependency]] (§41.3).

Placing [[noreturn]] at the start of a function declaration indicates that the function is not expected to return. For example:

[[noreturn]] void exit(int); // exit will never return

Knowing that a function does not return is useful for both comprehension and code generation. What happens if the function returns despite a [[noreturn]] attribute is undefined.
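A sketch of a typical use: an error helper that always leaves by throwing. The names fail and checked_div are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Always exits by throwing, never by returning.
[[noreturn]] void fail(const std::string& what)
{
    throw std::runtime_error(what);
}

// No return-statement is needed after the call of fail():
// the attribute tells the compiler that control cannot continue past it.
int checked_div(int a, int b)
{
    if (b != 0)
        return a / b;
    fail("checked_div(): division by zero");
}
```

A reader of checked_div() can also see at a glance that the second path never yields a value.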

12.1.8. Local Variables

A name defined in a function is commonly referred to as a local name. A local variable or constant is initialized when a thread of execution reaches its definition. Unless declared static, each invocation of the function has its own copy of the variable. If a local variable is declared static, a single, statically allocated object (§6.4.2) will be used to represent that variable in all calls of the function. It will be initialized only the first time a thread of execution reaches its definition. For example:

void f(int a)
{
    while (a--) {
        static int n = 0; // initialized once
        int x = 0; // initialized 'a' times in each call of f()

        cout << "n == " << n++ << ", x == " << x++ << '\n';
    }
}

int main()
{
    f(3);
}

This prints:

n == 0, x == 0
n == 1, x == 0
n == 2, x == 0

A static local variable allows the function to preserve information between calls without introducing a global variable that might be accessed and corrupted by other functions (see also §16.2.12).

Initialization of a static local variable does not lead to a data race (§5.3.1) unless you enter the function containing it recursively or a deadlock occurs (§iso.6.7). That is, the C++ implementation must guard the initialization of a local static variable with some kind of lock-free construct (e.g., a call_once; §42.3.3). The effect of initializing a local static recursively is undefined. For example:

int fn(int n)
{
    static int n1 = n; // OK
    static int n2 = fn(n-1)+1; // undefined
    return n;
}

A static local variable is useful for avoiding order dependencies among nonlocal variables (§15.4.1).
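A sketch of that technique, assuming a registry that other nonlocal objects want to use during their own initialization. The names registry and registered are hypothetical.

```cpp
#include <string>
#include <vector>

// The vector is constructed the first time registry() is called, so it exists
// before any use, whatever the order in which nonlocal objects are initialized.
std::vector<std::string>& registry()
{
    static std::vector<std::string> names;
    return names;
}

// A nonlocal object whose initializer depends on the registry.
// This would be safe even if it were defined in another translation unit.
const bool registered = (registry().push_back("default"), true);
```

Had names been an ordinary nonlocal variable in another translation unit, the initializer of registered might have run before names was constructed.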

There are no local functions; if you feel you need one, use a function object or a lambda expression (§3.4.3, §11.4).

The scope of a label (§9.6), should you be foolhardy enough to use one, is the complete function, independent of which nested scope it may be in.

12.2. Argument Passing

When a function is called (using the suffix (), known as the call operator or application operator), store is set aside for its formal arguments (also known as its parameters), and each formal argument is initialized by its corresponding actual argument. The semantics of argument passing are identical to the semantics of initialization (copy initialization, to be precise; §16.2.6). In particular, the type of an actual argument is checked against the type of the corresponding formal argument, and all standard and user-defined type conversions are performed. Unless a formal argument (parameter) is a reference, a copy of the actual argument is passed to the function. For example:

int* find(int* first, int* last, int v) // find v in [first:last)
{
    while (first!=last && *first!=v)
        ++first;
    return first;
}

void g(int* p, int* q)
{
    int* pp = find(p,q,'x');
    // ...
}

Here, the caller’s copy of the argument, p, is not modified by the operations on find()’s copy, called first. The pointer is passed by value.

There are special rules for passing arrays (§12.2.2), a facility for passing unchecked arguments (§12.2.4), and a facility for specifying default arguments (§12.2.5). The use of initializer lists is described in §12.2.3 and the ways of passing arguments to template functions in §23.5.2 and §28.6.2.

12.2.1. Reference Arguments

Consider:

void f(int val, int& ref)
{
    ++val;
    ++ref;
}

When f() is called, ++val increments a local copy of the first actual argument, whereas ++ref increments the second actual argument. Consider:

void g()
{
    int i = 1;
    int j = 1;
    f(i,j);
}

The call f(i,j) will increment j but not i. The first argument, i, is passed by value; the second argument, j, is passed by reference. As mentioned in §7.7, functions that modify call-by-reference arguments can make programs hard to read and should most often be avoided (but see §18.2.5). It can, however, be noticeably more efficient to pass a large object by reference than to pass it by value. In that case, the argument might be declared a const reference to indicate that the reference is used for efficiency reasons only and not to enable the called function to change the value of the object:

void f(const Large& arg)
{
    // the value of "arg" cannot be changed
    // (except by using explicit type conversion; §11.5)
}

The absence of const in the declaration of a reference argument is taken as a statement of intent to modify the variable:

void g(Large& arg); // assume that g() modifies arg

Similarly, declaring a pointer argument const tells readers that the value of an object pointed to by that argument is not changed by the function. For example:

int strlen(const char*); // number of characters in a C-style string
char* strcpy(char* to, const char* from); // copy a C-style string
int strcmp(const char*, const char*); // compare C-style strings

The importance of using const arguments increases with the size of a program.

Note that the semantics of argument passing are different from the semantics of assignment. This is important for const arguments, reference arguments, and arguments of some user-defined types.

Following the rules for reference initialization, a literal, a constant, and an argument that requires conversion can be passed as a const T& argument, but not as a plain (non-const) T& argument. Allowing conversions for a const T& argument ensures that such an argument can be given exactly the same set of values as a T argument by passing the value in a temporary, if necessary. For example:

float fsqrt(const float&); // Fortran-style sqrt taking a reference argument

void g(double d)
{
    float r = fsqrt(2.0f); // pass reference to temp holding 2.0f
    r = fsqrt(r); // pass reference to r
    r = fsqrt(d); // pass reference to temp holding static_cast<float>(d)
}

Disallowing conversions for non-const reference arguments (§7.7) avoids the possibility of silly mistakes arising from the introduction of temporaries. For example:

void update(float& i);

void g(double d, float r)
{
    update(2.0f); // error: const argument
    update(r); // pass reference to r
    update(d); // error: type conversion required
}

Had these calls been allowed, update() would quietly have updated temporaries that immediately were deleted. Usually, that would come as an unpleasant surprise to the programmer.

If we wanted to be precise, pass-by-reference would be pass-by-lvalue-reference because a function can also take rvalue references. As described in §7.7, an rvalue can be bound to an rvalue reference (but not to an lvalue reference) and an lvalue can be bound to an lvalue reference (but not to an rvalue reference). For example:

void f(vector<int>&); // (non-const) lvalue reference argument
void f(const vector<int>&); // const lvalue reference argument
void f(vector<int>&&); // rvalue reference argument

void g(vector<int>& vi, const vector<int>& cvi)
{
    f(vi); // call f(vector<int>&)
    f(cvi); // call f(const vector<int>&)
    f(vector<int>{1,2,3,4}); // call f(vector<int>&&);
}

We must assume that a function will modify an rvalue argument, leaving it good only for destruction or reassignment (§17.5). The most obvious use of rvalue references is to define move constructors and move assignments (§3.3.2, §17.5.2). I’m sure someone will find a clever use for const-rvalue-reference arguments, but so far, I have not seen a genuine use case.

Please note that for a template argument T, the template argument type deduction rules give T&& a significantly different meaning from X&& for a type X (§23.5.2.1). For template arguments, an rvalue reference is most often used to implement “perfect forwarding” (§23.5.2.1, §28.6.3).
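A minimal sketch of the difference, assuming two overloads that report what they were given. The names kind and relay are hypothetical.

```cpp
#include <string>
#include <utility>

// Two overloads that report which kind of argument they received.
inline const char* kind(const std::string&) { return "lvalue"; }
inline const char* kind(std::string&&)      { return "rvalue"; }

// With a deduced T, T&& accepts both lvalues and rvalues;
// std::forward passes the argument on as an lvalue or an rvalue, as it was received.
template<class T>
const char* relay(T&& x)
{
    return kind(std::forward<T>(x));
}
```

By contrast, a parameter declared std::string&& accepts rvalues only, as the second kind() overload shows.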

How do we choose among the ways of passing arguments? My rules of thumb are:

[1] Use pass-by-value for small objects.

[2] Use pass-by-const-reference to pass large values that you don’t need to modify.

[3] Return a result as a return value rather than modifying an object through an argument.

[4] Use rvalue references to implement move (§3.3.2, §17.5.2) and forwarding (§23.5.2.1).

[5] Pass a pointer if “no object” is a valid alternative (and represent “no object” by nullptr).

[6] Use pass-by-reference only if you have to.

The “when you have to” in the last rule of thumb refers to the observation that passing pointers is often a less obscure mechanism for dealing with objects that need modification (§7.7.1, §7.7.4) than using references.
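The rules of thumb can be sketched on one small interface. Record and the four functions are hypothetical; which objects count as “small” or “large” is an assumption that depends on the machine and the type.

```cpp
#include <string>
#include <utility>
#include <vector>

struct Record {                     // stands in for a "large" type
    std::string name;
    std::vector<double> samples;
};

// [1] small object: pass by value
inline double scale(double x, double factor) { return x * factor; }

// [2] large object that is only read: pass by const reference
// [3] the result is returned rather than written through an argument
inline double total(const Record& r)
{
    double sum = 0;
    for (double s : r.samples) sum += s;
    return sum;
}

// [4] rvalue reference: take over the representation of an argument the caller has given up
inline Record renamed(Record&& r, std::string new_name)
{
    r.name = std::move(new_name);
    return std::move(r);
}

// [5] "no object" is a valid alternative: pass a pointer and test it
inline std::string name_or(const Record* r, const std::string& fallback)
{
    return r ? r->name : fallback;
}
```

None of these functions needs a plain (non-const) reference argument, which is what rule [6] suggests should be the common case.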

12.2.2. Array Arguments

If an array is used as a function argument, a pointer to its initial element is passed. For example:

int strlen(const char*);

void f()
{
    char v[] = "Annemarie";
    int i = strlen(v);
    int j = strlen("Nicholas");
}

That is, an argument of type T[] will be converted to a T* when passed as an argument. This implies that an assignment to an element of an array argument changes the value of an element of the argument array. In other words, arrays differ from other types in that an array is not passed by value. Instead, a pointer is passed (by value).
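A sketch of the consequence, contrasting an array argument with a struct that merely contains an array. The names clear_first and Wrapped are hypothetical.

```cpp
// An array argument is passed as a pointer, so the function works on the caller's elements.
inline void clear_first(int a[])
{
    a[0] = 0;          // changes the caller's array
}

struct Wrapped { int a[3]; };

// A struct is passed by value, even when it contains an array.
inline void clear_first(Wrapped w)
{
    w.a[0] = 0;        // changes only the local copy
}
```

After calling both on data that starts with 1, the plain array starts with 0 and the wrapped one still starts with 1.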

A parameter of array type is equivalent to a parameter of pointer type. For example:

void odd(int* p);
void odd(int a[]);
void odd(int buf[1020]);

These three declarations are equivalent and declare the same function. As usual, the argument names do not affect the type of the function (§12.1.3). The rules and techniques for passing multidimensional arrays can be found in §7.4.3.

The size of an array is not available to the called function. This is a major source of errors, but there are several ways of circumventing this problem. C-style strings are zero-terminated, so their size can be computed (e.g., by a potentially expensive call of strlen(); §43.4). For other arrays, a second argument specifying the size can be passed. For example:

void compute1(int* vec_ptr, int vec_size); // one way

At best, this is a workaround. It is usually preferable to pass a reference to some container, such as vector (§4.4.1, §31.4), array (§34.2.1), or map (§4.4.3, §31.4.3).
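A sketch of the alternatives side by side. The overloaded name sum is hypothetical.

```cpp
#include <array>
#include <cstddef>
#include <vector>

// Pointer plus size: the caller must supply a size that matches the array.
inline int sum(const int* p, std::size_t n)
{
    int s = 0;
    for (std::size_t i = 0; i != n; ++i) s += p[i];
    return s;
}

// A container carries its own size, so data and size cannot get out of step.
inline int sum(const std::vector<int>& v)
{
    int s = 0;
    for (int x : v) s += x;
    return s;
}

// For std::array the size is part of the type and is deduced.
template<std::size_t N>
int sum(const std::array<int, N>& a)
{
    int s = 0;
    for (int x : a) s += x;
    return s;
}
```

Only the first version can be called with a size that is wrong for the data it is given.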

If you really want to pass an array, rather than a container or a pointer to the first element of an array, you can declare a parameter of type reference to array. For example:

void f(int(&r)[4]);

void g()
{
    int a1[] = {1,2,3,4};
    int a2[] = {1,2};

    f(a1); // OK
    f(a2); // error: wrong number of elements
}

Note that the number of elements is part of a reference-to-array type. That makes such references far less flexible than pointers and containers (such as vector). The main use of references to arrays is in templates, where the number of elements is then deduced. For example:

template<class T, int N> void f(T(&r)[N])
{
    // ...
}

int a1[10];
double a2[100];

void g()
{
    f(a1); // T is int; N is 10
    f(a2); // T is double; N is 100
}

This typically gives rise to as many function definitions as there are calls to f() with distinct array types.
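A small sketch that puts the deduced N to work: a function that reports the number of elements of any built-in array. The name length is hypothetical, and std::size_t is used for N here rather than int.

```cpp
#include <cstddef>

// N is deduced from the argument, so the size travels with the array.
template<class T, std::size_t N>
constexpr std::size_t length(T (&)[N])
{
    return N;
}

int a1[10];
double a2[100];

static_assert(length(a1) == 10, "N deduced as 10");
static_assert(length(a2) == 100, "N deduced as 100");

// A pointer is rejected, because the size has already been lost:
// int* p = a1;
// length(p);   // error: no matching function
```

Each distinct array type gives its own instantiation of length(), which is the cost mentioned above.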

Multidimensional arrays are tricky (see §7.3), but often arrays of pointers can be used instead, and they need no special treatment. For example:

const char* day[] = {
    "mon", "tue", "wed", "thu", "fri", "sat", "sun"
};

As ever, vector and similar types are alternatives to the built-in, low-level arrays and pointers.

12.2.3. List Arguments

A {}-delimited list can be used as an argument to a parameter of:

[1] Type std::initializer_list<T>, where the values of the list can be implicitly converted to T

[2] A type that can be initialized with the values provided in the list

[3] A reference to an array of T, where the values of the list can be implicitly converted to T

Technically, case [2] covers all examples, but I find it easier to think of the three cases separately. Consider:

template<class T>
void f1(initializer_list<T>);

struct S {
    int a;
    string s;
};
void f2(S);

template<class T, int N>
void f3(T (&r)[N]);

void f4(int);

void g()
{
    f1({1,2,3,4}); // T is int and the initializer_list has size() 4
    f2({1,"MKS"}); // f2(S{1,"MKS"})
    f3({1,2,3,4}); // T is int and N is 4
    f4({1}); // f4(int{1});
}

If there is a possible ambiguity, an initializer_list parameter takes priority. For example:

template<class T>
void f(initializer_list<T>);

struct S {
int a;
string s;

};
void f(S);

template<class T, int N>
void f(T (&r)[N]);

void f(int);

void g()
{
f({1,2,3,4}); // T is int and the initializer_list has size() 4
f({1,"MKS"}); // calls f(S)
f({1}); // T is int and the initializer_list has size() 1
}

The reason that a function with an initializer_list argument takes priority is that it could be very confusing if different functions were chosen based on the number of elements of a list. It is not possible to eliminate every form of confusion in overload resolution (for example, see §4.4, §17.3.4.1), but giving initializer_list parameters priority for {}-list arguments seems to minimize confusion.

If there is a function with an initializer-list argument in scope, but the argument list isn’t a match for that, another function can be chosen. The call f({1,"MKS"}) was an example of that.

Note that these rules apply to std::initializer_list<T> arguments only. There are no special rules for std::initializer_list<T>& or for other types that just happen to be called initializer_list (in some other scope).
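The priority rule can be observed with a small non-template sketch. The function name which() and its string results are assumptions made for illustration; each overload simply reports that it was selected:

```cpp
#include <cassert>
#include <initializer_list>
#include <string>

struct S {
    int a;
    std::string s;
};

// Each overload returns a label naming itself.
std::string which(std::initializer_list<int>) { return "initializer_list<int>"; }
std::string which(S)   { return "S"; }
std::string which(int) { return "int"; }
```

With these declarations, which({1}) selects the initializer_list<int> version even though which(int) would also accept the list. The call which({1,"MKS"}) cannot match a list of int and so selects which(S). A plain which(1) still calls which(int).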

12.2.4. Unspecified Number of Arguments

For some functions, it is not possible to specify the number and type of all arguments expected in a call. To implement such interfaces, we have three choices:

[1] Use a variadic template (§28.6): this allows us to handle an arbitrary number of arbitrary types in a type-safe manner by writing a small template metaprogram that interprets the argument list to determine its meaning and take appropriate actions.

[2] Use an initializer_list as the argument type (§12.2.3). This allows us to handle an arbitrary number of arguments of a single type in a type-safe manner. In many contexts, such homogeneous lists are the most common and important case.

[3] Terminate the argument list with the ellipsis (...), which means “and maybe some more arguments.” This allows us to handle an arbitrary number of (almost) arbitrary types by using some macros from <cstdarg>. This solution is not inherently type-safe and can be hard to use with sophisticated user-defined types. However, this mechanism has been used from the earliest days of C.

The first two mechanisms are described elsewhere, so I describe only the third mechanism (even though I consider it inferior to the others for most uses). For example:

int printf(const char* ...);

This specifies that a call of the standard-library function printf() (§43.3) must have at least one argument, a C-style string, but may or may not have others. For example:

printf("Hello, world!\n");
printf("My name is %s %s\n", first_name, second_name);
printf("%d + %d = %d\n",2,3,5);

Such a function must rely on information not available to the compiler when interpreting its argument list. In the case of printf(), the first argument is a format string containing special character sequences that allow printf() to handle other arguments correctly; %s means “expect a char* argument” and %d means “expect an int argument.” However, the compiler cannot in general ensure that the expected arguments are really provided in a call or that an argument is of the expected type. For example:

#include <cstdio>

int main()
{
std::printf("My name is %s %s\n",2);
}

This is not valid code, but most compilers will not catch this error. At best, it will produce some strange-looking output (try it!).

Clearly, if an argument has not been declared, the compiler does not have the information needed to perform the standard type checking and type conversion for it. In that case, a char or a short is passed as an int and a float is passed as a double. This is not necessarily what the programmer expects.
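A minimal sketch of the float-to-double rule, assuming a hypothetical function sum_all() that is told how many floating-point arguments follow:

```cpp
#include <cassert>
#include <cstdarg>

// Hypothetical function: adds up `count` floating-point arguments.
// A float passed through the ellipsis arrives as a double, so the body
// must ask va_arg for double; asking for float would be an error.
double sum_all(int count, ...)
{
    va_list ap;
    va_start(ap, count);
    double total = 0;
    for (int i = 0; i != count; ++i)
        total += va_arg(ap, double);
    va_end(ap);
    return total;
}
```

A call such as sum_all(2, 1.5f, 2.5) yields 4.0: the float argument has been widened to double before the function sees it. Nothing checks that count matches the number of arguments actually passed; that remains the caller's responsibility.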

A well-designed program needs at most a few functions for which the argument types are not completely specified. Overloaded functions, functions using default arguments, functions taking initializer_list arguments, and variadic templates can be used to take care of type checking in most cases when one would otherwise consider leaving argument types unspecified. Only when both the number of arguments and the types of arguments vary and a variadic template solution is deemed undesirable is the ellipsis necessary.
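For comparison, here is a sketch of the variadic template alternative. The names write_words and join_words are hypothetical; the point is only that every argument keeps its type, so no format string or terminator is needed:

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Base case: no arguments left to write.
inline void write_words(std::ostream&) {}

// Recursive case: write the first argument, then the rest.
template<class T, class... Rest>
void write_words(std::ostream& os, const T& first, const Rest&... rest)
{
    os << first << ' ';
    write_words(os, rest...);
}

// Collects any number of printable arguments into one string,
// each followed by a space.
template<class... Args>
std::string join_words(const Args&... args)
{
    std::ostringstream os;
    write_words(os, args...);
    return os.str();
}
```

A call join_words("prog","with",3,"arguments") produces "prog with 3 arguments " (with a trailing space). An argument that cannot be written to an ostream is rejected at compile time rather than misread at run time.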

The most common use of the ellipsis is to specify an interface to C library functions that were defined before C++ provided alternatives:

int fprintf(FILE*, const char* ...); // from <cstdio>
int execl(const char* ...); // from UNIX header

A standard set of macros for accessing the unspecified arguments in such functions can be found in <cstdarg>. Consider writing an error function that takes one integer argument indicating the severity of the error followed by an arbitrary number of strings. The idea is to compose the error message by passing each word as a separate C-style string argument. The list of string arguments should be terminated by the null pointer:

extern void error(int ...);
extern char* itoa(int, char[]); // int to alpha

int main(int argc, char* argv[])
{
switch (argc) {
case 1:
error(0,argv[0],nullptr);
break;
case 2:
error(0,argv[0],argv[1],nullptr);
break;
default:
char buffer[8];
error(1,argv[0],"with",itoa(argc-1,buffer),"arguments",nullptr);
}
// ...

}

The function itoa() returns a C-style string representing its int argument. It is popular in C, but not part of the C standard.

I always pass argv[0] because that, conventionally, is the name of the program.

Note that using the integer 0 as the terminator would not have been portable: on some implementations, the integer 0 and the null pointer do not have the same representation (§6.2.8). This illustrates the subtleties and extra work that face the programmer once type checking has been suppressed using the ellipsis.

The error() function could be defined like this:

#include <cstdarg>

void error(int severity ...) // "severity" followed by a zero-terminated list of char*s
{
va_list ap;
va_start(ap,severity); // arg startup

for (;;) {
char* p = va_arg(ap,char*);
if (p == nullptr) break;
cerr << p << ' ';
}

va_end(ap); // arg cleanup

cerr << '\n';
if (severity) exit(severity);
}

First, a va_list is defined and initialized by a call of va_start(). The macro va_start() takes the name of the va_list and the name of the last formal argument as arguments. The macro va_arg() is used to pick the unnamed arguments in order. In each call, the programmer must supply a type; va_arg() assumes that an actual argument of that type has been passed, but it typically has no way of ensuring that. Before returning from a function in which va_start() has been used, va_end() must be called. The reason is that va_start() may modify the stack in such a way that a return cannot successfully be done; va_end() undoes any such modifications.

Alternatively, error() could have been defined using a standard-library initializer_list:

void error(int severity, initializer_list<string> err)
{
for (auto& s : err)
cerr << s << ' ';
cerr << '\n';
if (severity) exit(severity);
}

It would then have to be called using the list notation. For example:

switch (argc) {
case 1:
error(0,{argv[0]});
break;
case 2:
error(0,{argv[0],argv[1]});
break;
default:
error(1,{argv[0],"with",to_string(argc-1),"arguments"});
}

The int-to-string conversion function to_string() is provided by the standard library (§36.3.5).

If I didn’t have to mimic C style, I would further simplify the code by passing a container as a single argument:

void error(int severity, const vector<string>& err) // almost as before
{
for (auto& s : err)
cerr << s << ' ';
cerr << '\n';
if (severity) exit(severity);
}

vector<string> arguments(int argc, char* argv[]) // package arguments
{
vector<string> res;
for (int i = 0; i!=argc; ++i)
res.push_back(argv[i]);
return res;
}

int main(int argc, char* argv[])
{
auto args = arguments(argc,argv);
error((args.size()<2)?0:1,args);
// ...

}

The helper function, arguments(), is trivial, and main() and error() are simple. The interface between main() and error() is more general in that it now passes all arguments. That would allow later improvements of error(). The use of the vector<string> is far less error-prone than any use of an unspecified number of arguments.

12.2.5. Default Arguments

A general function often needs more arguments than are necessary to handle simple cases. In particular, functions that construct objects (§16.2.5) often provide several options for flexibility. Consider class complex from §3.2.1.1:

class complex {
double re, im;
public:
complex(double r, double i) :re{r}, im{i} {} // construct complex from two scalars
complex(double r) :re{r}, im{0} {} // construct complex from one scalar
complex() :re{0}, im{0} {} // default complex: {0,0}
// ...
};

The actions of complex’s constructors are quite trivial, but logically there is something odd about having three functions (here, constructors) doing essentially the same task. Also, for many classes, constructors do more work and the repetitiveness is common. We could deal with the repetitiveness by considering one of the constructors “the real one” and forward to that (§17.4.3):

complex(double r, double i) :re{r}, im{i} {} // construct complex from two scalars
complex(double r) :complex{r,0} {} // construct complex from one scalar
complex() :complex{0,0} {} // default complex: {0,0}

Say we wanted to add some debugging, tracing, or statistics-gathering code to complex; we now have a single place to do so. However, this can be abbreviated further:

complex(double r ={}, double i ={}) :re{r}, im{i} {} // construct complex from two scalars

This makes it clear that if a user supplies fewer than the two arguments needed, the default is used. The intent of having a single constructor plus some shorthand notation is now explicit.
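A minimal sketch of the single-constructor form in use. The class name Cplx and its two accessors are hypothetical additions, made only so that the stored values can be inspected:

```cpp
#include <cassert>

// Hypothetical stand-in for the complex class in the text.
class Cplx {
    double re, im;
public:
    Cplx(double r = {}, double i = {}) : re{r}, im{i} {}   // zero, one, or two arguments
    double real() const { return re; }
    double imag() const { return im; }
};
```

Here Cplx{} yields {0,0}, Cplx{1.5} yields {1.5,0}, and Cplx{1.5,2.5} yields {1.5,2.5}, all through the same constructor.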

A default argument is type checked at the time of the function declaration and evaluated at the time of the call. For example:

class X {
public:
static int def_arg;
void f(int =def_arg);
// ...

};

int X::def_arg = 7;

void g(X& a)
{
a.f(); // maybe f(7)
a.def_arg = 9;
a.f(); // f(9)
}

Default arguments that can change value are most often best avoided because they introduce subtle context dependencies.
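The context dependency is easy to reproduce with a free function. In this sketch, default_step and move_by() are hypothetical names:

```cpp
#include <cassert>

int default_step = 1;   // hypothetical global used as a default argument

// The expression default_step is evaluated each time the argument is
// omitted, not once when the function is declared.
int move_by(int pos, int step = default_step)
{
    return pos + step;
}
```

Initially, move_by(10) returns 11. After the assignment default_step = 5, the same call returns 15, although nothing at the call site has changed.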

Default arguments may be provided for trailing arguments only. For example:

int f(int, int =0, char* =nullptr); // OK
int g(int =0, int =0, char*); // error
int h(int =0, int, char* =nullptr); // error

Note that the space between the * and the = is significant (*= is an assignment operator; §10.3):

int nasty(char*=nullptr); // syntax error

A default argument cannot be repeated or changed in a subsequent declaration in the same scope. For example:

void f(int x = 7);
void f(int = 7); // error: cannot repeat default argument
void f(int = 8); // error: different default arguments

void g()
{
void f(int x = 9); // OK: this declaration hides the outer one
// ...
}

Declaring a name in a nested scope so that the name hides a declaration of the same name in an outer scope is error-prone.

12.3. Overloaded Functions

Most often, it is a good idea to give different functions different names, but when different functions conceptually perform the same task on objects of different types, it can be more convenient to give them the same name. Using the same name for operations on different types is called overloading. The technique is already used for the basic operations in C++. That is, there is only one name for addition, +, yet it can be used to add values of integer and floating-point types and combinations of such types. This idea is easily extended to functions defined by the programmer. For example:

void print(int); // print an int
void print(const char*); // print a C-style string

As far as the compiler is concerned, the only thing functions of the same name have in common is that name. Presumably, the functions are in some sense similar, but the language does not constrain or aid the programmer. Thus, overloaded function names are primarily a notational convenience. This convenience is significant for functions with conventional names such as sqrt, print, and open. When a name is semantically significant, this convenience becomes essential. This happens, for example, with operators such as +, *, and <<, in the case of constructors (§16.2.5, §17.1), and in generic programming (§4.5, Chapter 32).

Templates provide a systematic way of defining sets of overloaded functions (§23.5).

12.3.1. Automatic Overload Resolution

When a function fct is called, the compiler must determine which of the functions named fct to invoke. This is done by comparing the types of the actual arguments with the types of the parameters of all functions in scope called fct. The idea is to invoke the function that is the best match to the arguments and give a compile-time error if no function is the best match. For example:

void print(double);
void print(long);

void f()
{
print(1L); // print(long)
print(1.0); // print(double)
print(1); // error, ambiguous: print(long(1)) or print(double(1))?
}

To approximate our notions of what is reasonable, a series of criteria are tried in order:

[1] Exact match; that is, match using no or only trivial conversions (for example, array name to pointer, function name to pointer to function, and T to const T)

[2] Match using promotions; that is, integral promotions (bool to int, char to int, short to int, and their unsigned counterparts; §10.5.1) and float to double

[3] Match using standard conversions (e.g., int to double, double to int, double to long double, Derived* to Base* (§20.2), T* to void* (§7.2.1), int to unsigned int (§10.5))

[4] Match using user-defined conversions (e.g., double to complex<double>; §18.4)

[5] Match using the ellipsis ... in a function declaration (§12.2.4)

If two matches are found at the highest level where a match is found, the call is rejected as ambiguous. The resolution rules are this elaborate primarily to take into account the elaborate C and C++ rules for built-in numeric types (§10.5). For example:

void print(int);
void print(const char*);
void print(double);
void print(long);
void print(char);

void h(char c, int i, short s, float f)
{
print(c); // exact match: invoke print(char)
print(i); // exact match: invoke print(int)
print(s); // integral promotion: invoke print(int)
print(f); // float to double promotion: print(double)

print('a'); // exact match: invoke print(char)
print(49); // exact match: invoke print(int)
print(0); // exact match: invoke print(int)
print("a"); // exact match: invoke print(const char*)
print(nullptr); // nullptr_t to const char* conversion: invoke print(const char*)
}

The call print(0) invokes print(int) because 0 is an int. The call print('a') invokes print(char) because 'a' is a char (§6.2.3.2). The reason to distinguish between conversions and promotions is that we want to prefer safe promotions, such as char to int, over unsafe conversions, such as int to char. See also §12.3.5.
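The ordering of the criteria can be checked with a sketch in which each overload reports its own name. All names here (Meters, Feet, Other, level) are hypothetical:

```cpp
#include <cassert>
#include <string>

struct Meters { Meters(double) {} };     // user-defined conversion from double
struct Feet   { operator Meters() const { return Meters{0.3}; } };
struct Other  {};                        // converts to nothing

std::string level(int)         { return "int"; }
std::string level(const char*) { return "const char*"; }
std::string level(Meters)      { return "Meters"; }
std::string level(...)         { return "..."; }
```

With these declarations, level('a') selects level(int) by promotion and level("hi") is an exact match for level(const char*). The call level(2.5) also selects level(int): the standard conversion double to int is tried before the user-defined conversion to Meters, even though it narrows. level(Feet{}) needs a user-defined conversion and selects level(Meters), and level(Other{}) matches only the ellipsis.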

Overload resolution is independent of the order of declaration of the functions considered.

Function templates are handled by applying the overload resolution rules to the result of specialization based on a set of arguments (§23.5.3). There are separate rules for overloading when a {}-list is used (initializer lists take priority; §12.2.3, §17.3.4.1) and for rvalue reference template arguments (§23.5.2.1).

Overloading relies on a relatively complicated set of rules, and occasionally a programmer will be surprised which function is called. So, why bother? Consider the alternative to overloading. Often, we need similar operations performed on objects of several types. Without overloading, we must define several functions with different names:

void print_int(int);
void print_char(char);
void print_string(const char*); // C-style string

void g(int i, char c, const char* p, double d)
{
print_int(i); // OK
print_char(c); // OK
print_string(p); // OK

print_int(c); // OK? calls print_int(int(c)), prints a number
print_char(i); // OK? calls print_char(char(i)), narrowing
print_string(i); // error
print_int(d); // OK? calls print_int(int(d)), narrowing
}

Compared to the overloaded print(), we have to remember several names and remember to use those correctly. This can be tedious, defeats attempts to do generic programming (§4.5), and generally encourages the programmer to focus on relatively low-level type issues. It can also lead to errors: because there is no overloading, all standard conversions apply to arguments to these functions. In the previous example, this implies that only one of the four calls with doubtful semantics is caught by the compiler. In particular, two calls rely on error-prone narrowing (§2.2.2, §10.5). Thus, overloading can increase the chances that an unsuitable argument will be rejected by the compiler.

12.3.2. Overloading and Return Type

Return types are not considered in overload resolution. The reason is to keep resolution for an individual operator (§18.2.1, §18.2.5) or function call context-independent. Consider:

float sqrt(float);
double sqrt(double);

void f(double da, float fla)
{
float fl = sqrt(da); // call sqrt(double)
double d = sqrt(da); // call sqrt(double)
fl = sqrt(fla); // call sqrt(float)
d = sqrt(fla); // call sqrt(float)
}

If the return type were taken into account, it would no longer be possible to look at a call of sqrt() in isolation and determine which function was called.

12.3.3. Overloading and Scope

Overloading takes place among the members of an overload set. By default, that means the functions of a single scope; functions declared in different non-namespace scopes do not overload. For example:

void f(int);

void g()
{
void f(double);
f(1); // call f(double)
}

Clearly, f(int) would have been the best match for f(1), but only f(double) is in scope. In such cases, local declarations can be added or subtracted to get the desired behavior. As always, intentional hiding can be a useful technique, but unintentional hiding is a source of surprises.

A base class and a derived class provide different scopes so that overloading between a base class function and a derived class function doesn’t happen by default. For example:

struct Base {
void f(int);
};

struct Derived : Base {
void f(double);
};

void g(Derived& d)
{
d.f(1); // call Derived::f(double)
}

When overloading across class scopes (§20.3.5) or namespace scopes (§14.4.5) is wanted, using-declarations or using-directives can be used (§14.2.2). Argument-dependent lookup (§14.2.4) can also lead to overloading across namespaces.
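A sketch of the class-scope case, with and without a using-declaration. The class names Hiding and Merging are hypothetical, and each function returns its own name so that the choice is visible:

```cpp
#include <cassert>
#include <string>

struct Base {
    std::string f(int) { return "Base::f(int)"; }
};

struct Hiding : Base {
    std::string f(double) { return "Hiding::f(double)"; }    // hides Base::f
};

struct Merging : Base {
    using Base::f;                                           // bring Base::f into this scope
    std::string f(double) { return "Merging::f(double)"; }
};
```

For a Hiding object h, h.f(1) calls Hiding::f(double). For a Merging object m, m.f(1) calls Base::f(int) and m.f(1.0) calls Merging::f(double), because both functions now belong to one overload set.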

12.3.4. Resolution for Multiple Arguments

We can use the overload resolution rules to select the most appropriate function when the efficiency or precision of computations differs significantly among types. For example:

int pow(int, int);
double pow(double, double);
complex pow(double, complex);
complex pow(complex, int);
complex pow(complex, complex);


void k(complex z)
{
int i = pow(2,2); // invoke pow(int,int)
double d = pow(2.0,2.0); // invoke pow(double,double)
complex z2 = pow(2,z); // invoke pow(double,complex)
complex z3 = pow(z,2); // invoke pow(complex,int)
complex z4 = pow(z,z); // invoke pow(complex,complex)
}

In the process of choosing among overloaded functions with two or more arguments, a best match is found for each argument using the rules from §12.3. A function that is the best match for one argument and a better or equal match for all other arguments is called. If no such function exists, the call is rejected as ambiguous. For example:

void g()
{
double d = pow(2.0,2); // error: pow(int(2.0),2) or pow(2.0,double(2))?
}

The call is ambiguous because 2.0 is the best match for the first argument of pow(double,double) and 2 is the best match for the second argument of pow(int,int).

12.3.5. Manual Overload Resolution

Declaring too few (or too many) overloaded versions of a function can lead to ambiguities. For example:

void f1(char);
void f1(long);

void f2(char*);
void f2(int*);

void k(int i)
{
f1(i); // ambiguous: f1(char) or f1(long)?
f2(0); // ambiguous: f2(char*) or f2(int*)?
}

Where possible, consider the set of overloaded versions of a function as a whole and see if it makes sense according to the semantics of the function. Often the problem can be solved by adding a version that resolves ambiguities. For example, adding

inline void f1(int n) { f1(long(n)); }

would resolve all ambiguities similar to f1(i) in favor of the larger type long int.

One can also add an explicit type conversion to resolve a specific call. For example:

f2(static_cast<int*>(0));

However, this is most often simply an ugly stopgap. Soon another similar call will be made and have to be dealt with.
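Both remedies can be put together in a sketch. The string return values are an assumption added here so that the selected function can be observed:

```cpp
#include <cassert>
#include <string>

std::string f1(char) { return "f1(char)"; }
std::string f1(long) { return "f1(long)"; }

// Added overload: an int argument is now an exact match here and is
// forwarded explicitly to the long version.
inline std::string f1(int n) { return f1(static_cast<long>(n)); }

std::string f2(char*) { return "f2(char*)"; }
std::string f2(int*)  { return "f2(int*)"; }
```

With the extra overload, f1(i) for an int i compiles and ends up in f1(long). The call f2(static_cast<int*>(nullptr)) selects f2(int*), but every similar call needs its own cast.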

Some C++ novices get irritated by the ambiguity errors reported by the compiler. More experienced programmers appreciate these error messages as useful indicators of design errors.

12.4. Pre- and Postconditions

Every function has some expectations on its arguments. Some of these expectations are expressed in the argument types, but others depend on the actual values passed and on relationships among argument values. The compiler and linker can ensure that arguments are of the right types, but it is up to the programmer to decide what to do about “bad” argument values. We call logical criteria that are supposed to hold when a function is called preconditions, and logical criteria that are supposed to hold when a function returns its postconditions. For example:

int area(int len, int wid)
/*
calculate the area of a rectangle
precondition: len and wid are positive
postcondition: the return value is positive
postcondition: the return value is the area of a rectangle with sides len and wid
*/
{
return len*wid;
}

Here, the statements of the pre- and postconditions are longer than the function body. This may seem excessive, but the information provided is useful to the implementer, to the users of area(), and to testers. For example, we learn that 0 and -12 are not considered valid arguments. Furthermore, we note that we could pass a couple of huge values without violating the precondition, but if len*wid overflows, either or both of the postconditions are not met.

What should we do about a call area(numeric_limits<int>::max(),2)?

[1] Is it the caller’s task to avoid it? Yes, but what if the caller doesn’t?

[2] Is it the implementer’s task to avoid it? If so, how is an error to be handled?

There are several possible answers to these questions. It is easy for a caller to make a mistake and fail to establish a precondition. It is also difficult for an implementer to cheaply, efficiently, and completely check preconditions. We would like to rely on the caller to get the preconditions right, but we need a way to test for correctness. For now, just note that some pre- and postconditions are easy to check (e.g., len is positive and len*wid is positive). Others are semantic in nature and hard to test directly. For example, how do we test “the return value is the area of a rectangle with sides len and wid”? This is a semantic constraint because we have to know the meaning of “area of a rectangle,” and just trying to multiply len and wid again with a precision that precluded overflow could be costly.

It seems that writing out the pre- and postconditions for area() uncovered a subtle problem with this very simple function. This is not uncommon. Writing out pre- and postconditions is a great design tool and provides good documentation. Mechanisms for documenting and enforcing conditions are discussed in §13.4.

If a function depends only on its arguments, its preconditions are on its arguments only. However, we have to be careful about functions that depend on non-local values (e.g., a member function that depends on the state of its object). In essence, we have to consider every nonlocal value read as an implicit argument to a function. Similarly, the postcondition of a function without side effects simply states that a value is correctly computed, but if a function writes to nonlocal objects, its effect must be considered and documented.

The writer of a function has several alternatives, including:

[1] Make sure that every input has a valid result (so that we don’t have a precondition).

[2] Assume that the precondition holds (rely on the caller not to make mistakes).

[3] Check that the precondition holds and throw an exception if it does not.

[4] Check that the precondition holds and terminate the program if it does not.

If a postcondition fails, there was either an unchecked precondition or a programming error. §13.4 discusses ways to represent alternative strategies for checking.
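As one possible reading of alternative [3], here is a sketch of area() that checks its precondition and also guards against overflow. The name checked_area and the choice of exception types are assumptions, not the mechanisms of §13.4:

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical checked variant of area().
int checked_area(int len, int wid)
{
    // precondition: len and wid are positive
    if (len <= 0 || wid <= 0)
        throw std::invalid_argument{"checked_area: sides must be positive"};

    // len*wid fits in an int only if len <= max/wid (both are positive here)
    if (len > std::numeric_limits<int>::max() / wid)
        throw std::overflow_error{"checked_area: len*wid overflows int"};

    return len * wid;
}
```

A call checked_area(3,4) returns 12, checked_area(0,4) throws std::invalid_argument, and checked_area(numeric_limits<int>::max(),2) throws std::overflow_error instead of returning a meaningless value. The price is two comparisons and a division on every call.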

12.5. Pointer to Function

Like a (data) object, the code generated for a function body is placed in memory somewhere, so it has an address. We can have a pointer to a function just as we can have a pointer to an object. However, for a variety of reasons – some related to machine architecture and others to system design – a pointer to function does not allow the code to be modified. There are only two things one can do to a function: call it and take its address. The pointer obtained by taking the address of a function can then be used to call the function. For example:

void error(string s) { /* ... */ }

void (*efct)(string); // pointer to function taking a string argument and returning nothing

void f()
{
efct = &error; // efct points to error
efct("error"); // call error through efct
}

The compiler will discover that efct is a pointer and call the function pointed to. That is, dereferencing a pointer to function using * is optional. Similarly, using & to get the address of a function is optional:

void (*f1)(string) = &error; // OK: same as = error
void (*f2)(string) = error; // OK: same as = &error

void g()
{
f1("Vasa"); // OK: same as (*f1)("Vasa")
(*f1)("Mary Rose"); // OK: as f1("Mary Rose")
}

Pointers to functions have argument types declared just like the functions themselves. In pointer assignments, the complete function type must match exactly. For example:

void (*pf)(string); // pointer to void(string)
void f1(string); // void(string)
int f2(string); // int(string)
void f3(int*); // void(int*)
void f()
{
pf = &f1; // OK
pf = &f2; // error: bad return type
pf = &f3; // error: bad argument type

pf("Hera"); // OK
pf(1); // error: bad argument type

int i = pf("Zeus"); // error: void assigned to int
}

The rules for argument passing are the same for calls directly to a function and for calls to a function through a pointer.
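A small sketch of a call through a pointer that is itself passed as an argument. The names add, mul, Binary_op, and apply_op are hypothetical:

```cpp
#include <cassert>

int add(int a, int b) { return a + b; }
int mul(int a, int b) { return a * b; }

using Binary_op = int(*)(int, int);   // pointer to function taking two ints

// The operation to perform is chosen by the caller and passed as a pointer.
int apply_op(Binary_op op, int a, int b)
{
    return op(a, b);                  // same as (*op)(a,b)
}
```

Both apply_op(add,2,3) and apply_op(&mul,2,3) are accepted; they return 5 and 6, respectively.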

You can convert a pointer to function to a different pointer-to-function type, but you must cast the resulting pointer back to its original type or strange things may happen:

using P1 = int(*)(int*);
using P2 = void(*)(void);

void f(P1 pf)
{
P2 pf2 = reinterpret_cast<P2>(pf);
pf2(); // likely serious problem
P1 pf1 = reinterpret_cast<P1>(pf2); // convert pf2 "back again"
int x = 7;
int y = pf1(&x); // OK
// ...
}

We need the nastiest of casts, reinterpret_cast, to do conversion of pointer-to-function types. The reason is that the result of using a pointer to function of the wrong type is so unpredictable and system-dependent. For example, in the example above, the called function may write to the object pointed to by its argument, but the call pf2() didn’t supply any argument!

Pointers to functions provide a way of parameterizing algorithms. Because C does not have function objects (§3.4.3) or lambda expressions (§11.4), pointers to functions are widely used as function arguments in C-style code. For example, we can provide the comparison operation needed by a sorting function as a pointer to function:

using CFT = int(const void*, const void*);

void ssort(void* base, size_t n, size_t sz, CFT cmp)
/*
Sort the "n" elements of vector "base" into increasing order
using the comparison function pointed to by "cmp".
The elements are of size "sz".

Shell sort (Knuth, Vol3, pg84)
*/

{
for (int gap=n/2; 0<gap; gap/=2)
for (int i=gap; i!=n; i++)
for (int j=i-gap; 0<=j; j-=gap) {
char* b = static_cast<char*>(base); // necessary cast
char* pj = b+j*sz; // &base[j]
char* pjg = b+(j+gap)*sz; // &base[j+gap]
if (cmp(pjg,pj)<0) { // swap base[j] and base[j+gap]:
for (int k=0; k!=sz; k++) {
char temp = pj[k];
pj[k] = pjg[k];
pjg[k] = temp;
}
}
}
}

The ssort() routine does not know the type of the objects it sorts, only the number of elements (the array size), the size of each element, and the function to call to perform a comparison. The type of ssort() was chosen to be the same as the type of the standard C library sort routine, qsort(). Real programs use qsort(), the C++ standard-library algorithm sort (§32.6), or a specialized sort routine. This style of code is common in C, but it is not the most elegant way of expressing this algorithm in C++ (see §23.5, §25.3.4.1).
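Since ssort() shares its interface with qsort(), the same calling style can be sketched with the standard function. The names cmp_int and sort_ints are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Comparison function in the form qsort() expects.
int cmp_int(const void* p, const void* q)
{
    const int a = *static_cast<const int*>(p);
    const int b = *static_cast<const int*>(q);
    return (a > b) - (a < b);    // negative, zero, or positive; cannot overflow
}

// Hypothetical wrapper that hides the element size from callers.
void sort_ints(int* v, std::size_t n)
{
    std::qsort(v, n, sizeof(int), cmp_int);
}
```

For int v[] = {3,1,2}, the call sort_ints(v,3) leaves v as {1,2,3}. The comparison uses (a>b)-(a<b) rather than a-b because a subtraction can overflow for values of large magnitude.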

Such a sort function could be used to sort a table such as this:

struct User {
const char* name;
const char* id;
int dept;
};

vector<User> heads = {
{"Ritchie D.M.", "dmr", 11271},
{"Sethi R.", "ravi", 11272},
{"Szymanski T.G.", "tgs", 11273},
{"Schryer N.L.", "nls", 11274},
{"Schryer N.L.", "nls", 11275},
{"Kernighan B.W.", "bwk", 11276}
};


void print_id(vector<User>& v)
{
for (auto& x : v)
cout << x.name << '\t' << x.id << '\t' << x.dept << '\n';
}

To be able to sort, we must first define appropriate comparison functions. A comparison function must return a negative value if its first argument is less than the second, zero if the arguments are equal, and a positive number otherwise:

int cmp1(const void* p, const void* q) // Compare name strings
{
return strcmp(static_cast<const User*>(p)->name,static_cast<const User*>(q)->name);
}

int cmp2(const void* p, const void* q) // Compare dept numbers
{
return static_cast<const User*>(p)->dept - static_cast<const User*>(q)->dept;
}

There is no implicit conversion of argument or return types when pointers to functions are assigned or initialized. This means that you cannot avoid the ugly and error-prone casts by writing:

int cmp3(const User* p, const User* q) // Compare ids
{
return strcmp(p->id,q->id);
}

The reason is that accepting cmp3 as an argument to ssort() would violate the guarantee that cmp3 will be called with arguments of type const User* (see also §15.2.6).

This program sorts and prints:

int main()
{
cout << "Heads in alphabetical order:\n";
ssort(heads.data(),6,sizeof(User),cmp1); // pass a pointer to the first element
print_id(heads);
cout << '\n';


cout << "Heads in order of department number:\n";
ssort(heads.data(),6,sizeof(User),cmp2); // pass a pointer to the first element
print_id(heads);
}

To compare, we can equivalently write:

int main()
{
cout << "Heads in alphabetical order:\n";
sort(heads.begin(), heads.end(),
[](const User& x, const User& y) { return strcmp(x.name,y.name)<0; } // compare the strings, not the pointers
);
print_id(heads);
cout << '\n';


cout << "Heads in order of department number:\n";
sort(heads.begin(), heads.end(),
[](const User& x, const User& y) { return x.dept<y.dept; }
);
print_id(heads);
}

No mention of sizes is needed nor any helper functions. If the explicit use of begin() and end() is annoying, it can be eliminated by using a version of sort() that takes a container (§14.4.5):

sort(heads,[](const User& x, const User& y) { return strcmp(x.name,y.name)<0; });
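The standard library of this edition has no sort() that takes a container, so such a version has to be supplied. A minimal sketch, placed in a hypothetical namespace Sketch to avoid clashing with std::sort:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

namespace Sketch {   // hypothetical namespace
    // Sort a whole container using the comparison p.
    template<class C, class Pred>
    void sort(C& c, Pred p)
    {
        std::sort(c.begin(), c.end(), p);
    }
}
```

For a vector<int> v holding {3,1,2}, the call Sketch::sort(v,[](int a, int b) { return a<b; }) leaves v as {1,2,3}.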

You can take the address of an overloaded function by assigning to or initializing a pointer to function. In that case, the type of the target is used to select from the set of overloaded functions. For example:

void f(int);
int f(char);

void (*pf1)(int) = &f; // void f(int)
int (*pf2)(char) = &f; // int f(char)
void (*pf3)(char) = &f; // error: no void f(char)
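A runnable sketch of the same selection, using a hypothetical overloaded function twice(). The last line shows a cast naming the wanted overload where the target type alone would not:

```cpp
#include <cassert>

int    twice(int x)    { return 2 * x; }
double twice(double x) { return 2 * x; }

// The declared type of the pointer selects the overload.
int    (*twice_int)(int)       = twice;
double (*twice_double)(double) = &twice;

// An auto variable provides no target type, so a cast names the overload.
auto twice_cast = static_cast<double(*)(double)>(twice);
```

Here twice_int(3) returns 6, and both twice_double(1.5) and twice_cast(1.5) return 3.0.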

It is also possible to take the address of member functions (§20.6), but a pointer to member function is quite different from a pointer to (nonmember) function.

A pointer to a noexcept function can be declared noexcept. For example:

void f(int) noexcept;
void g(int);

void (*p1)(int) = f; // OK: but we throw away useful information
void (*p2)(int) noexcept = f; // OK: we preserve the noexcept information
void (*p3)(int) noexcept = g; // error: we don't know that g doesn't throw

A pointer to function must reflect the linkage of a function (§15.2.6). Neither linkage specification nor noexcept may appear in type aliases:

using Pc = extern "C" void(int); // error: linkage specification in alias
using Pn = void(int) noexcept; // error: noexcept in alias

12.6. Macros

Macros are very important in C but have far fewer uses in C++. The first rule about macros is: don’t use them unless you have to. Almost every macro demonstrates a flaw in the programming language, in the program, or in the programmer. Because they rearrange the program text before the compiler proper sees it, macros are also a major problem for many programming support tools. So when you use macros, you should expect inferior service from tools such as debuggers, cross-reference tools, and profilers. If you must use macros, please read the reference manual for your own implementation of the C++ preprocessor carefully and try not to be too clever. Also, to warn readers, follow the convention to name macros using lots of capital letters. The syntax of macros is presented in §iso.16.3.

I recommend using macros only for conditional compilation (§12.6.1) and in particular for include guards (§15.3.3).

A simple macro is defined like this:

#define NAME rest of line

Where NAME is encountered as a token, it is replaced by rest of line. For example:

named = NAME

will expand into

named = rest of line

A macro can also be defined to take arguments. For example:

#define MAC(x,y) argument1: x argument2: y

When MAC is used, two argument strings must be presented. They will replace x and y when MAC() is expanded. For example:

expanded = MAC(foo bar, yuk yuk)

will be expanded into

expanded = argument1: foo bar argument2: yuk yuk

Macro names cannot be overloaded, and the macro preprocessor cannot handle recursive calls:

#define PRINT(a,b) cout<<(a)<<(b)
#define PRINT(a,b,c) cout<<(a)<<(b)<<(c) /* trouble?: redefines, does not overload */

#define FAC(n) (n>1)?n*FAC(n-1):1 /* trouble: recursive macro */

Macros manipulate character strings and know little about C++ syntax and nothing about C++ types or scope rules. Only the expanded form of a macro is seen by the compiler, so an error in a macro will be reported when the macro is expanded, not when it is defined. This leads to very obscure error messages.

Here are some plausible macros:

#define CASE break;case
#define FOREVER for(;;)

Here are some completely unnecessary macros:

#define PI 3.141593
#define BEGIN {
#define END }

Here are some dangerous macros:

#define SQUARE(a) a*a
#define INCR_xx (xx)++

To see why they are dangerous, try expanding this:

int xx = 0; // global counter

void f(int xx)
{
int y = SQUARE(xx+2); // y=xx+2*xx+2; that is, y=xx+(2*xx)+2
INCR_xx; // increments argument xx (not the global xx)
}
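To make the precedence problem concrete, here is a small sketch that evaluates the unparenthesized macro, a parenthesized variant, and an ordinary constexpr function on the same argument. The names SQUARE_UNSAFE, SQUARE_PAREN, square(), and the three wrapper functions are hypothetical and introduced only for this comparison:

```cpp
#define SQUARE_UNSAFE(a) a*a        /* the dangerous form: argument not parenthesized */
#define SQUARE_PAREN(a) ((a)*(a))   /* parenthesized argument and result */

constexpr int square(int a) { return a*a; }   // a function evaluates its argument first

int unsafe_result(int xx)   { return SQUARE_UNSAFE(xx+2); }  // expands to xx+2*xx+2
int paren_result(int xx)    { return SQUARE_PAREN(xx+2); }   // expands to ((xx+2)*(xx+2))
int function_result(int xx) { return square(xx+2); }         // (xx+2) squared
```

For xx==3, unsafe_result() yields 3+2*3+2, that is, 11, whereas the other two yield 25.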

If you must use a macro, use the scope resolution operator, ::, when referring to global names (§6.3.4) and enclose occurrences of a macro argument name in parentheses whenever possible. For example:

#define MIN(a,b) (((a)<(b))?(a):(b))

This handles the simpler syntax problems (which are often caught by compilers), but not the problems with side effects. For example:

int x = 1;
int y = 10;
int z = MIN(x++,y++); // x becomes 3; y becomes 11
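The double evaluation can be compared with a function call, which evaluates each argument exactly once. A sketch, in which the Outcome struct and the two function names are hypothetical:

```cpp
#include <algorithm>

#define MIN(a,b) (((a)<(b))?(a):(b))

// Hypothetical record of the final values, used only to report the results.
struct Outcome { int x, y, z; };

Outcome with_macro()
{
    int x = 1;
    int y = 10;
    int z = MIN(x++,y++);       // x++ is evaluated twice: once in the test, once as the result
    return {x, y, z};
}

Outcome with_function()
{
    int x = 1;
    int y = 10;
    int z = std::min(x++,y++);  // each argument is evaluated exactly once
    return {x, y, z};
}
```

With the macro, x ends as 3, y as 11, and z as 2; with std::min(), x ends as 2, y as 11, and z as 1.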

If you must write macros complicated enough to require comments, it is wise to use /* */ comments because old C preprocessors that do not know about // comments are sometimes used as part of C++ tools. For example:

#define M2(a) something(a) /* thoughtful comment */

Using macros, you can design your own private language. Even if you prefer this “enhanced language” to plain C++, it will be incomprehensible to most C++ programmers. Furthermore, the preprocessor is a very simple-minded macro processor. When you try to do something nontrivial, you are likely to find it either impossible or unnecessarily hard to do. The auto, constexpr, const, decltype, enum, inline, lambda expressions, namespace, and template mechanisms can be used as better-behaved alternatives to many traditional uses of preprocessor constructs. For example:

const int answer = 42;

template<class T>
inline const T& min(const T& a, const T& b)
{
return (a<b)?a:b;
}

When writing a macro, it is not unusual to need a new name for something. A string can be created by concatenating two strings using the ## macro operator. For example:

#define NAME2(a,b) a##b

int NAME2(hack,cah)();

will produce

int hackcah();

A single # before a parameter name in a replacement string means a string containing the macro argument. For example:

#define printx(x) cout << #x " = " << x << '\n';

int a = 7;
string str = "asdf";

void f()
{
printx(a); // cout << "a" << " = " << a << '\n';
printx(str); // cout << "str" << " = " << str << '\n';
}

Writing #x " = " rather than #x << " = " is obscure “clever code” rather than an error. Adjacent string literals are concatenated (§7.3.2).
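Both operators can be exercised together in a small sketch. To make the result easy to inspect, the output goes to a string stream rather than to cout. The macro SHOW and the function demo() are hypothetical names, and hackcah() is given a body here only so that it can be called:

```cpp
#include <sstream>
#include <string>

#define NAME2(a,b) a##b                                    /* paste two tokens into one */
#define SHOW(os,x) ((os) << #x " = " << (x) << '\n')       /* #x yields the argument as a string literal */

int NAME2(hack,cah)() { return 42; }   // defines int hackcah()

std::string demo()
{
    std::ostringstream os;
    int a = 7;
    SHOW(os,a);            // os << "a = " << (a) << '\n'
    SHOW(os,hackcah());    // os << "hackcah() = " << (hackcah()) << '\n'
    return os.str();
}
```

Unlike printx, SHOW has no trailing semicolon and parenthesizes its arguments, so it can be used wherever an expression is expected.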

The directive

#undef X

ensures that no macro called X is defined – whether or not one was before the directive. This affords some protection against undesired macros. However, it is not always easy to know what the effects of X on a piece of code were supposed to be.

The argument list (“replacement list”) of a macro can be empty:

#define EMPTY() std::cout<<"empty\n"
EMPTY(); // print "empty\n"
EMPTY; // error: macro replacement list missing

I have a hard time thinking of uses of an empty macro argument list that are not error-prone or malicious.

Macros can even be variadic. For example:

#define err_print(...) fprintf(stderr,"error: %s %d\n", __VA_ARGS__)
err_print("The answer",54);

The ellipsis (...) means that __VA_ARGS__ represents the arguments actually passed as a string, so the output is:

error: The answer 54
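The same expansion can be checked without writing to stderr by formatting into a buffer instead. This is a sketch only: FORMAT_ERR and format_answer() are hypothetical, and std::snprintf() is substituted for fprintf() so that the result can be inspected:

```cpp
#include <cstdio>
#include <string>

/* buf must be a char array (not a pointer), so that sizeof(buf) is its capacity */
#define FORMAT_ERR(buf,...) std::snprintf(buf, sizeof(buf), "error: %s %d\n", __VA_ARGS__)

std::string format_answer()
{
    char buf[64];
    FORMAT_ERR(buf,"The answer",54);   // __VA_ARGS__ becomes: "The answer",54
    return std::string(buf);
}
```

As with any printf-style interface, the arguments are not checked against the format string by the language, so the usual cautions about unspecified arguments apply (§12.2.4).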

12.6.1. Conditional Compilation

One use of macros is almost impossible to avoid. The directive

#ifdef IDENTIFIER

does nothing if IDENTIFIER is defined, but if it is not, the directive causes all input to be ignored until a #endif directive is seen. For example:

int f(int a
#ifdef arg_two
,int b
#endif
);

Unless a macro called arg_two has been #defined, this produces:

int f(int a
);

This example confuses tools that assume sane behavior from the programmer.

Most uses of #ifdef are less bizarre, and when used with restraint, #ifdef and its complement #ifndef do little harm. See also §15.3.3.
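A more typical, restrained use selects between two complete declarations rather than splicing a declaration together. In this sketch, the macro names MYLIB_USE_FAST_PATH and MYLIB_USE_LOGGING and the two constants are hypothetical:

```cpp
#define MYLIB_USE_FAST_PATH   /* hypothetical configuration macro; normally set by the build */

#ifdef MYLIB_USE_FAST_PATH
constexpr bool fast_path = true;    // compiled: the macro is defined
#else
constexpr bool fast_path = false;   // skipped
#endif

#ifndef MYLIB_USE_LOGGING
constexpr bool logging = false;     // compiled: the macro is not defined
#else
constexpr bool logging = true;      // skipped
#endif
```

Each branch is a complete declaration, so the source text remains well formed whichever branch is selected.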

Names of the macros used to control #ifdef should be chosen carefully so that they don’t clash with ordinary identifiers. For example:

struct Call_info {
Node* arg_one;
Node* arg_two;
// ...

};

This innocent-looking source text will cause some confusion should someone write:

#define arg_two x

Unfortunately, common and unavoidable headers contain many dangerous and unnecessary macros.

12.6.2. Predefined Macros

A few macros are predefined by the compiler (§iso.16.8, §iso.8.4.1):

__cplusplus: defined in a C++ compilation (and not in a C compilation). Its value is 201103L in a C++11 program; previous C++ standards have lower values.

__DATE__: date of translation in “Mmm dd yyyy” format.

__TIME__: time in “hh:mm:ss” format.

__FILE__: name of current source file.

__LINE__: source line number within the current source file.

__func__: an implementation-defined C-style string naming the current function.

__STDC_HOSTED__: 1 if the implementation is hosted (§6.1.1); otherwise 0.

In addition, a few macros are conditionally defined by the implementation:

__STDC__: defined in a C compilation (and not in a C++ compilation)

__STDC_MB_MIGHT_NEQ_WC__: 1 if, in the encoding for wchar_t, a member of the basic character set (§6.1) might have a code value that differs from its value as an ordinary character literal

__STDCPP_STRICT_POINTER_SAFETY__: 1 if the implementation has strict pointer safety (§34.5); otherwise undefined.

__STDCPP_THREADS__: 1 if a program can have more than one thread of execution; otherwise undefined.

For example:

cout << __func__ << "() in file " << __FILE__ << " on line " << __LINE__ << "\n";
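A sketch of the same idea that builds the message as a string, so that it can be passed on or inspected, follows. Note that the function name is spelled in lowercase, __func__, in standard C++. The function names where() and cpp_version() are hypothetical:

```cpp
#include <sstream>
#include <string>

// Builds a message naming the enclosing function, source file, and line.
std::string where()
{
    std::ostringstream os;
    os << __func__ << "() in file " << __FILE__ << " on line " << __LINE__;
    return os.str();
}

// Reports the value of __cplusplus for the current compilation.
long cpp_version()
{
    return __cplusplus;
}
```

The exact text produced by __func__ and __FILE__ is implementation-defined, so code should not depend on their precise form.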

In addition, most C++ implementations allow a user to define arbitrary macros on the command line or in some other form of compile-time environment. For example, NDEBUG is defined unless the compilation is done in (some implementation-specific) “debug mode” and is used by the assert() macro (§13.4). This can be useful, but it does imply that you can’t be sure of the meaning of a program just by reading its source text.

12.6.3. Pragmas

Implementations often provide facilities that differ from or go beyond what the standard offers. Obviously, the standard cannot specify how such facilities are provided, but one standard syntax is a line of tokens prefixed with the preprocessor directive #pragma. For example:

#pragma foo bar 666 foobar

If possible, #pragmas are best avoided.

12.7. Advice

[1] “Package” meaningful operations as carefully named functions; §12.1.

[2] A function should perform a single logical operation; §12.1.

[3] Keep functions short; §12.1.

[4] Don’t return pointers or references to local variables; §12.1.4.

[5] If a function may have to be evaluated at compile time, declare it constexpr; §12.1.6.

[6] If a function cannot return, mark it [[noreturn]]; §12.1.7.

[7] Use pass-by-value for small objects; §12.2.1.

[8] Use pass-by-const-reference to pass large values that you don’t need to modify; §12.2.1.

[9] Return a result as a return value rather than modifying an object through an argument; §12.2.1.

[10] Use rvalue references to implement move and forwarding; §12.2.1.

[11] Pass a pointer if “no object” is a valid alternative (and represent “no object” by nullptr); §12.2.1.

[12] Use pass-by-non-const-reference only if you have to; §12.2.1.

[13] Use const extensively and consistently; §12.2.1.

[14] Assume that a char* or a const char* argument points to a C-style string; §12.2.2.

[15] Avoid passing arrays as pointers; §12.2.2.

[16] Pass a homogeneous list of unknown length as an initializer_list<T> (or as some other container); §12.2.3.

[17] Avoid unspecified numbers of arguments (...); §12.2.4.

[18] Use overloading when functions perform conceptually the same task on different types; §12.3.

[19] When overloading on integers, provide functions to eliminate common ambiguities; §12.3.5.

[20] Specify preconditions and postconditions for your functions; §12.4.

[21] Prefer function objects (including lambdas) and virtual functions to pointers to functions; §12.5.

[22] Avoid macros; §12.6.

[23] If you must use macros, use ugly names with lots of capital letters; §12.6.