The C++ Programming Language (2013)

Part III: Abstraction Mechanisms

28. Metaprogramming

Trips to fairly unknown regions should be made twice; once to make mistakes and once to correct them.

– John Steinbeck

Introduction

Type Functions

Type Aliases; Type Predicates; Selecting a Function; Traits

Control Structures

Selection; Iteration and Recursion; When to Use Metaprogramming

Conditional Definition: Enable_if

Use of Enable_if; Implementing Enable_if; Enable_if and Concepts; More Enable_if Examples

A Compile-Time List: Tuple

A Simple Output Function; Element Access; make_tuple

Variadic Templates

A Type-Safe printf(); Technical Details; Forwarding; The Standard-Library tuple

SI Units Example

Units; Quantitys; Unit Literals; Utility Functions

Advice

28.1. Introduction

Programming that manipulates program entities, such as classes and functions, is commonly called metaprogramming. I find it useful to think of templates as generators: they are used to make classes and functions. This leads to the notion of template programming as an exercise in writing programs that compute at compile time and generate programs. Variations of this idea have been called two-level programming, multilevel programming, generative programming, and – more commonly – template metaprogramming.

There are two main reasons for using metaprogramming techniques:

• Improved type safety: We can compute the exact types needed for a data structure or algorithm so that we don’t need to directly manipulate low-level data structures (e.g., we can eliminate many uses of explicit type conversion).

• Improved run-time performance: We can compute values at compile time and select functions to be called at run time. That way, we don’t need to do those computations at run time (e.g., we can resolve many examples of polymorphic behavior to direct function calls). In particular, by taking advantage of the type system, we can dramatically improve the opportunities for inlining. Also, by using compact data structures (possibly generated; §27.4.2, §28.5), we make better use of memory with positive effects on both the amount of data we can handle and the execution speed.

Templates were designed to be very general and able to generate optimal code [Stroustrup,1994]. They provide arithmetic, selection, and recursion. In fact, they constitute a complete compile-time functional programming language [Veldhuizen,2003]. That is, templates and their template instantiation mechanism are Turing complete. One demonstration of this was that Eisenecker and Czarnecki wrote a Lisp interpreter in only a few pages using templates [Czarnecki,2000]. The C++ compile-time mechanisms provide a pure functional programming language: You can create values of various types, but there are no variables, assignments, increment operators, etc. The Turing completeness implies the possibility of infinite compilations, but that’s easily taken care of by translation limits (§iso.B). For example, an infinite recursion will be caught by running out of some compile-time resource, such as the number of recursive constexpr calls, the number of nested classes, or the number of recursively nested template instantiations.

Where should we draw the line between generic programming and template metaprogramming? The extreme positions are:

• It is all template metaprogramming: after all, any use of compile-time parameterization implies instantiation that generates “ordinary code.”

• It is all generic programming: after all, we are just defining and using generic types and algorithms.

Both of these positions are useless because they basically define generic programming and template metaprogramming as synonyms. I think there is a useful distinction to be made. A distinction helps us decide between alternative approaches to problems and to focus on what is important for a given problem. When I write a generic type or algorithm, I don’t feel that I am writing a compile-time program. I am not using my programming skills for the compile-time part of my program. Instead, I am focusing on defining requirements on arguments (§24.3). Generic programming is primarily a design philosophy – a programming paradigm, if you must (§1.2.1).

In contrast, metaprogramming is programming. The emphasis is on computation, often involving selection and some form of iteration. Metaprogramming is primarily a set of implementation techniques. I can think of four levels of implementation complexity:

[1] No computation (just pass type and value arguments)

[2] Simple computation (on types or values) not using compile-time tests or iteration, for example, && of Booleans (§24.4) or addition of units (§28.7.1)

[3] Computation using explicit compile-time tests, for example, a compile-time if (§28.3).

[4] Computation using compile-time iteration (in the form of recursion; §28.3.2).

The ordering indicates the level of complexity, with implications for the difficulty of the task, the difficulty of debugging, and the likelihood of error.
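
The four levels can be sketched in a few lines. The names used here (Buffer, Both_integral, Wide, Total_size) are hypothetical and chosen only for illustration; the sketch assumes nothing beyond <type_traits>.

```cpp
#include <cstddef>
#include <type_traits>

// [1] No computation: the type and value arguments are just passed through.
template<typename T, int N>
struct Buffer {
    T data[N];
};

// [2] Simple computation on values: && of two Booleans, no tests or iteration.
template<typename A, typename B>
constexpr bool Both_integral()
{
    return std::is_integral<A>::value && std::is_integral<B>::value;
}

// [3] An explicit compile-time test: pick int for types smaller than int.
template<typename T>
using Wide = typename std::conditional<(sizeof(T)<sizeof(int)), int, T>::type;

// [4] Compile-time iteration, written as recursion over a list of types.
template<typename... Ts>
struct Total_size;

template<>
struct Total_size<> {
    static const std::size_t value = 0;   // terminating case: empty list
};

template<typename T, typename... Ts>
struct Total_size<T,Ts...> {
    static const std::size_t value = sizeof(T)+Total_size<Ts...>::value;
};

static_assert(Both_integral<int,long>(), "int and long are both integral");
static_assert(std::is_same<Wide<double>,double>::value, "double is not widened");
static_assert(Total_size<char,char,char>::value==3, "three chars occupy three bytes");
```

Each step down the list adds a mechanism (a computed value, a test, a recursion) and with it one more place where a programming error can hide.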

So, metaprogramming is a combination of “meta” and programming: a metaprogram is a compile-time computation yielding types or functions to be used at run time. Note that I don’t say “template metaprogramming” because the computation may be done using constexpr functions. Note also that you can rely on other people’s metaprogramming without actually doing metaprogramming yourself: calling a constexpr function hiding a metaprogram (§28.2.2) or extracting the type from a template type function (§28.2.4) is not in itself metaprogramming; it just uses a metaprogram.

Generic programming usually falls into the first, “no computation” category, but it is quite possible to support generic programming using metaprogramming techniques. When doing so, we have to be careful that our interface specifications are precisely defined and correctly implemented. Once we use (meta)programming as part of an interface, the possibility of programming errors creeps in. Without programming, the meaning is directly defined by the language rules.

Generic programming focuses on interface specification, whereas metaprogramming is programming, usually with types as the values.

Overenthusiastic use of metaprogramming can lead to debugging problems and excessive compile times that render some uses unrealistic. As ever, we have to apply common sense. There are many simple uses of metaprogramming that lead to better code (better type safety, lower memory footprint, and lower run time) without exceptional compile-time overhead. Many standard-library components, such as function (§33.5.3), thread (§5.3.1, §42.2.2), and tuple (§34.2.4.2), are examples of relatively simple application of metaprogramming techniques.

This chapter explores the basic metaprogramming techniques and presents the basic building blocks of metaprograms. Chapter 29 offers a more extensive example.

28.2. Type Functions

A type function is a function that either takes at least one type argument or produces at least one type as a result. For example, sizeof(T) is a built-in type function that given a type argument T returns the size of an object (measured in chars; §6.2.8).

Type functions don’t have to look like conventional functions. In fact, most don’t. For example, the standard library’s is_polymorphic<T> takes its argument as a template argument and returns its result as a member called value:

if (is_polymorphic<int>::value) cout << "Big surprise!";

The value member of is_polymorphic is either true or false. Similarly, the standard-library convention is that a type function that returns a type does so by a member called type. For example:

enum class Axis : char { x, y, z };
enum flags { off, x=1, y=x<<1, z=x<<2, t=x<<3 };


typename std::underlying_type<Axis>::type x; // x is a char
typename std::underlying_type<flags>::type y; // y is probably an int (§8.4.2)
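
As a small check of the two conventions, the following sketch asks the compiler to confirm what ::value and ::type yield. Shape is a hypothetical class introduced only for this example.

```cpp
#include <type_traits>

enum class Axis : char { x, y, z };

// Hypothetical polymorphic class: it has at least one virtual function.
struct Shape {
    virtual void draw() {}
    virtual ~Shape() {}
};

// A predicate presents its result as the member value...
static_assert(!std::is_polymorphic<int>::value, "int has no virtual functions");
static_assert(std::is_polymorphic<Shape>::value, "Shape has virtual functions");

// ...and a type function presents its result as the member type.
static_assert(std::is_same<std::underlying_type<Axis>::type,char>::value,
              "Axis was declared with underlying type char");
```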

A type function can take more than one argument and return several result values. For example:

template<typename T, int N>
struct Array_type {
    using type = T;
    static const int dim = N;
    // ...
};

This Array_type is not a standard-library function or even a particularly useful function. I just used it as an excuse to show how to write a simple multi-argument, multi-return-value type function. It can be used like this:

using Array = Array_type<int,3>;

Array::type x; // x is an int
constexpr int s = Array::dim; // s is 3

Type functions are compile-time functions. That is, they can only take arguments (types and values) that are known at compile time and produce results (types and values) that can be used at compile time.

Most type functions take at least one type argument, but there are useful ones that don’t. For example, here is a type function that returns an integer type of the appropriate number of bytes:

template<int N>
struct Integer {
using Error = void;
using type = Select<N,Error,signed char,short,Error,int,Error,Error,Error,long>;
};


typename Integer<4>::type i4 = 8; // 4-byte integer
typename Integer<1>::type i1 = 9; // 1-byte integer

Select is defined and explained in §28.3.1.3. It is of course possible to write templates that take values only and produce values only. I don’t consider those type functions. Also, constexpr functions (§12.1.6) are usually a better way of expressing compile-time computations on values. I can compute a square root at compile time using templates, but why would I want to when I can express the algorithm more cleanly using constexpr functions (§2.2.3, §10.4, §28.3.2)?
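
To make the comparison concrete, here is a minimal sketch of an integer square root written as C++11 constexpr functions. The names isqrt, isqrt_search, and mid_of are hypothetical, the algorithm (a binary search) is only one of several possible choices, and the sketch assumes a small non-negative argument.

```cpp
// Midpoint of [lo,hi], rounded up so that the search below always makes progress.
constexpr int mid_of(int lo, int hi)
{
    return (lo+hi+1)/2;
}

// Largest r in [lo,hi] with r*r<=n, found by binary search.
// C++11 constexpr functions are limited to a single return statement, hence ?: and recursion.
constexpr int isqrt_search(int n, int lo, int hi)
{
    return lo==hi ? lo
         : (n/mid_of(lo,hi) < mid_of(lo,hi)) ? isqrt_search(n,lo,mid_of(lo,hi)-1)
         : isqrt_search(n,mid_of(lo,hi),hi);
}

// Integer square root of a small non-negative n (no overflow checks in this sketch).
constexpr int isqrt(int n)
{
    return isqrt_search(n,0,n);
}

static_assert(isqrt(16)==4, "exact square");
static_assert(isqrt(17)==4, "result is rounded down");
```

The same function can also be called with a run-time argument, which a template taking N as a template argument could not offer.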

So, C++ type functions are mostly templates. They can perform very general computations using types and values. They are the backbone of metaprogramming. For example, we might want to allocate an object on the stack provided that it is small and on the free store otherwise:

constexpr int on_stack_max = sizeof(std::string); // max size of object we want on the stack

template<typename T>
struct Obj_holder {
using type = typename std::conditional<(sizeof(T)<=on_stack_max),

Scoped<T>, // first alternative
On_heap<T> // second alternative
>::type;
};

The standard-library template conditional is a compile-time selector between two alternatives. If its first argument evaluates to true, the result (presented as the member type) is the second argument; otherwise, the result is the third argument. §28.3.1.1 shows how conditional is implemented. In this case, Obj_holder<X>’s type is defined to be Scoped<X> if an object of X is small and On_heap<X> if it is large. Obj_holder can be used like this:

void f()
{
    typename Obj_holder<double>::type v1; // the double goes on the stack
    typename Obj_holder<array<double,200>>::type v2; // the array goes on the free store
    // ...
    *v1 = 7.7; // Scoped provides pointer-like access (* and [])
    v2[77] = 9.9; // On_heap provides pointer-like access (* and [])
    // ...
}

The Obj_holder example is not hypothetical. For example, the C++ standard contains the following comment in its definition of the function type (§33.5.3) for holding function-like entities: “Implementations are encouraged to avoid the use of dynamically allocated memory for small callable objects, for example, where f’s target is an object holding only a pointer or reference to an object and a member function pointer” (§iso.20.8.11.2.1). It would be hard to follow that advice without something like Obj_holder.

How are Scoped and On_heap implemented? Their implementations are trivial and do not involve any metaprogramming, but here they are:

template<typename T>
struct On_heap {
    On_heap() :p(new T) {} // allocate
    ~On_heap() { delete p; } // deallocate

    T& operator*() { return *p; }
    T* operator->() { return p; }

    On_heap(const On_heap&) = delete; // prevent copying
    On_heap operator=(const On_heap&) = delete;
private:
    T* p; // pointer to object on the free store
};

template<typename T>
struct Scoped {
    Scoped() = default; // needed: declaring a copy constructor suppresses the implicit default constructor

    T& operator*() { return x; }
    T* operator->() { return &x; }

    Scoped(const Scoped&) = delete; // prevent copying
    Scoped operator=(const Scoped&) = delete;
private:
    T x; // the object
};

On_heap and Scoped provide good examples of how generic programming and template metaprogramming require us to devise uniform interfaces to different implementations of a general idea (here, the idea of allocation of an object).

Both On_heap and Scoped can be used as members as well as local variables. On_heap always places its object on the free store, whereas Scoped contains its object.
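
Putting the pieces together, the following is a self-contained sketch of the Obj_holder idea that a compiler can check. It makes a few assumptions of its own: both holders value-initialize their object, Scoped is given an explicit default constructor, on_stack_max is a std::size_t, and the array element is reached through operator* because this sketch does not define operator[]. Big and holder_demo are hypothetical names.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <type_traits>

template<typename T>
struct On_heap {
    On_heap() :p(new T{}) {}             // allocate on the free store
    ~On_heap() { delete p; }             // deallocate
    T& operator*() { return *p; }
    T* operator->() { return p; }
    On_heap(const On_heap&) = delete;    // prevent copying
    On_heap& operator=(const On_heap&) = delete;
private:
    T* p;
};

template<typename T>
struct Scoped {
    Scoped() :x{} {}                     // the object lives inside the holder
    T& operator*() { return x; }
    T* operator->() { return &x; }
    Scoped(const Scoped&) = delete;      // prevent copying
    Scoped& operator=(const Scoped&) = delete;
private:
    T x;
};

constexpr std::size_t on_stack_max = sizeof(std::string);

template<typename T>
struct Obj_holder {
    using type = typename std::conditional<(sizeof(T)<=on_stack_max),
                                           Scoped<T>, On_heap<T>>::type;
};

using Big = std::array<double,200>;      // 200 doubles: larger than a std::string

static_assert(std::is_same<Obj_holder<double>::type,Scoped<double>>::value, "small: held directly");
static_assert(std::is_same<Obj_holder<Big>::type,On_heap<Big>>::value, "large: held on the free store");

double holder_demo()
{
    Obj_holder<double>::type v1;         // a Scoped<double>
    Obj_holder<Big>::type v2;            // an On_heap<Big>
    *v1 = 7.7;                           // same access syntax for both holders
    (*v2)[77] = 9.9;
    return *v1+(*v2)[77];
}
```

The user code in holder_demo is identical in form for both holders; only the type function knows which storage strategy was chosen.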

§28.6 shows how we can implement versions of On_heap and Scoped for types that take constructor arguments.

28.2.1. Type Aliases

Note how the implementation details of Obj_holder (as for Integer) shine through when we use typename and ::type to extract the member type. This is a consequence of the way the language is specified and used, this is the way template metaprogramming code has been written for the last 15 years, and this is the way it appears in the C++11 standard. I consider it insufferable. It reminds me of the bad old days in C, where every occurrence of a user-defined type had to be prefixed with the struct keyword. By introducing a template alias (§23.6), we can hide the ::type implementation details and make a type function look much more like a function returning a type (or like a type). For example:

template<typename T>
using Holder = typename Obj_holder<T>::type;


void f2()
{
    Holder<double> v1; // the double goes on the stack
    Holder<array<double,200>> v2; // the array goes on the free store
    // ...
    *v1 = 7.7; // Scoped provides pointer-like access (* and [])
    v2[77] = 9.9; // On_heap provides pointer-like access (* and [])
    // ...
}

Except when explaining an implementation or what the standard specifically offers, I use such type aliases systematically. When the standard provides a type function (called something like “type property predicate” or “composite type category predicate”), such as conditional, I define a corresponding type alias (§35.4.1):

template<bool C, typename T, typename F>
using Conditional = typename std::conditional<C,T,F>::type;

Please note that these aliases are unfortunately not part of the standard.

28.2.1.1. When Not to Use an Alias

There is one case in which it is significant to use ::type directly, rather than an alias. If only one of the alternatives is supposed to be a valid type, we should not use an alias. Consider first a simple analogy:

if (p) {
    p->f(7);
    // ...
}

It is important that we don’t enter the block if p is the nullptr. We are using the test to see if p is valid. Similarly, we might want to test to see if a type is valid. For example:

conditional<
is_integral<T>::value,
make_unsigned<T>,
Error<T>
>::type

Here, we test if T is an integral type (using the std::is_integral type predicate) and make the unsigned variant of that type if it is (using the std::make_unsigned type function). If that succeeds, we have an unsigned type; otherwise, we will have to deal with the Error indicator.

Had we written Make_unsigned<T> meaning

typename make_unsigned<T>::type

and tried to use it for a nonintegral type, say std::string, we would have tried to make a nonexistent type (make_unsigned<std::string>::type). The result would have been a compile-time error.

In the rare cases where we can’t use aliases consistently to hide ::type, we can fall back on the more explicit, implementation-oriented ::type style. Alternatively, we can introduce a Delay type function to delay evaluation of a type function until its use:

Conditional<
is_integral<T>::value,
Delay<Make_unsigned,T>,
Error<T>
>

The implementation of a perfect Delay function is nontrivial, but for many uses this will do:

template<template<typename...> class F, typename... Args>
using Delay = F<Args...>;

This uses a template template argument (§25.2.4) and variadic templates (§28.6).

Independently of which solution we choose to avoid the undesired instantiation, this is the kind of expert territory that I enter only with some trepidation.
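
One way to see the explicit ::type style at work is to select between two type functions first and apply ::type only to the one selected. This is a sketch under stated assumptions: Identity and Unsigned_or_same are hypothetical names, and Identity stands in for whatever error indicator a real design would use.

```cpp
#include <string>
#include <type_traits>

// Hypothetical helper: a type function that returns its argument unchanged.
template<typename T>
struct Identity {
    using type = T;
};

// std::make_unsigned<T> is only named here, not instantiated.
// The trailing ::type is applied to whichever type function conditional selects,
// so make_unsigned<T>::type is never formed for a nonintegral T.
template<typename T>
using Unsigned_or_same = typename std::conditional<
    std::is_integral<T>::value,
    std::make_unsigned<T>,
    Identity<T>
>::type::type;

static_assert(std::is_same<Unsigned_or_same<int>,unsigned int>::value,
              "integral types get their unsigned variant");
static_assert(std::is_same<Unsigned_or_same<std::string>,std::string>::value,
              "nonintegral types are left alone");
```

Had both alternatives been written with eager aliases, the second static_assert would not compile, because the invalid alternative would be evaluated before the selection took place.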

28.2.2. Type Predicates

A predicate is a function that returns a Boolean value. If you want to write functions that take arguments that are types, it seems obvious that you’d like to ask questions about the arguments’ types. For example: Is this a signed type? Is this type polymorphic (i.e., does it have at least one virtual function)? Is this type derived from that type?

The answers to many such questions are known to the compiler and exposed to the programmer through a set of standard-library type predicates (§35.4.1). For example:

template<typename T>
void copy(T* p, const T* q, int n)
{
    if (std::is_pod<T>::value)
        memcpy(p,q,n*sizeof(T)); // use optimized memory copy (memcpy() counts bytes)
    else
        for (int i=0; i!=n; ++i)
            p[i] = q[i]; // copy individual values
}

Here, we try to optimize the copy by using the (supposedly optimal) standard-library function memcpy() when we can treat the objects as “plain old data” (POD; §8.2.6). If not, we copy the objects one by one (potentially) using their copy constructor. We determine whether the template argument type is a POD by the standard-library type predicate is_pod. The result is presented by the member value. This standard-library convention is similar to the way type functions present their result as a member type.

The std::is_pod predicate is one of the many provided by the standard library (§35.4.1). Since the rules for being a POD are tricky, is_pod is most likely a compiler intrinsic rather than implemented in the library as C++ code.

Like the ::type convention, the ::value convention causes verbosity and is a departure from conventional notation that lets implementation details shine through: a function returning a bool should be called using ():

template<typename T>
void copy(T* p, const T* q, int n)
{
    if (is_pod<T>())
        // ...
}

Fortunately, the standard supports that for all standard-library type predicates. Unfortunately, for language-technical reasons, this resolution is not available in the context of a template argument. For example:

template<typename T>
void do_something()
{
    Conditional<is_pod<T>(),On_heap<T>,Scoped<T>> x; // error: is_pod<T>() is a type
    // ...
}

In particular, is_pod<T>() is interpreted as the type of a function taking no argument and returning an is_pod<T> (§iso.14.3[2]).

My solution is to add functions to provide the conventional notation in all contexts:

template<typename T>
constexpr bool Is_pod()
{
return std::is_pod<T>::value;
}

I capitalize the names of these type functions to avoid confusion with the standard-library versions. In addition, I keep them in a separate namespace (Estd).

We can define our own type predicates. For example:

template<typename T>
constexpr bool Is_big()
{
return 100<sizeof(T);
}

We could use this (rather crude) notion of “big” like this:

template<typename T>
using Obj_holder = Conditional<(Is_big<T>()), On_heap<T>, Scoped<T>>; // big objects go on the free store

It is rarely necessary to define predicates that directly reflect basic properties of types because the standard library provides many of those. Examples are is_integral, is_pointer, is_empty, is_polymorphic, and is_move_assignable (§35.4.1). When we have to define such predicates, we have rather powerful techniques available. For example, we can define a type function to determine whether a class has a member of a given name and of an appropriate type (§28.4.4).

Naturally, type predicates with more than one argument can also be useful. In particular, this is how we represent relations between two types, such as is_same, is_base_of, and is_convertible. These are also from the standard library.

I use Is_* constexpr functions to support the usual () calling syntax for all of these is_* functions.
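
A few such wrappers might look as follows. This is a sketch following the Is_pod pattern; which predicates to wrap and the classes Base and Derived are assumptions made for the example.

```cpp
#include <type_traits>

namespace Estd {
    template<typename T>
    constexpr bool Is_polymorphic()
    {
        return std::is_polymorphic<T>::value;
    }

    template<typename T, typename U>
    constexpr bool Is_same()
    {
        return std::is_same<T,U>::value;
    }

    template<typename B, typename D>
    constexpr bool Is_base_of()
    {
        return std::is_base_of<B,D>::value;
    }

    template<bool C, typename T, typename F>
    using Conditional = typename std::conditional<C,T,F>::type;
}

// Hypothetical classes used only to exercise the predicates.
struct Base { virtual ~Base() {} };
struct Derived : Base {};

// Is_polymorphic<Base> names a function, not a type, so Is_polymorphic<Base>()
// is an ordinary constant expression and is accepted as a template argument.
using Chosen = Estd::Conditional<Estd::Is_polymorphic<Base>(),Base*,Base>;

static_assert(Estd::Is_base_of<Base,Derived>(), "a relation between two types");
static_assert(Estd::Is_same<Chosen,Base*>(), "the polymorphic class is handled through a pointer");
```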

28.2.3. Selecting a Function

A function object is an object of some type, so the techniques for selecting types and values can be used to select a function. For example:

struct X { // write X
    void operator()(int x) { cout <<"X" << x << "!"; }
    // ...
};

struct Y { // write Y
    void operator()(int y) { cout <<"Y" << y << "!"; }
    // ...
};

void f()
{
    Conditional<(sizeof(int)>4),X,Y>{}(7); // make an X or a Y and call it

    using Z = Conditional<(Is_polymorphic<X>()),X,Y>;
    Z zz; // make an X or a Y
    zz(7); // call an X or a Y
}

As shown, a selected function object type can be used immediately or “remembered” for later use.

Classes with member functions computing some value are the most general and flexible mechanism for computation in template metaprogramming.

Conditional is a mechanism for compile-time programming. In particular, this means that the condition must be a constant expression. Note the parentheses around sizeof(int)>4; without those, we would have gotten a syntax error because the compiler would have interpreted the > as the end of the template argument list. For that reason (and others), I prefer to use < (less than) rather than > (greater than). Also, I sometimes use parentheses around conditions for readability.

28.2.4. Traits

The standard library relies heavily on traits. A trait is used to associate properties with a type. For example, the properties of an iterator are defined by its iterator_traits (§33.1.3):

template<typename Iterator>
struct iterator_traits {
using difference_type = typename Iterator::difference_type;
using value_type = typename Iterator::value_type;
using pointer = typename Iterator::pointer;
using reference = typename Iterator::reference;
using iterator_category = typename Iterator::iterator_category;
};

You can see a trait as a type function with many results or as a bundle of type functions.

The standard library provides allocator_traits (§34.4.2), char_traits (§36.2.2), iterator_traits (§33.1.3), regex_traits (§37.5), and pointer_traits (§34.4.3). In addition, it provides time_traits (§35.2.4) and type_traits (§35.4.1), which confusingly are simple type functions.

Given iterator_traits for a pointer, we can talk about the value_type and the difference_type of a pointer even though pointers don’t have members:

template<typename Iter>
Iter search(Iter p, Iter q, typename iterator_traits<Iter>::value_type val)
{
    typename iterator_traits<Iter>::difference_type m = q-p;
    // ...
}

This is a most useful and powerful technique, but:

• It is verbose.

• It often bundles otherwise weakly related type functions.

• It exposes implementation details to users.

Also, people sometimes throw in type aliases “just in case,” leading to unnecessary complexity. Consequently, I prefer to use simple type functions:

template<typename T>
using Value_type = typename std::iterator_traits<T>::value_type;

template<typename T>
using Difference_type = typename std::iterator_traits<T>::difference_type;

template<typename T>
using Iterator_category = typename std::iterator_traits<T>::iterator_category;

The example cleans up nicely:

template<typename Iter>
Iter search(Iter p, Iter q, Value_type<Iter> val)
{
    Difference_type<Iter> m = q-p;
    // ...
}

I suspect that traits are currently overused. Consider how to write the previous example without any mention of traits or other type functions:

template<typename Iter, typename Val>
Iter search(Iter p, Iter q, Val val)
{
    auto x = *p; // if we don't need to name *p's type
    auto m = q-p; // if we don't need to name q-p's type

    using value_type = decltype(*p); // if we want to name *p's type
    using difference_type = decltype(q-p); // if we want to name q-p's type

    // ...
}

Of course, decltype() is a type function, so all I did was to eliminate user-defined and standard-library type functions. Also, auto and decltype are new in C++11, so older code could not have been written this way.

We need a trait (or equivalent, such as decltype()) to associate a type with another type, such as a value_type with a T*. For that, a trait (or an equivalent) is indispensable for non-intrusively adding type names needed for generic programming or metaprogramming. When a trait is used simply to provide a name for something that already has a perfectly good name, such as pointer for value_type* and reference for value_type&, the utility is less clear and the potential for confusion greater. Don’t blindly define traits for everything “just in case.”
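
The following sketch checks these claims for pointers and for a class iterator. The function count_equal is hypothetical and only shows the aliases in a signature. The last assertion points out a detail worth keeping in mind: decltype(*p) names the reference type, not the value_type.

```cpp
#include <cstddef>
#include <iterator>
#include <type_traits>
#include <utility>
#include <vector>

template<typename T>
using Value_type = typename std::iterator_traits<T>::value_type;

template<typename T>
using Difference_type = typename std::iterator_traits<T>::difference_type;

// Pointers have no members, yet iterator_traits supplies their associated types.
static_assert(std::is_same<Value_type<int*>,int>::value, "");
static_assert(std::is_same<Difference_type<int*>,std::ptrdiff_t>::value, "");

// For class iterators the trait forwards to the iterator's own member types.
static_assert(std::is_same<Value_type<std::vector<double>::iterator>,double>::value, "");

// Dereferencing an int* yields an lvalue, so decltype gives int&, not int.
static_assert(std::is_same<decltype(*std::declval<int*>()),int&>::value, "");

// Hypothetical algorithm: count the elements of [p,q) equal to val.
template<typename Iter>
Difference_type<Iter> count_equal(Iter p, Iter q, const Value_type<Iter>& val)
{
    Difference_type<Iter> n = 0;
    for (; p!=q; ++p)
        if (*p==val) ++n;
    return n;
}
```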

28.3. Control Structures

To do general computation at compile time, we need selection and recursion.

28.3.1. Selection

In addition to what is trivially done using ordinary constant expressions (§10.4), I use:

• Conditional: a way of choosing between two types (an alias for std::conditional)

• Select: a way of choosing among several types (defined in §28.3.1.3)

These type functions return types. If you want to choose among values, ?: is sufficient; Conditional and Select are for selecting types. They are not simply compile-time equivalents to if and switch even though they can appear to be when they are used to choose among function objects (§3.4.3, §19.2.2).

28.3.1.1. Selecting between Two Types

It is surprisingly simple to implement Conditional, as used in §28.2. The conditional template is part of the standard library (in <type_traits>), so we don’t have to implement it, but it illustrates an important technique:

template<bool C, typename T, typename F> // general template
struct conditional {
using type = T;
};


template<typename T, typename F> // specialization for false
struct conditional<false,T,F> {
using type = F;
};

The primary template (§25.3.1.1) simply defines its type to be T (the first template parameter after the condition). If the condition is not true, the specialization for false is chosen and type is defined to be F. For example:

typename conditional<(std::is_polymorphic<T>::value),X,Y>::type z;

Obviously, the syntax leaves a bit to be desired (§28.2.2), but the underlying logic is beautiful.

Specialization is used to separate the general case from one or more specialized ones (§25.3). In this example, the primary template takes care of exactly half of the functionality, but that fraction can vary from nothing (every nonerroneous case is handled by a specialization; §25.3.1.1) to all but a single terminating case (§28.5). This form of selection is completely compile-time and doesn’t cost a byte or a cycle at run time.

To improve the syntax, I introduce a type alias:

template<bool B, typename T, typename F>
using Conditional = typename std::conditional<B,T,F>::type;

Given that, we can write:

Conditional<(Is_polymorphic<T>()),X,Y> z;

I consider that a significant improvement.

28.3.1.2. Compile Time vs. Run Time

Looking at something like

Conditional<(std::is_polymorphic<T>::value),X,Y> z;

for the first time, it is not uncommon for people to think, “Why don’t we just write a normal if?” Consider having to choose between two alternatives, Square and Cube:

struct Square {
constexpr int operator()(int i) { return i*i; }
};


struct Cube {
constexpr int operator()(int i) { return i*i*i; }
};

We might try the familiar if-statement:

if (My_cond<T>())
    using Type = Square; // error: declaration as if-statement branch
else
    using Type = Cube; // error: declaration as if-statement branch

Type x; // error: Type is not in scope

A declaration cannot be the only statement of a branch of an if-statement (§6.3.4, §9.4.1), so this will not work even though My_cond<T>() is computed at compile time. Thus, an ordinary if-statement is useful for ordinary expressions, but not for type selection.

Let us try an example that doesn’t involve defining a variable:

Conditional<My_cond<T>(),Square,Cube>{}(99); // invoke Square{}(99) or Cube{}(99)

That is, select a type, construct a default object of that type, and call it. That works. Using “conventional control structures,” this would become:

((My_cond<T>())?Square:Cube){}(99);

This example doesn’t work because Square and Cube are types, whereas the operands of a conditional expression must be values of compatible types (§11.1.3). We could try

(My_cond<T>()?Square{}:Cube{})(99); // error: incompatible arguments for ?:

Unfortunately, this version still suffers from the problem that Square{} and Cube{} are not of compatible types acceptable as alternatives in a ?: expression. The restriction to compatible types is often unacceptable in metaprogramming because we need to choose between types that are not explicitly related.

Finally, this works:

My_cond<T>()?Square{}(99):Cube{}(99);

However, it is not significantly more readable than

Conditional<My_cond<T>(),Square,Cube>{}(99);
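
Both working forms can be checked side by side. In this sketch My_cond is a hypothetical predicate (here simply "T is integral"), apply_selected and apply_conditional_expr are hypothetical wrappers, and operator() is declared const so that the code behaves the same under C++11 and later standards.

```cpp
#include <type_traits>

template<bool B, typename T, typename F>
using Conditional = typename std::conditional<B,T,F>::type;

struct Square {
    constexpr int operator()(int i) const { return i*i; }
};

struct Cube {
    constexpr int operator()(int i) const { return i*i*i; }
};

// Hypothetical condition, assumed only for this example.
template<typename T>
constexpr bool My_cond()
{
    return std::is_integral<T>::value;
}

// Select a type, construct a default object of that type, and call it.
template<typename T>
constexpr int apply_selected(int i)
{
    return Conditional<My_cond<T>(),Square,Cube>{}(i);
}

// Select between two int values; this works because both operands are ints.
template<typename T>
constexpr int apply_conditional_expr(int i)
{
    return My_cond<T>() ? Square{}(i) : Cube{}(i);
}

static_assert(apply_selected<int>(3)==9, "Square is selected for int");
static_assert(apply_selected<double>(3)==27, "Cube is selected for double");
static_assert(apply_conditional_expr<double>(3)==apply_selected<double>(3), "both forms agree");
```

The ?: form depends on the two calls returning compatible types; the Conditional form does not, which is why it generalizes to unrelated types.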

28.3.1.3. Selecting among Several Types

Selecting among N alternatives is very similar to choosing between two. Here is a type function returning its Nth argument type:

class Nil {};

template<int I, typename T1 =Nil, typename T2 =Nil, typename T3 =Nil, typename T4 =Nil>
struct select;


template<int I, typename T1 =Nil, typename T2 =Nil, typename T3 =Nil, typename T4 =Nil>
using Select = typename select<I,T1,T2,T3,T4>::type;


// Specializations for 0-3:

template<typename T1, typename T2, typename T3, typename T4>
struct select<0,T1,T2,T3,T4> { using type = T1; }; // specialize for N==0

template<typename T1, typename T2, typename T3, typename T4>
struct select<1,T1,T2,T3,T4> { using type = T2; }; // specialize for N==1

template<typename T1, typename T2, typename T3, typename T4>
struct select<2,T1,T2,T3,T4> { using type = T3; }; // specialize for N==2

template<typename T1, typename T2, typename T3, typename T4>
struct select<3,T1,T2,T3,T4> { using type = T4; }; // specialize for N==3

The general version of select should never be used, so I didn’t define it. I chose zero-based numbering to match the rest of C++. This technique is perfectly general: such specializations can present any aspect of the template arguments. We don’t really want to pick a maximum number of alternatives (here, four), but that problem can be addressed using variadic templates (§28.6). The result of picking a nonexisting alternative is to use the primary (general) template. For example:

Select<5,int,double,char> x;

In this case, that would lead to an immediate compile-time error as the general select isn’t defined.

A realistic use would be to select the type for a function returning the Nth element of a tuple:

template<int N, typename T1, typename T2, typename T3, typename T4>
Select<N,T1,T2,T3,T4> get(Tuple<T1,T2,T3,T4>& t); // see §28.5.2

auto x = get<2>(t); // assume that t is a Tuple

Here, the type of x will be whatever T3 is for the Tuple called t. Indexing into tuples is zero-based.

Using variadic templates (§28.6), we can provide a far simpler and more general select:

template<unsigned N, typename... Cases> // general case; never instantiated
struct select;

template<unsigned N, typename T, typename... Cases>
struct select<N,T,Cases...> :select<N-1,Cases...> {
};


template<typename T, typename... Cases> // final case: N==0
struct select<0,T,Cases...> {
using type = T;
};


template<unsigned N, typename... Cases>
using Select = typename select<N,Cases...>::type;
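
As a check that the variadic version behaves as described, the sketch below repeats it and applies it to the Integer example from §28.2. The enclosing namespace sketch is an assumption of this example, added only to keep the name select away from any unrelated global declarations.

```cpp
#include <type_traits>

namespace sketch {
    template<unsigned N, typename... Cases>   // general case; never instantiated
    struct select;

    template<unsigned N, typename T, typename... Cases>
    struct select<N,T,Cases...> :select<N-1,Cases...> {   // skip T and count down
    };

    template<typename T, typename... Cases>   // final case: N==0
    struct select<0,T,Cases...> {
        using type = T;
    };

    template<unsigned N, typename... Cases>
    using Select = typename select<N,Cases...>::type;

    // An integer type chosen by its position in the list of alternatives.
    template<int N>
    struct Integer {
        using Error = void;
        using type = Select<N,Error,signed char,short,Error,int,Error,Error,Error,long>;
    };
}

static_assert(std::is_same<sketch::Select<0,int,double,char>,int>::value, "indexing is zero-based");
static_assert(std::is_same<sketch::Select<2,int,double,char>,char>::value, "");
static_assert(std::is_same<sketch::Integer<4>::type,int>::value, "alternative 4 is int");
```

An out-of-range index such as Select<5,int,double,char> eventually reaches select<2> with an empty list of cases; only the undefined general template matches that, so the compiler reports an error.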

28.3.2. Iteration and Recursion

The basic techniques for calculating a value at compile time can be illustrated by a factorial function template:

template<int N>
constexpr int fac()
{
    return N*fac<N-1>();
}


template<>
constexpr int fac<1>()
{
return 1;
}


constexpr int x5 = fac<5>();

The factorial is implemented using recursion, rather than using a loop. Since we don’t have variables at compile time (§10.4), that makes sense. In general, if we want to iterate over a set of values at compile time, we use recursion.

Note the absence of a condition: there is no N==1 or N<2 test. Instead, the recursion is terminated when the call of fac() selects the specialization for N==1. In template metaprogramming (as in functional programming), the idiomatic way of working your way through a sequence of values is to recurse until you reach a terminating specialization.

In this case, we could also do the computation in the more conventional way:

constexpr int fac(int i)
{
    return (i<2)?1:i*fac(i-1);
}


constexpr int x6 = fac(6);

I find this clearer than the function template expression of the idea, but tastes vary and there are algorithms that are best expressed by separating the terminating case from the general case. The non-template version is marginally easier for a compiler to handle. The run-time performance will, of course, be identical.

The constexpr version can be used for both compile-time and run-time evaluation. The template (metaprogramming) version is for compile-time use only.
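The difference between the two styles can be made concrete. In this sketch, both versions are checked at compile time, but only the conventional one accepts an argument that is not known until run time. The helper fac_at_run_time() is a hypothetical name introduced only for the illustration:

```cpp
// Template version: the recursion stops at the specialization for N==1.
template<int N>
constexpr int fac()
{
    return N*fac<N-1>();
}

template<>
constexpr int fac<1>()
{
    return 1;
}

// Conventional version: an ordinary condition terminates the recursion.
constexpr int fac(int i)
{
    return (i<2) ? 1 : i*fac(i-1);
}

static_assert(fac<5>()==120, "computed by template instantiation");
static_assert(fac(6)==720, "computed by constexpr evaluation");

int fac_at_run_time(int n)   // n is not a constant expression here
{
    return fac(n);           // fine for fac(int); fac<n>() would not compile
}
```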

28.3.2.1. Recursion Using Classes

Iteration involving more complicated state or more elaborate parameterization can be handled using classes. For example, the factorial program becomes:

template<int N>
struct Fac {
static const int value = N*Fac<N-1>::value;
};


template<>
struct Fac<1> {
static const int value = 1;
};


constexpr int x7 = Fac<7>::value;

For a more realistic example, see §28.5.2.
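One kind of "more elaborate parameterization" for which a class is needed is termination on just one of several parameters: a class template can be partially specialized, whereas a function template cannot. A minimal sketch, where Pow is a hypothetical name and not a standard-library facility:

```cpp
// Pow<Base,Exp>::value is Base raised to the power Exp.
template<int Base, unsigned Exp>
struct Pow {
    static const int value = Base*Pow<Base,Exp-1>::value;
};

template<int Base>          // partial specialization: Exp==0 terminates for every Base
struct Pow<Base,0> {
    static const int value = 1;
};

constexpr int x8 = Pow<2,10>::value;
static_assert(x8==1024, "2 to the 10th power");
```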

28.3.3. When to Use Metaprogramming

Using the control structures described here, you can compute absolutely everything at compile time (translation limits permitting). The question remains: Why would you want to? We should use these techniques if and when they yield cleaner, better-performing, and easier-to-maintain code than alternative techniques. The most obvious constraint on metaprogramming is that code depending on complicated uses of templates can be hard to read and very hard to debug. Nontrivial uses of templates can also impact compile times. If you have a hard time understanding what is going on in code requiring complicated patterns of instantiation, so might the compiler. Worse still, so may the programmer who gets to maintain your code.

Template metaprogramming attracts clever people:

• Partly, that’s because metaprogramming allows us to express things that simply can’t be done at the same level of type safety and run-time performance. When the improvements are significant and the code maintainable, these are good – sometimes even compelling – reasons.

• Partly, that’s because metaprogramming allows us to show off our cleverness. Obviously, that is to be avoided.

How would you know that you have gone too far with metaprogramming? One warning sign that I use is an urge to use macros (§12.6) to hide “details” that have become too ugly to deal with directly. Consider:

#define IF(c,x,y) typename std::conditional<(c),x,y>::type

Is this going too far? It allows us to write

IF(cond,Cube,Square) z;

rather than

typename std::conditional<(cond),Cube,Square>::type z;

I have biased the question by using the very short name IF and the long form std::conditional.

Similarly, a more complex condition would almost equalize the number of characters used. The fundamental difference is that I have to write typename and ::type to use the standard’s terminology. That exposes the template implementation technique. I would like to hide that, and the macro does. However, if many people need to collaborate and programs get large, a bit of verbosity is preferable to a divergence of notations.

Another serious argument against the IF macro is that its name is misleading: conditional is not a “drop-in replacement” for a conventional if. That ::type represents a significant difference: conditional selects between types; it does not directly alter control flow. Sometimes it is used to select a function and thus represent a branch in a computation; sometimes it is not. The IF macro hides an essential aspect of its function. Similar objections can be leveled at many other “sensible” macros: they are named for some programmer’s particular idea of their use, rather than reflecting fundamental functionality.

In this case, the problems of verbosity, of the implementation details leaking out, and of poor naming are easily handled by a type alias (Conditional; §28.2.1). In general, look hard for ways to clean up the syntax presented to users without inventing a private language. Prefer systematic techniques, such as specialization and the use of aliases, to macro hackery. Prefer constexpr functions to templates for compile-time computation, and hide template metaprogramming implementation details in constexpr functions whenever feasible (§28.2.2).
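For comparison with the IF macro, this is roughly what the alias-based alternative looks like. The sketch assumes empty placeholder classes Cube and Square and a constant cond standing in for some real compile-time condition:

```cpp
#include <type_traits>

template<bool C, typename T, typename F>
using Conditional = typename std::conditional<C,T,F>::type;

struct Square { };   // placeholder types, for illustration only
struct Cube { };

constexpr bool cond = true;   // stands for some compile-time condition

Conditional<cond,Cube,Square> z;   // no macro, no typename, no ::type

static_assert(std::is_same<decltype(z),Cube>::value, "cond is true, so z is a Cube");
```

Unlike the macro, Conditional obeys the usual scope and naming rules, and its name says that it selects a type.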

Alternatively, we can look at the fundamental complexity of what we are trying to do:

[1] Does it require explicit tests?

[2] Does it require recursion?

[3] Can we write concepts (§24.3) for our template arguments?

If the answer to question [1] or [2] is “yes” or the answer to question [3] is “no,” we should consider whether there may be maintenance problems. Maybe some form of encapsulation is possible? Remember that complexities of a template implementation become visible to users (“leak out”) whenever an instantiation fails. Also, many programmers do look into header files where every detail of a metaprogram is immediately exposed.

28.4. Conditional Definition: Enable_if

When we write a template, we sometimes want to provide an operation for some template arguments, but not for others. For example:

template<typename T>
class Smart_pointer {

// ...
T& operator*(); // return reference to whole object
T* operator->(); // select a member (for classes only)
// ...
};

If T is a class, we should provide operator->(), but if T is a built-in type, we simply cannot do so (with the usual semantics). Therefore, we want a language mechanism for saying, “If this type has this property, define the following.” We might try the obvious:

template<typename T>
class Smart_pointer {

// ...
T& operator*(); // return reference to whole object
if (Is_class<T>()) T* operator->(); // syntax error
// ...
};

However, that does not work. C++ does not provide an if that can select among definitions based on a general condition. But, as with Conditional and Select (§28.3.1), there is a way. We can write a somewhat curious type function to make the definition of operator->() conditional. The standard library (in <type_traits>) provides enable_if for that. The Smart_pointer example becomes:

template<typename T>
class Smart_pointer {

// ...
T& operator*(); // return reference to whole object
Enable_if<Is_class<T>(),T>* operator->(); // select a member (for classes only)
// ...
};

As usual, I have used type aliases and constexpr functions to simplify the notation:

template<bool B, typename T>
using Enable_if = typename std::enable_if<B,T>::type;


template<typename T> constexpr bool Is_class()
{
return std::is_class<T>::value;
}

If Enable_if’s condition evaluates to true, its result is its second argument (here, T). If Enable_if’s condition evaluates to false, the whole function declaration of which it is part is completely ignored. In this case, if T is a class, we get a definition of operator->() returning a T*, and if it is not, we don’t declare anything.
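To try this out, here is a self-contained sketch. It makes three assumptions that are not part of the text above: the pointer is non-owning, there is a constructor from T*, and operator->() is written as a member template with a defaulted parameter U. The last point is there so that the condition depends on a template parameter of the member itself and is therefore tested when -> is used; without it, a compiler is entitled to reject Smart_pointer<double> as soon as the class is instantiated:

```cpp
#include <complex>
#include <type_traits>
#include <utility>

template<bool B, typename T = void>
using Enable_if = typename std::enable_if<B,T>::type;

template<typename T>
constexpr bool Is_class()
{
    return std::is_class<T>::value;
}

template<typename T>
class Smart_pointer {
public:
    explicit Smart_pointer(T* pp) : p{pp} { }   // non-owning, for illustration only

    T& operator*() const { return *p; }          // return reference to whole object

    // Assumption of this sketch: U is a defaulted parameter of the member itself,
    // so the condition is tested when -> is used, not when the class is instantiated.
    template<typename U = T>
    Enable_if<Is_class<U>(),U>* operator->() const { return p; }   // for classes only
private:
    T* p;
};

// Smart_pointer<double> is a perfectly good type; it just has no usable ->.
static_assert(sizeof(Smart_pointer<double>)==sizeof(double*), "only a pointer is stored");

// For a class type, -> yields a pointer to the object.
using Cptr = Smart_pointer<std::complex<double>>;
static_assert(std::is_same<decltype(std::declval<Cptr>().operator->()),std::complex<double>*>::value,
              "operator->() returns T* when T is a class");
```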

Given the definition of Smart_pointer using Enable_if, we get:

void f(Smart_pointer<double> p, Smart_pointer<complex<double>> q)
{
auto d0 = *p; // OK
auto c0 = *q; // OK
auto d1 = q->real(); // OK
auto d2 = p->real(); // error: p doesn't point to a class object
// ...
}

You may consider Smart_pointer and operator->() exotic, but providing (defining) operations conditionally is very common. The standard library provides many examples of conditional definition, such as Alloc::size_type (§34.4.2) and pair being movable if both of its elements are (§34.2.4.1). The language itself defines -> only for pointers to class objects (§8.2).

In this case, the elaboration of the declaration of operator->() with Enable_if simply changes the kind of error we get from examples such as p->real():

• If we unconditionally declare operator->(), we get a “-> used on a non-class pointer” error at instantiation time for the definition of Smart_pointer<double>::operator->().

• If we conditionally declare operator->() using Enable_if and then use -> on a Smart_pointer<double>, we get a “Smart_pointer<double>::operator->() not defined” error at the point of use of Smart_pointer<double>::operator->().

In either case, we do not get an error unless we use -> on a Smart_pointer<T> where T is not a class.

We have moved the error detection and reporting from the implementation of Smart_pointer<T>::operator->() to its declaration. Depending on the compiler and especially on how deep in a nest of template instantiations the error happens, this can make a significant difference. In general, it is preferable to specify templates precisely so as to detect errors early rather than relying on bad instantiations being caught. In this sense, we can see Enable_if as a variant of the idea of a concept (§24.3): it allows a more precise specification of the requirements of a template.

28.4.1. Use of Enable_if

For many uses, the functionality of enable_if is pretty ideal. However, the notation we have to use is often awkward. Consider:

Enable_if<Is_class<T>(),T>* operator->();

The implementation shows through rather dramatically. However, what is actually expressed is pretty close to the minimal ideal:

declare_if (Is_class<T>()) T* operator->(); // not C++

However, C++ does not have a declare_if construct for selecting declarations.

Using Enable_if to decorate the return type places it up front where you can see it and where it logically belongs, because it affects the whole declaration (not just the return type). However, some declarations do not have a return type. Consider two of vector’s constructors:

template<typename T>
class vector {
public:
vector(size_t n, const T& val); // n elements of type T with value val

template<typename Iter>
vector(Iter b, Iter e); // initialize from [b:e)
// ...
};

This looks innocent enough, but the constructor taking a number of elements wrecks its usual havoc. Consider:

vector<int> v(10,20);

Is that 10 elements with the value 20 or an attempt to initialize from [10:20)? The standard requires the former, but the code above would naively pick the latter because an int-to-size_t conversion is required for the first constructor whereas the pair of ints is a perfect match for the template constructor. The problem is that I “forgot” to tell the compiler that the Iter type should be an iterator. However, that can be done:

template<typename T>
class vector {
public:
vector(size_t n, const T& val); // n elements of type T with value val

template<typename Iter, typename =Enable_if<Input_iterator<Iter>(),Iter>>
vector(Iter b, Iter e); // initialize from [b:e)
// ...
};

That (unused) default template argument will be instantiated because we certainly can’t deduce that unused template parameter. This implies that the declaration of vector(Iter,Iter) will fail unless Iter is an Input_iterator (§24.4.4).
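The effect on overload resolution can be observed with a small stand-in. In this sketch, Vec is a hypothetical wrapper around std::vector, and Input_iterator() is deliberately crude: it merely rules out integral types, which is enough to separate the two constructors but is not a real test for iterators:

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

template<bool B, typename T = void>
using Enable_if = typename std::enable_if<B,T>::type;

// Crude stand-in for the Input_iterator predicate: it only excludes integers.
template<typename Iter>
constexpr bool Input_iterator()
{
    return !std::is_integral<Iter>::value;
}

template<typename T>
class Vec {                                  // hypothetical stand-in for vector
public:
    Vec(std::size_t n, const T& val) : elems(n,val) { }   // n elements with value val

    template<typename Iter, typename = Enable_if<Input_iterator<Iter>()>>
    Vec(Iter b, Iter e) : elems(b,e) { }                   // initialize from [b:e)

    std::size_t size() const { return elems.size(); }
    const T& operator[](std::size_t i) const { return elems[i]; }
private:
    std::vector<T> elems;
};

void use_vec()
{
    Vec<int> v(10,20);          // Iter would be int, so the template is not a candidate
    assert(v.size()==10 && v[0]==20);

    int a[] = {1,2,3};
    Vec<int> w(a,a+3);          // Iter is int*: the iterator constructor is chosen
    assert(w.size()==3 && w[2]==3);
}
```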

I introduced the Enable_if as a default template argument because that is the most general solution. It can be used for templates without arguments and/or without return types. However, in this case, we could alternatively apply it to the constructor argument type:

template<typename T>
class vector {
public:
vector(size_t n, const T& val); // n elements of type T with value val

template<typename Iter>
vector(Enable_if<Input_iterator<Iter>(),Iter> b, Iter e); // initialize from [b:e)
// ...
};

The Enable_if techniques work for template functions (including member functions of class templates and specializations) only. The implementation and use of Enable_if rely on a detail in the rules for overloading function templates (§23.5.3.2). Consequently, it cannot be used to control declarations of classes, variables, or non-template functions. For example:

Enable_if<(version2_2_3<config),M_struct>* make_default() // error: not a template
{
return new M_struct{};
}


template<typename T>
void f(const T& x)
{

Enable_if<(20<sizeof(T)),T> tmp = x; // error: tmp is not a function
Enable_if<!(20<sizeof(T)),T&> tmp = *new T{x}; // error: tmp is not a function
// ...
}

For tmp, using Holder (§28.2) would almost certainly be cleaner anyway: if you had managed to construct that free-store object, how would you delete it?

28.4.2. Implementing Enable_if

The implementation of Enable_if is almost trivial:

template<bool B, typename T = void>
struct std::enable_if {
typedef T type;
};


template<typename T>
struct std::enable_if<false, T> {}; // no ::type if B==false

template<bool B, typename T = void>
using Enable_if = typename std::enable_if<B,T>::type;

Note that we can leave out the type argument and get void by default.

For a language-technical explanation of how these simple declarations become useful as a fundamental construct see §23.5.3.2.
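Because user code may not add definitions to std, a version to experiment with has to live elsewhere. The following sketch uses a hypothetical namespace named sketch, and a pair of overloads named kind() invented for the occasion, to show the missing ::type removing one candidate from consideration:

```cpp
#include <type_traits>

namespace sketch {   // hypothetical namespace; the real template lives in std
    template<bool B, typename T = void>
    struct enable_if {
        typedef T type;
    };

    template<typename T>
    struct enable_if<false,T> { };   // no ::type if B==false
}

template<bool B, typename T = void>
using Enable_if = typename sketch::enable_if<B,T>::type;

// Two overloads; for any given T exactly one of them survives substitution.
template<typename T>
constexpr Enable_if<std::is_integral<T>::value,int> kind(T) { return 1; }    // integral types

template<typename T>
constexpr Enable_if<!std::is_integral<T>::value,int> kind(T) { return 2; }   // everything else

static_assert(kind(7)==1, "int is integral");
static_assert(kind(2.5)==2, "double is not");
static_assert(std::is_same<Enable_if<true>,void>::value, "the type argument defaults to void");
```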

28.4.3. Enable_if and Concepts

We can use Enable_if for a wide variety of predicates, including many tests of type properties (§28.3.1.1). Concepts are among the most general and useful predicates we have. Ideally, we would like to overload based on concepts, but lacking language support for concepts, the best we can do is to use Enable_if to select based on constraints. For example:

template<typename T>
Enable_if<Ordered<T>()> fct(T*,T*); // optimized implementation

template<typename T>
Enable_if<!Ordered<T>()> fct(T*,T*); // nonoptimized implementation

Note that Enable_if defaults to void, so fct() is a void function. I’m not sure that using that default increases readability, but we can use fct() like this:

void f(vector<int>& vi, vector<complex<int>>& vc)
{
if (vi.size()==0 || vc.size()==0) throw runtime_error("bad fct arg");
fct(&vi.front(),&vi.back()); // call optimized
fct(&vc.front(),&vc.back()); // call nonoptimized
}

These calls are resolved as described because we can use < for an int but not for a complex<int>. Enable_if resolves to void if we don’t provide a type argument.

28.4.4. More Enable_if Examples

When using Enable_if, sooner or later we want to ask if a class has a member with a specific name and an appropriate type. For many of the standard operations, such as constructors and assignments, the standard library provides a type property predicate, such as is_copy_assignable and is_default_constructible (§35.4.1). However, we can build our own predicates. Consider the question “Can we call f(x) if x is of type X?” Defining has_f to answer that question gives an opportunity to demonstrate some of the techniques used and some of the scaffolding/boilerplate code provided internally in many template metaprogramming libraries (including parts of the standard library). First, define the usual class plus specialization to represent an alternative:

struct substitution_failure { }; // represent a failure to declare something

template<typename T>
struct substitution_succeeded : std::true_type
{ };


template<>
struct substitution_succeeded<substitution_failure> : std::false_type
{ };

Here, substitution_failure is used to represent a substitution failure (§23.5.3.2). We derive from std::true_type unless the argument type is substitution_failure. Obviously, std::true_type and std::false_type are types that represent the values true and false, respectively:

std::true_type::value == true
std::false_type::value == false

We use substitution_succeeded to define the type functions we really want. For example, we might be looking for a function f that we can call as f(x). For that, we can define has_f:

template<typename T>
struct has_f

: substitution_succeeded<typename get_f_result<T>::type>
{};

So, if get_f_result<T> yields a proper type (presumably the return type of a call of f), has_f::value is true_type::value, which is true. If get_f_result<T> doesn’t compile, it returns substitution_failure and has_f::value is false.

So far, so good, but how do we get get_f_result<T> to be substitution_failure if somehow f(x) doesn’t compile for a value x of type X? The definition that achieves that looks innocent enough:

template<typename T>
struct get_f_result {
private:
template<typename X>
static auto check(X const& x) -> decltype(f(x)); // can call f(x)
static substitution_failure check(...); // cannot call f(x)
public:
using type = decltype(check(std::declval<T>()));
};

We simply declare a function check so that check(x) has the same return type as f(x). Obviously, that won’t compile unless we can call f(x). So, that declaration of check fails if we can’t call f(x). In that case, because substitution failure is not an error (SFINAE; §23.5.3.2), we get the second definition of check(), which has substitution_failure as its return type. And, yes, this elaborate piece of trickery fails if our function f was declared to return a substitution_failure.

Note that decltype() does not evaluate its operand.

We managed to turn what looked like a type error into the value false. It would have been simpler if the language had provided a primitive (built-in) operation for doing that conversion; for example:

is_valid(f(x)); // can f(x) be compiled?

However, a language cannot provide everything as a language primitive. Given the scaffolding code, we just have to provide conventional syntax:

template<typename T>
constexpr bool Has_f()
{
return has_f<T>::value;
}

Now, we can write:

template<typename T>
class X {

// ...
Enable_if<Has_f<T>()> use_f(const T& t)
{

// ...
f(t);
// ...
}
// ...
};

X<T> has a member use_f() if and only if f(t) can be called for a T value t.

Note that we cannot simply write:

if (Has_f<decltype(t)>()) f(t);

The call f(t) will be type checked (and fail type checking) even if Has_f<decltype(t)>() returns false.

Given the technique used to define Has_f, we can define Has_foo for any operation or member foo we can think of. The scaffolding is 14 lines of code for each foo. This can get repetitive but is not difficult.
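Assembled in one place, the scaffolding can be exercised like this. The types With_f and Without_f, and the function f declared for With_f, are hypothetical and introduced only so that there is something to test; f is merely declared because decltype() never calls it:

```cpp
#include <type_traits>
#include <utility>

struct substitution_failure { };   // represent a failure to declare something

template<typename T>
struct substitution_succeeded : std::true_type { };

template<>
struct substitution_succeeded<substitution_failure> : std::false_type { };

template<typename T>
struct get_f_result {
private:
    template<typename X>
    static auto check(X const& x) -> decltype(f(x));   // can call f(x)
    static substitution_failure check(...);            // cannot call f(x)
public:
    using type = decltype(check(std::declval<T>()));
};

template<typename T>
struct has_f : substitution_succeeded<typename get_f_result<T>::type> { };

template<typename T>
constexpr bool Has_f()
{
    return has_f<T>::value;
}

// Hypothetical types for the test:
struct With_f { };
int f(const With_f&);      // found by argument-dependent lookup; never called
struct Without_f { };      // no f() accepts a Without_f

static_assert(Has_f<With_f>(), "f(x) compiles for a With_f");
static_assert(!Has_f<Without_f>(), "f(x) does not compile for a Without_f");
```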

This implies that Enable_if<> allows us to choose among overloaded templates based on just about any logical criteria for argument types. For example, we can define a Has_not_equals() type function to check if != is available and use it like this:

template<typename Iter, typename Val>
Enable_if<Has_not_equals<Iter>(),Iter> find(Iter first, Iter last, Val v)
{
while (first!=last && !(*first==v))

++first;
return first;
}

template<typename Iter, typename Val>
Enable_if<!Has_not_equals<Iter>(),Iter> find(Iter first, Iter last, Val v)
{
while (!(first==last) && !(*first==v))

++first;
return first;
}

Such ad hoc overloading easily gets messy and unmanageable. For example, try adding versions that use != for the value comparison (that is, *first!=v, rather than !(*first==v)), when possible. Consequently, I recommend relying on the more structured standard overloading rules (§12.3.1) and specialization rules (§25.3) when there is a choice. For example:

template<typename T>
auto operator!=(const T& a, const T& b) -> decltype(!(a==b))
{
return !(a==b);
}

The rules ensure that if a specific != has already been defined for a type T (as a template or as a non-template function), this definition will not be instantiated. I use decltype() partly to show how in general to derive the return type from a previously defined operator, and partly to handle the rare cases where != returns something different from bool.

Similarly, we can conditionally define >, <=, >=, etc., given a <.
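As a sketch of that last remark, here are >, <=, and >= defined in terms of <, using the same decltype() technique. Version is a hypothetical type that supplies only <, and constexpr is added here only so that the results can be checked at compile time:

```cpp
struct Version {            // hypothetical type that provides only <
    int major_no;
    int minor_no;
};

constexpr bool operator<(const Version& a, const Version& b)
{
    return a.major_no<b.major_no || (a.major_no==b.major_no && a.minor_no<b.minor_no);
}

// Each operator is declared only if the expression in its decltype() is valid for T.
template<typename T>
constexpr auto operator>(const T& a, const T& b) -> decltype(b<a)
{
    return b<a;
}

template<typename T>
constexpr auto operator<=(const T& a, const T& b) -> decltype(!(b<a))
{
    return !(b<a);
}

template<typename T>
constexpr auto operator>=(const T& a, const T& b) -> decltype(!(a<b))
{
    return !(a<b);
}

static_assert((Version{1,2} > Version{1,1}), "> is derived from <");
```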

28.5. A Compile-Time List: Tuple

Here, I’ll demonstrate the basic template metaprogramming techniques in a single simple, but realistic, example. I will define a Tuple with an associated access operation and an output operation. Tuples defined like this have been used industrially for more than a decade. The more elegant and more general std::tuple is presented in §28.6.4 and §34.2.4.2.

The idea is to allow code like this:

Tuple<double, int, char> x {1.1, 42, 'a'};
cout << x << "\n";
cout << get<1>(x) << "\n";

The resulting output is:

{ 1.1, 42, a }
42

The definition of Tuple is fundamentally simple:

template<typename T1=Nil, typename T2=Nil, typename T3=Nil, typename T4=Nil>
struct Tuple : Tuple<T2, T3, T4> { // layout: {T2,T3,T4} before T1
T1 x;

using Base = Tuple<T2, T3, T4>;
Base* base() { return static_cast<Base*>(this); }
const Base* base() const { return static_cast<const Base*>(this); }


Tuple(const T1& t1, const T2& t2, const T3& t3, const T4& t4) :Base{t2,t3,t4}, x{t1} { }
};

So, a Tuple of four elements (often referred to as a 4-tuple) is a Tuple of three elements (a 3-tuple) followed by a fourth element.

We construct a Tuple of four elements with a constructor that takes four values (potentially of four different types). It uses its last three elements (its tail) to initialize its base 3-tuple and the first (its head) to initialize its member x.

Manipulation of the tail of a Tuple – that is of the base class of the Tuple – is important and common in the implementation of the Tuple. Consequently, I provided an alias Base and a pair of member functions base() to simplify manipulation of the base/tail.

Obviously, this definition handles only tuples that really have four elements. Furthermore, it leaves much of the work to the 3-tuple. Tuples with fewer than four elements are defined as specializations:

template<>
struct Tuple<> { Tuple() {} }; // 0-tuple

template<typename T1>
struct Tuple<T1> : Tuple<> { // 1-tuple
T1 x;

using Base = Tuple<>;
Base* base() { return static_cast<Base*>(this); }
const Base* base() const { return static_cast<const Base*>(this); }

Tuple(const T1& t1) :Base{}, x{t1} { }
};


template<typename T1, typename T2>
struct Tuple<T1, T2> : Tuple<T2> { // 2-tuple, layout: T2 before T1
T1 x;

using Base = Tuple<T2>;
Base* base() { return static_cast<Base*>(this); }
const Base* base() const { return static_cast<const Base*>(this); }


Tuple(const T1& t1, const T2& t2) :Base{t2}, x{t1} { }
};


template<typename T1, typename T2, typename T3>
struct Tuple<T1, T2, T3> : Tuple<T2, T3> { // 3-tuple, layout: {T2,T3} before T1
T1 x;
using Base = Tuple<T2, T3>;
Base* base() { return static_cast<Base*>(this); }
const Base* base() const { return static_cast<const Base*>(this); }

Tuple(const T1& t1, const T2& t2, const T3& t3) :Base{t2, t3}, x{t1} { }
};

These declarations are rather repetitive and follow the simple pattern of the first Tuple (the 4-tuple). That definition of a 4-tuple, Tuple, is the primary template and provides the interface to Tuples of all sizes (0, 1, 2, 3, and 4). That is why I had to provide those Nil default template arguments. In fact, they will never be used. Specialization will choose one of the simpler Tuples rather than use Nil.

The way I defined Tuple as a “stack” of derived classes is fairly conventional (e.g., std::tuple is defined similarly; §28.5). It has the curious effect that the first element of a Tuple will (given the usual implementation techniques) get the highest address and that the last element will have the same address as the whole Tuple. For example:

tuple<double,string,int,char>{3.14,string{"Bob"},127,'c'}

can be graphically represented like this:

[Figure: memory layout of the tuple, from the lowest address upward: char, int, string, double]

This opens some interesting optimization possibilities. Consider:

class FO { /* function object with no data members */ };

typedef Tuple<int*, int*> T0;
typedef Tuple<int*, FO> T1;
typedef Tuple<int*, FO, FO> T2;

On my implementation, I got sizeof(T0)==8, sizeof(T1)==4, and sizeof(T2)==4 as the compiler optimizes away the empty base classes. This is called the empty-base optimization and is guaranteed by the language (§27.4.1).

28.5.1. A Simple Output Function

The definition of Tuple has a nice regular, recursive structure that we can use to define a function for displaying the list of elements. For example:

template<typename T1, typename T2, typename T3, typename T4>
void print_elements(ostream& os, const Tuple<T1,T2,T3,T4>& t)
{
os << t.x << ", "; // t's x
print_elements(os,*t.base());
}


template<typename T1, typename T2, typename T3>
void print_elements(ostream& os, const Tuple<T1,T2,T3>& t)
{
os << t.x << ", ";
print_elements(os,*t.base());
}


template<typename T1, typename T2>
void print_elements(ostream& os, const Tuple<T1,T2>& t)
{
os << t.x << ", ";
print_elements(os,*t.base());
}


template<typename T1>
void print_elements(ostream& os, const Tuple<T1>& t)
{
os << t.x;
}


template<>
void print_elements(ostream& os, const Tuple<>& t)
{
os << " ";
}

The similarity of the print_elements() for the 4-tuple, 3-tuple, and 2-tuple hints at a better solution (§28.6.4), but for now I’ll just use these print_elements() to define a << for Tuples:

template<typename T1, typename T2, typename T3, typename T4>
ostream& operator<<(ostream& os, const Tuple<T1,T2,T3,T4>& t)
{
os << "{ ";
print_elements(os,t);
os << " }";
return os;
}

We can now write:

Tuple<double, int, char> x {1.1, 42, 'a'};
cout << x << "\n";


cout << Tuple<double,int,int,int>{1.2,3,5,7} << "\n";
cout << Tuple<double,int,int>{1.2,3,5} << "\n";
cout << Tuple<double,int>{1.2,3} << "\n";
cout << Tuple<double>{1.2} << "\n";
cout << Tuple<>{} << "\n";

Unsurprisingly, the output is:

{ 1.1, 42, a }
{ 1.2, 3, 5, 7 }
{ 1.2, 3, 5 }
{ 1.2, 3 }
{ 1.2 }
{ }

28.5.2. Element Access

As defined, Tuple has a variable number of elements of potentially differing types. We would like to access those elements efficiently and without the possibility of type system violations (i.e., without using casts). We can imagine a variety of schemes, such as naming the elements, numbering the elements, and accessing elements by recursing through the elements until we reach a desired element. The last alternative is what we will use to implement the most common access strategy: index the elements. In particular, I want to implement a way to subscript a tuple. Unfortunately, I am unable to implement an appropriate operator[], so I use a function template get():

Tuple<double, int, char> x {1.1, 42, 'a'};

cout << "{ "
<< get<0>(x) << ", "
<< get<1>(x) << ", "
<< get<2>(x) << " }\n"; // write { 1.1, 42, a }

auto xx = get<0>(x); // xx is a double

The idea is to index the elements, starting from 0, in such a way that the element selection is done at compile time and we preserve all type information.

The get() function constructs an object of type getNth<T,int>. The job of getNth<X,N> is to return a reference to the Nth element, which is assumed to have type X. Given such a helper, we can define get():

template<int N, typename T1, typename T2, typename T3, typename T4>
Select<N, T1, T2, T3, T4>& get(Tuple<T1, T2, T3, T4>& t)
{
return getNth<Select<N, T1, T2, T3, T4>,N>::get(t);
}

The definition of getNth is a variant of the usual recursion from N down to the specialization for 0:

template<typename Ret, int N>
struct getNth { // getNth() remembers the type (Ret) of the Nth element
template<typename T>
static Ret& get(T& t) // get the value element N from t's Base
{
return getNth<Ret,N-1>::get(*t.base());
}
};

template<typename Ret>
struct getNth<Ret,0> {
template<typename T>
static Ret& get(T& t)
{
return t.x;
}
};

Basically, getNth is a special-purpose for-loop, implemented by recursing N–1 times. The member functions are static because we don’t really want any objects of class getNth. That class is only used as a place to hold Ret and N in a way that allows the compiler to use them.

This is quite a bit of scaffolding to index into a Tuple, but at least the resulting code is type-safe and efficient. By “efficient,” I mean that given a reasonably good compiler (as is common), there is no run-time overhead for accessing a Tuple member.

Why must we write get<2>(x) rather than just x[2]? We could try:

template<typename T>
constexpr auto operator[](T t,int N)
{
return get<N>(t);
}

Unfortunately, this does not work:

• operator[]() must be a member, but we could handle that by defining it within Tuple.

• Inside operator[](), the argument N is not known to be a constant expression.

• I “forgot” that only lambdas can deduce their result type from their return-statement (§11.4.4), but that could be handled by adding a ->decltype(get<N>(t)).

To get that, we need some language lawyering and for now, we have to make do with get<2>(x).

28.5.2.1. const Tuples

As defined, get() works for non-const Tuple elements and can be used on the left-hand side of assignments. For example:

Tuple<double, int, char> x {1.1, 42, 'a'};

get<2>(x) = 'b'; // OK

However, it can’t be used for consts:

const Tuple<double, int, char> xx {1.1, 42, 'a'};

get<2>(xx) = 'b'; // error: xx is const
char cc = get<2>(xx); // error: xx is const (surprise?)

The problem is that get() takes its argument by non-const reference. But xx is a const, so it is not an acceptable argument.

Naturally, we also want to be able to have const Tuples. For example:

const Tuple<double, int, char> xx {1.1, 422, 'a'};
char cc = get<2>(xx); // OK: reading from const
cout << "xx: " << xx << "\n";
get<2>(xx) = 'x'; // error: xx is const

To handle const Tuples, we have to add const versions of get() and getNth’s get(). For example:

template<typename Ret, int N>
struct getNth { // getNth() remembers the type (Ret) of the Nth element
template<typename T>
static Ret& get(T& t) // get the value element N from t's Base
{
return getNth<Ret,N-1>::get(*t.base());
}


template<typename T>
static const Ret& get(const T& t) // get the value element N from t's Base
{
return getNth<Ret,N-1>::get(*t.base());
}
};


template<typename Ret>
struct getNth<Ret,0> {
template<typename T> static Ret& get(T& t) { return t.x; }
template<typename T> static const Ret& get(const T& t) { return t.x; }
};


template<int N, typename T1, typename T2, typename T3, typename T4>
Select<N, T1, T2, T3, T4>& get(Tuple<T1, T2, T3, T4>& t)
{
return getNth<Select<N, T1, T2, T3, T4>,N>::get(t);
}


template<int N, typename T1, typename T2, typename T3, typename T4>
const Select<N, T1, T2, T3, T4>& get(const Tuple<T1, T2, T3, T4>& t)
{
return getNth<Select<N, T1, T2, T3, T4>,N>::get(t);
}

Now, we can handle both const and non-const arguments.
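Putting the pieces together in a reduced form: the following sketch limits Tuple to at most two elements to stay short. That limit, the namespace meta around select, and the function use_get() are assumptions of the sketch, not part of the design above; otherwise it follows the same definitions of Select, getNth, and get():

```cpp
#include <cassert>
#include <type_traits>

namespace meta {   // hypothetical namespace; keeps select apart from the POSIX function
    template<unsigned N, typename... Cases> struct select;
    template<unsigned N, typename T, typename... Cases>
    struct select<N,T,Cases...> : select<N-1,Cases...> { };
    template<typename T, typename... Cases>
    struct select<0,T,Cases...> { using type = T; };
    template<unsigned N, typename... Cases>
    using Select = typename select<N,Cases...>::type;
}
using meta::Select;

struct Nil { };

template<typename T1=Nil, typename T2=Nil>
struct Tuple : Tuple<T2> {              // primary template: here, the 2-tuple
    T1 x;
    using Base = Tuple<T2>;
    Base* base() { return static_cast<Base*>(this); }
    const Base* base() const { return static_cast<const Base*>(this); }
    Tuple(const T1& t1, const T2& t2) : Base{t2}, x{t1} { }
};

template<>
struct Tuple<> { Tuple() { } };         // 0-tuple

template<typename T1>
struct Tuple<T1> : Tuple<> {            // 1-tuple
    T1 x;
    using Base = Tuple<>;
    Base* base() { return static_cast<Base*>(this); }
    const Base* base() const { return static_cast<const Base*>(this); }
    Tuple(const T1& t1) : Base{}, x{t1} { }
};

template<typename Ret, int N>
struct getNth {                         // step from t to t's Base, N times
    template<typename T>
    static Ret& get(T& t) { return getNth<Ret,N-1>::get(*t.base()); }
    template<typename T>
    static const Ret& get(const T& t) { return getNth<Ret,N-1>::get(*t.base()); }
};

template<typename Ret>
struct getNth<Ret,0> {                  // arrived: the element is x
    template<typename T> static Ret& get(T& t) { return t.x; }
    template<typename T> static const Ret& get(const T& t) { return t.x; }
};

template<int N, typename T1, typename T2>
Select<N,T1,T2>& get(Tuple<T1,T2>& t)
{
    return getNth<Select<N,T1,T2>,N>::get(t);
}

template<int N, typename T1, typename T2>
const Select<N,T1,T2>& get(const Tuple<T1,T2>& t)
{
    return getNth<Select<N,T1,T2>,N>::get(t);
}

void use_get()
{
    Tuple<double,char> x {1.1,'a'};
    get<1>(x) = 'b';                    // OK: x is not const
    assert(get<1>(x)=='b');

    const Tuple<double,char> xx {2.2,'c'};
    char cc = get<1>(xx);               // OK: reading from const
    assert(cc=='c');
    static_assert(std::is_same<decltype(get<1>(xx)),const char&>::value,
                  "a const Tuple yields const references");
}
```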

28.5.3. make_tuple

A class template cannot deduce its template arguments, but a function template can deduce them from its function arguments. This implies that we can make a Tuple type implicit in code by having a function construct it for us:

template<typename T1, typename T2, typename T3, typename T4>
Tuple<T1, T2, T3, T4> make_tuple(const T1& t1, const T2& t2, const T3& t3, const T4& t4)
{
return Tuple<T1, T2, T3, T4>{t1, t2, t3,t4};
}


// ... and the other four make_tuples ...

Given make_tuple(), we can write:

auto xxx = make_tuple(1.2,3,'x',1223);
cout << "xxx: " << xxx << "\n";

Other useful functions, such as head() and tail(), are easily implemented. The standard-library tuple provides a few such utility functions (§28.6.4).
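For comparison, this is how the corresponding standard-library facilities are used. The function name use_std_tuple() and the element values are just for illustration:

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>

void use_std_tuple()
{
    auto t = std::make_tuple(1.2,3,'x',std::string{"Bob"});   // element types are deduced

    static_assert(std::is_same<decltype(t),std::tuple<double,int,char,std::string>>::value,
                  "make_tuple() deduced the tuple type");
    static_assert(std::tuple_size<decltype(t)>::value==4, "four elements");

    std::get<1>(t) = 42;              // compile-time index, run-time value
    assert(std::get<1>(t)==42);
    assert(std::get<3>(t)=="Bob");
}
```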

28.6. Variadic Templates

Having to deal with an unknown number of elements is a common problem. For example, an error-reporting function may take between zero and ten arguments, a matrix may have between one and ten dimensions, and a tuple can have zero to ten elements. Note that in the first and the last example, the elements may not necessarily be of the same type. In most cases, we would prefer not to deal with each case separately. Ideally, a single piece of code should handle the cases for one element, two elements, three elements, etc. Also, I pulled the number ten out of a hat: ideally, there should be no fixed upper limit on the number of elements.

Over the years, many solutions have been found. For example, default arguments (§12.2.5) can be used to allow a single function to accept a variable number of arguments, and function overloading (§12.3) can be used to provide a function for each number of arguments. Passing a single list of elements (§11.3) can be an alternative to having a variable number of arguments as long as the elements are all of the same type. However, to elegantly handle the case of an unknown number of arguments of unknown (and possibly differing) types, some additional language support is needed. That language feature is called a variadic template.

28.6.1. A Type-Safe printf()

Consider the archetypical example of a function needing an unknown number of arguments of a variety of types: printf(). As provided by the C and C++ standard libraries, printf() is flexible and performs nicely (§43.3). However, it is not extensible to user-defined types and not type-safe, and it is a popular target for hackers.

The first argument to printf() is a C-style string interpreted as a “format string.” Additional arguments are used as required by the format string. Format specifiers, such as %g for floating-point and %s for zero-terminated arrays of characters, control the interpretation of the additional arguments. For example:

printf("The value of %s is %g\n","x",3.14);
string name = "target";
printf("The value of %s is %P\n",name,Point{34,200});


printf("The value of %s is %g\n",7);

The first call of printf() works as intended, but the second call has two problems: the format specification %s refers to C-style strings, and printf() will not interpret the std::string argument correctly. Furthermore, there is no %P format and in general no direct way of printing values of user-defined types, such as Point. In the third call of printf(), I provided an int as the argument for %s and I “forgot” to provide an argument for %g. In general, a compiler is not able to compare the number and types of arguments required by the format string with the number and types of arguments provided by the programmer. The output of that last call (if any) would not be pretty.

Using variadic templates, we can implement an extensible and type-safe variant of printf(). As is common for compile-time programming, the implementation has two parts:

[1] Handle the case where there is just one argument (the format string).

[2] Handle the case where there is at least one “additional” argument that, suitably formatted, needs to be output at an appropriate point indicated by the format string.

The simplest case is the one with only one argument, the format string:

void printf(const char* s)
{
    if (s==nullptr) return;

    while (*s) {
        if (*s=='%' && *++s!='%') // make sure no more arguments are expected
                                  // %% represents plain % in a format string
            throw runtime_error("invalid format: missing arguments");
        std::cout << *s++;
    }
}

That prints out the format string. If a format specifier is found, this printf() throws an exception because there is no argument to be formatted. A format specifier is defined to be a % not followed by another % (%% is printf()’s notation for a % that does not start a format specifier). Note that *++s does not overflow even if a % is the last character in a string. In that case, *++s refers to the terminating zero.

That done, we must handle printf() with more arguments. Here is where a template, and in particular a variadic template, comes into play:

template<typename T, typename... Args> // variadic template argument list: one or more arguments
void printf(const char* s, T value, Args... args) // function argument list: two or more arguments
{
    while (s && *s) {
        if (*s=='%' && *++s!='%') { // a format specifier (ignore which one it is)
            std::cout << value; // use first non-format argument
            return printf(++s, args...); // do a recursive call with the tail of the argument list
        }
        std::cout << *s++;
    }
    throw std::runtime_error("extra arguments provided to printf");
}

This printf() finds and prints the first non-format argument, “peels off” that argument, and then calls itself recursively. When there are no more non-format arguments, it calls the first (simpler) printf(). Ordinary characters (i.e., not % format specifiers) are simply printed.

The overloading of << replaces the use of the (possibly erroneous) “hint” in the format specifier. If an argument has a type for which << is defined, that argument is printed; otherwise, that call does not type check and the program will never run. A formatting character after a % is not used. I can imagine type-safe uses for such characters, but the purpose of this example is not to design the perfect printf() but to explain variadic templates.
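The two overloads can be tried out in isolation. What follows is a minimal sketch, not the definitive implementation: it is renamed tprintf() to avoid clashing with the C library function, it writes to an ostream passed as a parameter so that the result can be inspected, and it takes its arguments by const reference. The names tprintf, format_to_string, and tprintf_example, as well as the extra guard against a lone % at the end of the format string, are assumptions made for illustration.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <stdexcept>
#include <string>

// Base case: only the format string is left.
inline void tprintf(std::ostream& os, const char* s)
{
    if (s == nullptr) return;
    while (*s) {
        if (*s == '%' && *++s != '%')    // a specifier with no argument left
            throw std::runtime_error("invalid format: missing arguments");
        os << *s++;                      // ordinary character, or the second % of %%
    }
}

// Recursive case: print up to the first specifier, print value, recurse on the rest.
template<typename T, typename... Args>
void tprintf(std::ostream& os, const char* s, const T& value, const Args&... args)
{
    while (s && *s) {
        if (*s == '%' && *++s != '%') {
            if (*s == '\0')              // added guard: a lone % ends the string
                throw std::runtime_error("invalid format: dangling %");
            os << value;                 // the specifier letter itself is ignored
            return tprintf(os, ++s, args...);
        }
        os << *s++;
    }
    throw std::runtime_error("extra arguments provided to tprintf");
}

// Hypothetical helper so that the result can be inspected as a string.
template<typename... Args>
std::string format_to_string(const char* s, const Args&... args)
{
    std::ostringstream os;
    tprintf(os, s, args...);
    return os.str();
}

inline void tprintf_example()
{
    assert(format_to_string("The value of %s is %g\n", "x", 3.14) == "The value of x is 3.14\n");
    assert(format_to_string("100%% sure") == "100% sure");
}
```

Because << does the formatting, a std::string argument works where the C printf() would misbehave, and a type without << is rejected at compile time.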

The Args... defines what is called a parameter pack. A parameter pack is a sequence of (type/value) pairs from which you can “peel off” arguments starting with the first. When printf() is called with two or more arguments

void printf(const char *s, T value, Args... args);

is chosen, with the first argument as s, the second as value, and the rest (if any) bundled into the parameter pack args for later use. In the call printf(++s,args...) the parameter pack args is expanded so that the first element of args is selected as value and args is one element shorter than in the previous call. This carries on until args is empty, so that we call:

void printf(const char*);

If we really wanted to, we could check printf() format directives, such as %s. For example:

template<typename T, typename... Args> // variadic template argument list: one or more arguments
void printf(const char* s, T value, Args... args) // function argument list: two or more arguments
{
    while (s && *s) {
        if (*s=='%' && *++s!='%') { // a format specifier; for %% we fall through and print a plain %
            switch (*s) {
            case 's':
                if (!Is_C_style_string<T>() && !Is_string<T>())
                    throw runtime_error("Bad printf() format");
                break;
            case 'd':
                if (!Is_integral<T>()) throw runtime_error("Bad printf() format");
                break;
            case 'g':
                if (!Is_floating_point<T>()) throw runtime_error("Bad printf() format");
                break;
            }

            std::cout << value; // use first non-format argument
            return printf(++s, args...); // do a recursive call with the tail of the argument list
        }
        std::cout << *s++;
    }
    throw std::runtime_error("extra arguments provided to printf");
}

The standard library provides std::is_integral and std::is_floating_point, but you’d have to craft Is_C_style_string yourself.
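One possible way to craft the two string predicates is sketched below. It assumes that “C-style string” means char* or const char* after array-to-pointer decay, and that Is_string should recognize only std::string; both choices are assumptions made for illustration, not definitions taken from the standard library.

```cpp
#include <string>
#include <type_traits>

// Hypothetical predicate: true for char* and const char*, including
// character arrays and string literals once they have decayed to pointers.
template<typename T>
constexpr bool Is_C_style_string()
{
    using D = typename std::decay<T>::type;
    return std::is_same<D, char*>::value || std::is_same<D, const char*>::value;
}

// Hypothetical predicate: true for std::string (ignoring const and references).
template<typename T>
constexpr bool Is_string()
{
    using D = typename std::decay<T>::type;
    return std::is_same<D, std::string>::value;
}

static_assert(Is_C_style_string<const char*>(), "pointer to const char");
static_assert(Is_C_style_string<char[6]>(), "array decays to char*");
static_assert(!Is_C_style_string<std::string>(), "std::string is not a C-style string");
static_assert(Is_string<const std::string&>(), "const and & are ignored");
```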

28.6.2. Technical Details

If you are familiar with functional programming, you should find the printf() example (§28.6.1) an unusual notation for a pretty standard technique. If not, here are minimal technical examples that might help. First, we can declare and use a simple variadic template function:

template<typename... Types>
void f(Types... args); // variadic template function

That is, f() is a function that can be called with an arbitrary number of arguments of arbitrary types:

f(); // OK: args contains no arguments
f(1); // OK: args contains one argument: int
f(2, 1.0); // OK: args contains two arguments: int and double
f(2, 1.0, "Hello"); // OK: args contains three arguments: int, double, and const char*

A variadic template is defined with the ... notation:

template<typename... Types>
void f(Types... args); // variadic template function

The typename... in the declaration of Types specifies that Types is a template parameter pack. The ... in the type of args specifies that args is a function parameter pack. The type of each args function argument is the corresponding Types template argument. We can use class... with the same meaning as typename.... The ellipsis (...) is a separate lexical token, so you can place whitespace before or after it. That ellipsis can appear in many different places in the grammar, but it always means “zero or more occurrences of something.” Think of a parameter pack as a sequence of values for which the compiler has remembered the types. For example, we could graphically represent a parameter pack for {'c',127,string{"Bob"},3.14}:

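[Figure: the parameter pack {'c',127,string{"Bob"},3.14} drawn as a row of four values, each paired with its type: char, int, string, double]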

This is typically called a tuple. The memory layout is not specified by the C++ standard. For example, it might be the reverse of what is shown here (last element at the lowest memory address; §28.5). However, it is a dense, not a linked, representation. To get to a value, we need to start from the beginning and work our way through to what we want. The implementation of Tuple demonstrates that technique (§28.5). We can find the type of the first element and access it using that, then we can (recursively) proceed to the next argument. If we want to, we can give the appearance of indexed access using something like get<N> for Tuple (and for std::tuple; §28.6.4), but unfortunately there is no direct language support for that.

If you have a parameter pack, you can expand it into its sequence of elements by placing a ... after it. For example:

template<typename T, typename... Args>
void printf(const char* s, T value, Args... args)
{
    // ...
    return printf(++s, args...); // do a recursive call with the elements of args as arguments
    // ...
}

Expansion of a parameter pack into its elements is not restricted to function calls. For example:

template<typename... Bases>
class X : public Bases... {
public:

X(const Bases&... b) : Bases(b)... { }
};


X<> x0;
X<Bx> x1(1);
X<Bx,By> x2(2,3);
X<Bx,By,Bz> x3(2,3,4);

Here, Bases... says that X has zero or more bases. When it comes to initializing an X, the constructor requires zero or more values of types specified in the Bases variadic template argument. One by one those values are passed to the corresponding base initializer.
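To see the base and initializer expansions at work, here is a small compilable sketch. The base classes Bx, By, and Bz are not defined in the text; the versions below, each holding one int and implicitly constructible from int, are assumptions chosen so that the initializations shown above compile.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical base classes, each implicitly constructible from int.
struct Bx { int x; Bx(int i) : x{i} {} };
struct By { int y; By(int i) : y{i} {} };
struct Bz { int z; Bz(int i) : z{i} {} };

template<typename... Bases>
class X : public Bases... {                  // expansion in a base specifier list
public:
    X(const Bases&... b) : Bases(b)... { }   // expansion in a base initializer list
};

inline void bases_example()
{
    X<> x0;                     // no bases, so no constructor arguments
    X<Bx, By, Bz> x3(2, 3, 4);  // each int is converted to the corresponding base
    (void)x0;
    assert(x3.x == 2 && x3.y == 3 && x3.z == 4);
    static_assert(std::is_base_of<By, X<Bx, By>>::value, "By is a base of X<Bx,By>");
}
```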

We can use the ellipsis to mean “zero or more elements of something” in most places where a list of elements is required (§iso.14.5.3), such as in:

• A template argument list

• A function argument list

• An initializer list

• A base specifier list

• A base or member initializer list

• A sizeof... expression

A sizeof... expression is used to obtain the number of elements in a parameter pack. For example, we can define a constructor for a tuple given a pair provided the number of tuple elements is two:

template<typename... Types>
class tuple {
    // ...
    template<typename T, typename U, typename = Enable_if<sizeof...(Types)==2>>
        tuple(const pair<T,U>&);
};
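
As a smaller illustration of sizeof..., the following sketch counts the arguments of a call at compile time. The function name count_args is hypothetical.

```cpp
#include <cstddef>

// Hypothetical helper: the number of arguments, computed from the parameter pack.
template<typename... Types>
constexpr std::size_t count_args(Types...)
{
    return sizeof...(Types);
}

static_assert(count_args() == 0, "an empty pack has zero elements");
static_assert(count_args(1, 2.0, "three") == 3, "int, double, and const char*");
```

Note that sizeof... yields the number of elements in the pack, not their size in bytes.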

28.6.3. Forwarding

One of the major uses of variadic templates is forwarding from one function to another. Consider how to write a function that takes as arguments something to be called and a possibly empty list of arguments to give to the “something” as arguments:

template<typename F, typename... T>
void call(F&& f, T&&... t)
{
f(forward<T>(t)...);
}

That is pretty simple and not a hypothetical example. The standard-library thread has constructors using this technique (§5.3.1, §42.2.2). I use pass-by-rvalue-reference of a deduced template argument type to be able to correctly distinguish between rvalues and lvalues (§23.5.2.1) and std::forward() to take advantage of that (§35.5.1). The ... in T&&... is read as “accept zero or more && arguments, each of the type of the corresponding T.” The ... in forward<T>(t)... is read “forward the zero or more arguments from t.”

I used a template argument for the type of the “something” to be called, so that call() can accept functions, pointers to functions, function objects, and lambdas.
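That call() really preserves the lvalue/rvalue distinction can be checked with a function object that is overloaded on it. The Probe type below is a hypothetical helper introduced only for this sketch; call() itself is as defined above, with std:: qualification added.

```cpp
#include <cassert>
#include <string>
#include <utility>

template<typename F, typename... T>
void call(F&& f, T&&... t)
{
    f(std::forward<T>(t)...);   // each argument keeps its value category
}

// Hypothetical probe: records whether it was reached with an lvalue or an rvalue.
struct Probe {
    std::string last;
    void operator()(const std::string&) { last = "lvalue"; }
    void operator()(std::string&&)      { last = "rvalue"; }
};

inline void forwarding_example()
{
    Probe p;
    std::string s = "x";
    call(p, s);                  // T is deduced as std::string&
    assert(p.last == "lvalue");
    call(p, std::string{"y"});   // T is deduced as std::string
    assert(p.last == "rvalue");
}
```

Had call() passed t... without std::forward(), both calls would have selected the const std::string& overload, because a named parameter is always an lvalue.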

We can test call():

void g0()
{
cout << "g0()\n";
}


template<typename T>
void g1(const T& t)
{
cout << "g1(): " << t << '\n';
}


void g1d(double t)
{
    cout << "g1d(): " << t << '\n';
}


template<typename T, typename T2>
void g2(const T& t, T2&& t2)
{
cout << "g2(): " << t << ' ' << t2 << '\n';
}


void test()
{
    call(g0);
    call(g1); // error: too few arguments
    call(g1<int>,1);
    call(g1<const char*>,"hello");
    call(g1<double>,1.2);
    call(g1d,1.2);
    call(g1d,"No way!"); // error: wrong argument type for g1d()
    call(g1d,1.2,"I can't count"); // error: too many arguments for g1d()
    call(g2<double,string>,1,"world!");

    int i = 99; // testing with lvalues
    const char* p = "Trying";
    call(g2<double,string>,i,p);

    call([](){ cout <<"l1()\n"; });
    call([](int i){ cout <<"l0(): " << i << "\n";},17);
    call([i](){ cout <<"l1(): " << i << "\n"; });
}

I have to be specific about which specialization of a template function to pass because call() cannot deduce which one to use from the types of the other arguments.

28.6.4. The Standard-Library tuple

The simple Tuple in §28.5 has an obvious weakness: it can handle at most four elements. This section presents the definition of the standard-library tuple (from <tuple>; §34.2.4.2) and explains the techniques used to implement it. The key difference between std::tuple and our simple Tuple is that the former uses variadic templates to remove the limitation on the number of elements. Here are the key definitions:

template<typename Head, typename... Tail>
class tuple<Head, Tail...>
    : private tuple<Tail...> { // here is the recursion
    /*
        Basically, a tuple stores its head (first (type,value) pair)
        and derives from the tuple of its tail (the rest of the (type/value) pairs).
        Note that the type is encoded in the type, not stored as data
    */
    typedef tuple<Tail...> inherited;
public:
    constexpr tuple() { } // default: the empty tuple

    // Construct tuple from separate arguments:
    tuple(Add_const_reference<Head> v, Add_const_reference<Tail>... vtail)
        : m_head(v), inherited(vtail...) { }

    // Construct tuple from another tuple:
    template<typename... VValues>
    tuple(const tuple<VValues...>& other)
        : m_head(other.head()), inherited(other.tail()) { }

    template<typename... VValues>
    tuple& operator=(const tuple<VValues...>& other) // assignment
    {
        m_head = other.head();
        tail() = other.tail();
        return *this;
    }
    // ...

protected:
    Head m_head;
private:
    Add_reference<Head> head() { return m_head; }
    Add_const_reference<const Head> head() const { return m_head; }

    inherited& tail() { return *this; }
    const inherited& tail() const { return *this; }
};

There is no guarantee that std::tuple is implemented as hinted here. In fact, several popular implementations derive from a helper class (also a variadic class template), so as to get the element layout in memory to be the same as a struct with the same member types.

The “add reference” type functions add a reference to a type if it isn’t a reference already. They are used to avoid copying (§35.4.1).

Curiously, std::tuple does not provide head() and tail() functions, so I made them private. In fact, tuple does not provide any member functions for accessing an element. If you want to access an element of a tuple, you must (directly or indirectly) call a function that splits it into a value and .... If I want head() and tail() for the standard-library tuple, I can write them:

template<typename Head, typename... Tail>
Head head(tuple<Head,Tail...>& t)
{
    return std::get<0>(t); // get first element of t (§34.2.4.2)
}

template<typename Head, typename... Tail>
tuple<Tail&...> tail(tuple<Head, Tail...>& t)
{
    return /* details */;
}

The “details” of the definition of tail() are ugly and complicated. If the designers of tuple had meant for us to use tail() on a tuple, they would have provided it as a member.
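For the curious, one way to fill in those details is sketched below. It relies on std::index_sequence and deduced return types, so it assumes a C++14 (or later) compiler rather than the C++11 used in the rest of this chapter, and the helper name tail_impl is hypothetical. It is one possible approach, not how any particular library does it.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

template<typename Head, typename... Tail>
Head head(std::tuple<Head, Tail...>& t)
{
    return std::get<0>(t);                   // a copy of the first element
}

// Hypothetical helper: given indices 0..N-1, tie together elements 1..N of t.
template<typename Tuple, std::size_t... I>
auto tail_impl(Tuple& t, std::index_sequence<I...>)
{
    return std::tie(std::get<I + 1>(t)...);  // a tuple of references
}

template<typename Head, typename... Tail>
std::tuple<Tail&...> tail(std::tuple<Head, Tail...>& t)
{
    return tail_impl(t, std::index_sequence_for<Tail...>{});
}

inline void head_tail_example()
{
    std::tuple<int, double, char> tt(1, 2.5, 'c');
    assert(head(tt) == 1);
    auto rest = tail(tt);                    // std::tuple<double&, char&>
    std::get<0>(rest) = 7.5;                 // writes through to tt
    assert(std::get<1>(tt) == 7.5);
}
```

Because tail() returns references into the original tuple, copying the result into a tuple of values, as in the example that follows, makes independent copies of the elements.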

Given tuple, we can make tuples and copy and manipulate them:

tuple<string,vector<int>,double> tt("hello",{1,2,3,4},1.2);
string h = head(tt); // "hello"
tuple<vector<int>,double> t2 = tail(tt); // {{1,2,3,4},1.2};

It can get tedious to mention all of those types. Instead, we can deduce them from argument types, for example, using the standard-library make_tuple():

template<typename... Types>
tuple<Types...> make_tuple(Types&&... t) // simplified (§iso.20.4.2.4)
{
    return tuple<Types...>(t...);
}

string s = "Hello";
vector<int> v = {1,22,3,4,5};
auto x = make_tuple(s,v,1.2);

The standard-library tuple has many more members than listed in the implementation above (hence the // ...). In addition, the standard provides several helper functions. For example, get() is provided for element access (like get() from §28.5.2), so we can write:

auto t = make_tuple("Hello tuple", 43, 3.15);
double d = get<2>(t); // d becomes 3.15

So std::get() provides compile-time zero-based subscripting of std::tuples.
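A short check of what the standard-library make_tuple() deduces may be helpful. Unlike the simplified version shown above, std::make_tuple() decays its argument types, so lvalue arguments are copied into the tuple rather than stored as references. The function name make_example is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <type_traits>
#include <vector>

inline std::tuple<std::string, std::vector<int>, double> make_example()
{
    std::string s = "Hello";
    std::vector<int> v = {1, 22, 3, 4, 5};
    auto x = std::make_tuple(s, v, 1.2);     // s and v are copied into x
    static_assert(std::is_same<decltype(x),
                      std::tuple<std::string, std::vector<int>, double>>::value,
                  "element types are decayed value types");
    assert(std::get<0>(x) == "Hello");       // compile-time index, run-time value
    return x;
}
```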

Every member of std::tuple is useful to someone and most are useful to many, but none adds to our understanding of variadic templates, so I do not go into details. There are constructors and assignments from the same type (copy and move), from other tuple types (copy and move), and from pairs (copy and move). The operations taking a std::pair argument use sizeof... (§28.6.2) to ensure that their target tuples have exactly two elements. There are (nine) constructors and assignments taking allocators (§34.4) and a swap() (§35.5.2).

Unfortunately, the standard library does not offer << or >> for tuple. Worse, writing a << for std::tuple is amazingly complicated because there is no simple and general way of iterating through the elements of a standard-library tuple. First we need a helper; it is a struct with two print() functions. One print() recurses through a list printing elements, and the other stops the recursion when there are no more elements to print:

template<size_t N> // print element N and following elements
struct print_tuple {
    template<typename... T>
    static typename enable_if<(N<sizeof...(T))>::type
    print(ostream& os, const tuple<T...>& t) // nonempty tuple
    {
        os << ", " << get<N>(t); // print an element
        print_tuple<N+1>::print(os,t); // print the rest of the elements
    }

    template<typename... T>
    static typename enable_if<!(N<sizeof...(T))>::type // empty tuple
    print(ostream&, const tuple<T...>&)
    {
    }
};

The pattern is that of a recursive function with a terminating overload (like printf() from §28.6.1).

However, note how it wastefully lets get<N>() count from 0 to N.

We can now write a << for tuple:

std::ostream& operator << (ostream& os, const tuple<>&) // the empty tuple
{
return os << "{}";
}


template<typename T0, typename... T>
ostream& operator<<(ostream& os, const tuple<T0, T...>& t) // a nonempty tuple
{
    os << '{' << std::get<0>(t); // print first element
    print_tuple<1>::print(os,t); // print the rest of the elements
    return os << '}';
}

We can now print a tuple:

void user()
{
cout << make_tuple() << '\n';
cout << make_tuple("One meatball!") << '\n';
cout << make_tuple(1,1.2,"Tail!") << '\n';
}
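
To check the output format, the pieces can be assembled into one self-contained sketch. It assumes that the two print() functions are static members, called as print_tuple<N>::print(os,t), and it adds a hypothetical helper, tuple_to_string(), so that the result can be compared against expected text.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <tuple>
#include <type_traits>

template<std::size_t N>                      // print element N and following elements
struct print_tuple {
    template<typename... T>
    static typename std::enable_if<(N < sizeof...(T))>::type
    print(std::ostream& os, const std::tuple<T...>& t)
    {
        os << ", " << std::get<N>(t);
        print_tuple<N + 1>::print(os, t);    // recurse on the next index
    }

    template<typename... T>
    static typename std::enable_if<!(N < sizeof...(T))>::type
    print(std::ostream&, const std::tuple<T...>&) { }   // past the end: stop
};

inline std::ostream& operator<<(std::ostream& os, const std::tuple<>&)
{
    return os << "{}";
}

template<typename T0, typename... T>
std::ostream& operator<<(std::ostream& os, const std::tuple<T0, T...>& t)
{
    os << '{' << std::get<0>(t);
    print_tuple<1>::print(os, t);
    return os << '}';
}

// Hypothetical helper: the text that << produces for t.
template<typename... T>
std::string tuple_to_string(const std::tuple<T...>& t)
{
    std::ostringstream os;
    os << t;
    return os.str();
}

inline void print_example()
{
    assert(tuple_to_string(std::make_tuple()) == "{}");
    assert(tuple_to_string(std::make_tuple(1, 1.2, "Tail!")) == "{1, 1.2, Tail!}");
}
```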

28.7. SI Units Example

Using constexpr and templates, we can compute just about anything at compile time. Providing input for such computations can be tricky, but we can always #include data into the program text. However, I prefer simpler examples that in my opinion stand a better chance when it comes to maintenance. Here, I will show an example that provides a reasonable tradeoff between implementation complexity and utility. The compilation overhead is minimal and there is no run-time overhead. The example is to provide a small library for computations using units, such as meters, kilograms, and seconds. These MKS units are a subset of the international standard (SI) units used universally in science. The example is chosen to show how the simplest metaprogramming techniques can be used in combination with other language features and techniques.

We want to attach units to our values, so as to avoid meaningless computations. For example:

auto distance = 10_m; // 10 meters
auto time = 20_s; // 20 seconds
auto speed = distance/time; // .5 m/s (meters per second)

if (speed == 20) // error: 20 is dimensionless
// ...
if (speed == distance) // error: can't compare m to m/s
// ...
if (speed == 10_m/20_s) // OK: the units match
// ...
Quantity<MpS2> acceleration = distance/square(time); // MpS2 means m/(s*s)

cout << "speed==" << speed << " acceleration==" << acceleration << "\n";

Units provide a type system for physical values. As shown, we can use auto to hide types when we want to (§2.2.2), user-defined literals to introduce typed values (§19.2.6), and a type Quantity for use when we want to be explicit about Units. A Quantity is a numeric value with a Unit.

28.7.1. Units

First, I will define Unit:

template<int M, int K, int S>
struct Unit {
enum { m=M, kg=K, s=S };
};

A Unit has components representing the three units of measurement that we are interested in:

• Meters for length

• Kilograms for mass

• Seconds for time

Note that the unit values are encoded in the type. A Unit is meant for compile-time use.

We can provide more conventional notation for the most common units:

using M = Unit<1,0,0>; // meters
using Kg = Unit<0,1,0>; // kilograms
using S = Unit<0,0,1>; // seconds
using MpS = Unit<1,0,-1>; // meters per second (m/s)
using MpS2 = Unit<1,0,-2>; // meters per square second (m/(s*s))

Negative unit values indicate division by a quantity with that unit. This three-value representation of a unit is very flexible. We can represent the proper unit of any computation involving distance, mass, and time. I doubt we will find much use for Quantity<Unit<123,-15,1024>>, that is, 123 distances multiplied, divided by 15 masses multiplied, and then multiplied by 1024 time measurements multiplied, but it is nice to know that the system is general. Unit<0,0,0> indicates a dimensionless entity, a value without a unit.

When we multiply two quantities, their units are added. Thus, addition of Units is useful:

template<typename U1, typename U2>
struct Uplus {
using type = Unit<U1::m+U2::m, U1::kg+U2::kg, U1::s+U2::s>;
};


template<typename U1, typename U2>
using Unit_plus = typename Uplus<U1,U2>::type;

Similarly, when we divide two quantities, their units are subtracted:

template<typename U1, typename U2>
struct Uminus {
    using type = Unit<U1::m-U2::m, U1::kg-U2::kg, U1::s-U2::s>;
};


template<typename U1, typename U2>
using Unit_minus = typename Uminus<U1,U2>::type;

Unit_plus and Unit_minus are simple type functions (§28.2) on Units.
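Since these type functions compute types, their results can be checked entirely at compile time. The following sketch repeats the definitions from this section and states a few expected results as static_asserts; the particular assertions are illustrations chosen here, not part of the original example.

```cpp
#include <type_traits>

template<int M, int K, int S>
struct Unit {
    enum { m = M, kg = K, s = S };
};

using M    = Unit<1, 0, 0>;    // meters
using S    = Unit<0, 0, 1>;    // seconds
using MpS  = Unit<1, 0, -1>;   // meters per second
using MpS2 = Unit<1, 0, -2>;   // meters per square second

template<typename U1, typename U2>
struct Uplus {
    using type = Unit<U1::m + U2::m, U1::kg + U2::kg, U1::s + U2::s>;
};

template<typename U1, typename U2>
using Unit_plus = typename Uplus<U1, U2>::type;

template<typename U1, typename U2>
struct Uminus {
    using type = Unit<U1::m - U2::m, U1::kg - U2::kg, U1::s - U2::s>;
};

template<typename U1, typename U2>
using Unit_minus = typename Uminus<U1, U2>::type;

static_assert(std::is_same<Unit_minus<M, S>, MpS>::value, "m divided by s is m/s");
static_assert(std::is_same<Unit_minus<MpS, S>, MpS2>::value, "m/s divided by s is m/(s*s)");
static_assert(std::is_same<Unit_plus<MpS, S>, M>::value, "m/s times s is m");
static_assert(std::is_same<Unit_minus<M, M>, Unit<0, 0, 0>>::value, "m divided by m is dimensionless");
```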

28.7.2. Quantitys

A Quantity is a value with an associated Unit:

template<typename U>
struct Quantity {
double val;
    constexpr explicit Quantity(double d) : val{d} {} // constexpr, so that a Quantity can be built at compile time
};

A further refinement would have made the type used to represent the value a template parameter, possibly defaulted to double. We can define Quantitys with a variety of units:

Quantity<M> x {10.5}; // x is 10.5 meters
Quantity<S> y {2}; // y is 2 seconds

I made the Quantity constructor explicit to make it less likely to get implicit conversions from dimensionless entities, such as plain C++ floating-point literals:

Quantity<MpS> s = 7; // error: attempt to convert an int to meters/second

Quantity<M> comp(Quantity<M>);
// ...
Quantity<M> n = comp(7); // error: comp() requires a distance

Now we can start thinking about computations. What do we do to physical measurements? I’m not going to review a whole physics textbook, but certainly we need addition, subtraction, multiplication, and division. You can only add and subtract values with the same units:

template<typename U>
Quantity<U> operator+(Quantity<U> x, Quantity<U> y) // same dimension
{
return Quantity<U>{x.val+y.val};
}


template<typename U>
Quantity<U> operator-(Quantity<U> x, Quantity<U> y) // same dimension
{
    return Quantity<U>{x.val-y.val};
}

Quantity’s constructor is explicit, so we have to convert the resulting double value back to Quantity.

Multiplication of Quantitys requires addition of their Units. Similarly, division of Quantitys requires subtraction of their Units. For example:

template<typename U1, typename U2>
Quantity<Unit_plus<U1,U2>> operator*(Quantity<U1> x, Quantity<U2> y)
{
    return Quantity<Unit_plus<U1,U2>>{x.val*y.val};
}


template<typename U1, typename U2>
Quantity<Unit_minus<U1,U2>> operator/(Quantity<U1> x, Quantity<U2> y)
{
return Quantity<Unit_minus<U1,U2>>{x.val/y.val};
}

Given these arithmetic operations, we can express most computations. However, we find that real-world computations contain a fair number of scaling operations, that is, multiplications and divisions by dimensionless values. We could use Quantity<Unit<0,0,0>> but that gets tedious:

Quantity<MpS> speed {10};
auto double_speed = Quantity<Unit<0,0,0>>{2}*speed;

To eliminate that verbosity, we can either provide an implicit conversion from double to Quantity<Unit<0,0,0>> or add a couple of variants to the arithmetic operations. I chose the latter:

template<typename U>
Quantity<U> operator*(Quantity<U> x, double y)
{
    return Quantity<U>{x.val*y};
}


template<typename U>
Quantity<U> operator*(double x, Quantity<U> y)
{
    return Quantity<U>{x*y.val};
}

We can now write:

Quantity<MpS> speed {10};
auto double_speed = 2*speed;

The main reason I do not define an implicit conversion from double to Quantity<Unit<0,0,0>> is that we do not want that conversion for addition or subtraction:

Quantity<MpS> speed {10};
auto increased_speed = 2.3+speed; // error: can't add a dimensionless scalar to a speed

It is nice to have the detailed requirement for the code precisely dictated by the application domain.
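The arithmetic can be exercised with a compact, self-contained sketch. To keep it short, Unit_plus and Unit_minus are written here directly as alias templates instead of going through Uplus and Uminus; that shortcut and the function name quantity_example are assumptions of this sketch.

```cpp
#include <cassert>
#include <type_traits>

template<int M, int K, int S>
struct Unit { enum { m = M, kg = K, s = S }; };

using M   = Unit<1, 0, 0>;
using S   = Unit<0, 0, 1>;
using MpS = Unit<1, 0, -1>;

template<typename U1, typename U2>
using Unit_plus = Unit<U1::m + U2::m, U1::kg + U2::kg, U1::s + U2::s>;
template<typename U1, typename U2>
using Unit_minus = Unit<U1::m - U2::m, U1::kg - U2::kg, U1::s - U2::s>;

template<typename U>
struct Quantity {
    double val;
    explicit Quantity(double d) : val{d} {}
};

template<typename U>
Quantity<U> operator+(Quantity<U> x, Quantity<U> y) { return Quantity<U>{x.val + y.val}; }

template<typename U1, typename U2>
Quantity<Unit_plus<U1, U2>> operator*(Quantity<U1> x, Quantity<U2> y)
{
    return Quantity<Unit_plus<U1, U2>>{x.val * y.val};
}

template<typename U1, typename U2>
Quantity<Unit_minus<U1, U2>> operator/(Quantity<U1> x, Quantity<U2> y)
{
    return Quantity<Unit_minus<U1, U2>>{x.val / y.val};
}

template<typename U>
Quantity<U> operator*(double x, Quantity<U> y) { return Quantity<U>{x * y.val}; }

inline void quantity_example()
{
    Quantity<M> distance{10};
    Quantity<S> time{20};
    auto speed = distance / time;            // the unit is computed by the compiler
    static_assert(std::is_same<decltype(speed), Quantity<MpS>>::value, "m/s");
    assert(speed.val == 0.5);
    assert((2 * speed).val == 1.0);          // scaling keeps the unit
    assert((speed + speed).val == 1.0);      // same units: addition is allowed
}
```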

28.7.3. Unit Literals

Thanks to the type aliases for the most common units, we can now write:

auto distance = Quantity<M>{10}; // 10 meters
auto time = Quantity<S>{20}; // 20 seconds
auto speed = distance/time; // .5 m/s (meters per second)

That’s not bad, but it is still verbose compared to code that conventionally simply leaves the units in the heads of the programmers:

auto distance = 10.0; // 10 meters
double time = 20; // 20 seconds
auto speed = distance/time; // .5 m/s (meters per second)

We needed the .0 or the explicit double to ensure that the type is double (and get the correct result for the division).

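The code generated for the two examples should be identical, and we can do better still notationally. We can introduce user-defined literals (UDLs; §19.2.6) for the Quantity types. A literal operator must take a long double to accept floating-point literals and an unsigned long long to accept integer literals, such as 10_m:

constexpr Quantity<M> operator"" _m(long double d) { return Quantity<M>{static_cast<double>(d)}; }
constexpr Quantity<M> operator"" _m(unsigned long long n) { return Quantity<M>{static_cast<double>(n)}; }
constexpr Quantity<Kg> operator"" _kg(long double d) { return Quantity<Kg>{static_cast<double>(d)}; }
constexpr Quantity<Kg> operator"" _kg(unsigned long long n) { return Quantity<Kg>{static_cast<double>(n)}; }
constexpr Quantity<S> operator"" _s(long double d) { return Quantity<S>{static_cast<double>(d)}; }
constexpr Quantity<S> operator"" _s(unsigned long long n) { return Quantity<S>{static_cast<double>(n)}; }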

That gives us the literals from our original example:

auto distance = 10_m; // 10 meters
auto time = 20_s; // 20 seconds
auto speed = distance/time; // .5 m/s (meters per second)

if (speed == 20) // error: 20 is dimensionless
// ...
if (speed == distance) // error: can't compare m to m/s
// ...
if (speed == 10_m/20_s) // OK: the units match

I defined * and / for combinations of Quantitys and dimensionless values, so we can scale the units using multiplication or division. However, we could also provide more of the conventional units as user-defined literals:

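constexpr Quantity<M> operator"" _km(long double d) { return Quantity<M>{static_cast<double>(1000*d)}; } // kilometers
constexpr Quantity<Kg> operator"" _g(long double d) { return Quantity<Kg>{static_cast<double>(d/1000)}; } // grams
constexpr Quantity<Kg> operator"" _mg(long double d) { return Quantity<Kg>{static_cast<double>(d/1000000)}; } // milligrams
constexpr Quantity<S> operator"" _ms(long double d) { return Quantity<S>{static_cast<double>(d/1000)}; } // milliseconds
constexpr Quantity<S> operator"" _us(long double d) { return Quantity<S>{static_cast<double>(d/1000000)}; } // microseconds
constexpr Quantity<S> operator"" _ns(long double d) { return Quantity<S>{static_cast<double>(d/1000000000)}; } // nanoseconds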
// ...

Obviously, this could really get out of control through overuse of nonstandard suffixes (e.g., us is suspect even though it is widely used because u looks a bit like a Greek µ).

I could have provided the various magnitudes as more types (as is done for std::ratio; §35.3) but thought it simpler to keep the Unit types simple and focused on doing their primary task well.

I use underscores in my units _s and _m so as not to get in the way of the standard library providing the shorter and nicer s and m suffixes.
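The literals can be tried out with the following self-contained sketch. It assumes a constexpr Quantity constructor and a pair of literal operators per suffix, one taking long double for literals such as 10.0_m and one taking unsigned long long for literals such as 10_m. Unit_minus is written directly as an alias template to keep the sketch short, and the function name literal_example is hypothetical.

```cpp
#include <cassert>
#include <type_traits>

template<int M, int K, int S>
struct Unit { enum { m = M, kg = K, s = S }; };

using M   = Unit<1, 0, 0>;
using S   = Unit<0, 0, 1>;
using MpS = Unit<1, 0, -1>;

template<typename U1, typename U2>
using Unit_minus = Unit<U1::m - U2::m, U1::kg - U2::kg, U1::s - U2::s>;

template<typename U>
struct Quantity {
    double val;
    constexpr explicit Quantity(double d) : val{d} {}
};

template<typename U1, typename U2>
Quantity<Unit_minus<U1, U2>> operator/(Quantity<U1> x, Quantity<U2> y)
{
    return Quantity<Unit_minus<U1, U2>>{x.val / y.val};
}

template<typename U>
bool operator==(Quantity<U> x, Quantity<U> y) { return x.val == y.val; }

// Floating-point literals select the long double overloads,
// integer literals the unsigned long long overloads.
constexpr Quantity<M> operator""_m(long double d) { return Quantity<M>{static_cast<double>(d)}; }
constexpr Quantity<M> operator""_m(unsigned long long n) { return Quantity<M>{static_cast<double>(n)}; }
constexpr Quantity<S> operator""_s(long double d) { return Quantity<S>{static_cast<double>(d)}; }
constexpr Quantity<S> operator""_s(unsigned long long n) { return Quantity<S>{static_cast<double>(n)}; }

inline void literal_example()
{
    auto speed = 10_m / 20_s;
    static_assert(std::is_same<decltype(speed), Quantity<MpS>>::value, "m/s");
    assert(speed.val == 0.5);
    assert(10.0_m == 10_m);                  // both forms denote ten meters
}
```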

28.7.4. Utility Functions

To finish the job (as defined by the initial example), we need the utility function square(), the equality operator, and the output operator. Defining square() is trivial:

template<typename U>
Quantity<Unit_plus<U,U>> square(Quantity<U> x)
{
return Quantity<Unit_plus<U,U>>(x.val*x.val);
}

That basically shows how to write arbitrary computational functions. I could have constructed the Unit right there in the return value definition, but using the existing type function was easier. Alternatively, we could easily have defined a type function Unit_double.

The == looks more or less like all ==s. It is defined for values of the same Units only:

template<typename U>
bool operator==(Quantity<U> x, Quantity<U> y)
{
return x.val==y.val;
}


template<typename U>
bool operator!=(Quantity<U> x, Quantity<U> y)
{
return x.val!=y.val;
}

Note that I pass Quantitys by value. At run time, they are represented as doubles.

The output functions just do conventional character manipulation:

string suffix(int u, const char* x) // helper function
{
    string suf;
    if (u) {
        suf += x;
        if (1<u) suf += '0'+u;

        if (u<0) {
            suf += '-';
            suf += '0'-u;
        }
    }
    return suf;
}

template<typename U>
ostream& operator<<(ostream& os, Quantity<U> v)
{
return os << v.val << suffix(U::m,"m") << suffix(U::kg,"kg") << suffix(U::s,"s");
}
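
The resulting notation is terse: an exponent of 1 is omitted, and a negative exponent follows the unit name directly. The sketch below repeats the output functions in self-contained form so that the text they produce can be checked; the helper to_text() is hypothetical, and the explicit casts to char are additions that do not change the result. Note that suffix() handles only single-digit exponents.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

template<int M, int K, int S>
struct Unit { enum { m = M, kg = K, s = S }; };

using MpS  = Unit<1, 0, -1>;
using MpS2 = Unit<1, 0, -2>;

template<typename U>
struct Quantity {
    double val;
    explicit Quantity(double d) : val{d} {}
};

inline std::string suffix(int u, const char* x)   // correct for -9 <= u <= 9 only
{
    std::string suf;
    if (u) {
        suf += x;
        if (1 < u) suf += static_cast<char>('0' + u);
        if (u < 0) {
            suf += '-';
            suf += static_cast<char>('0' - u);
        }
    }
    return suf;
}

template<typename U>
std::ostream& operator<<(std::ostream& os, Quantity<U> v)
{
    return os << v.val << suffix(U::m, "m") << suffix(U::kg, "kg") << suffix(U::s, "s");
}

// Hypothetical helper: the text that << produces for v.
template<typename U>
std::string to_text(Quantity<U> v)
{
    std::ostringstream os;
    os << v;
    return os.str();
}

inline void output_example()
{
    assert(to_text(Quantity<MpS>{0.5}) == "0.5ms-1");      // 0.5 m/s
    assert(to_text(Quantity<MpS2>{0.025}) == "0.025ms-2"); // 0.025 m/(s*s)
}
```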

Finally, we can write:

auto distance = 10_m; // 10 meters
auto time = 20_s; // 20 seconds
auto speed = distance/time; // .5 m/s (meters per second)

if (speed == 20) // error: 20 is dimensionless
// ...
if (speed == distance) // error: can't compare m to m/s
// ...
if (speed == 10_m/20_s) // OK: the units match
// ...

Quantity<MpS2> acceleration = distance/square(time); // MpS2 means m/(s*s)

cout << "speed==" << speed << " acceleration==" << acceleration << "\n";

Such code will, given a reasonable compiler, generate exactly the same code as would have been generated using doubles directly. However, it is “type checked” (at compile time) according to the rules for physical units. It is an example of how we can add a whole new set of application-specific types with their own checking rules to a C++ program.

28.8. Advice

[1] Use metaprogramming to improve type safety; §28.1.

[2] Use metaprogramming to improve performance by moving computation to compile time; §28.1.

[3] Avoid using metaprogramming to an extent where it significantly slows down compilation; §28.1.

[4] Think in terms of compile-time evaluation and type functions; §28.2.

[5] Use template aliases as the interfaces to type functions returning types; §28.2.1.

[6] Use constexpr functions as the interfaces to type functions returning (non-type) values; §28.2.2.

[7] Use traits to nonintrusively associate properties with types; §28.2.4.

[8] Use Conditional to choose between two types; §28.3.1.1.

[9] Use Select to choose among several alternative types; §28.3.1.3.

[10] Use recursion to express compile-time iteration; §28.3.2.

[11] Use metaprogramming for tasks that cannot be done well at run time; §28.3.3.

[12] Use Enable_if to selectively declare function templates; §28.4.

[13] Concepts are among the most useful predicates to use with Enable_if; §28.4.3.

[14] Use variadic templates when you need a function that takes a variable number of arguments of a variety of types; §28.6.

[15] Don’t use variadic templates for homogeneous argument lists (prefer initializer lists for that); §28.6.

[16] Use variadic templates and std::forward() where forwarding is needed; §28.6.3.

[17] Use simple metaprogramming to implement efficient and elegant unit systems (for fine-grained type checking); §28.7.

[18] Use user-defined literals to simplify the use of units; §28.7.