Overview of the C++ Standard Library - Coding the Professional Way - Professional C++ (2014)

Professional C++ (2014)

Part IIICoding the Professional Way

Chapter 15Overview of the C++ Standard Library

WHAT’S IN THIS CHAPTER?

· The coding principles used throughout the standard library

· The kind of functionality the standard library provides

WROX.COM DOWNLOADS FOR THIS CHAPTER

Please note that all the code examples for this chapter are available as a part of this chapter’s code download on the book’s website at www.wrox.com/go/proc++3e on the Download Code tab.

The most important library that you will use as a C++ programmer is the C++ standard library. As its name implies, this library is part of the C++ standard, so any standards-conforming compiler should include it. The standard library is not monolithic: It includes several disparate components, some of which you have been using already. You may even have assumed they were part of the core language. All standard library classes and functions are declared in the std namespace.

The heart of the C++ standard library is its generic containers and algorithms. This subset of the library is often called the Standard Template Library, or STL for short, because of its abundant use of templates. The power of the STL is that it provides generic containers and generic algorithms in such a way that most of the algorithms work on most of the containers, no matter what type of data the containers store. Performance is a very important part of the STL. The goal is to make the STL containers and algorithms as fast as or faster than hand-written code.

The C++ standard library also includes all C headers that are part of the C99 standard, but with new names. For example, you can access the functionality from the C <stdio.h> header by including <cstdio>. However, the use of functionality provided by these C headers is discouraged in favor of true C++ functionality.

A C++ programmer who wishes to claim language expertise is expected to be familiar with the standard library. You can save yourself immeasurable time and energy by incorporating STL containers and algorithms into your programs instead of writing and debugging your own versions. Now is the time to master the standard library.

This first chapter on the standard library provides a general overview of the functionality available in the standard library and in the STL.

The next few chapters go into more detail on several aspects of the standard library and the STL, including containers, iterators, generic algorithms, predefined function object classes, regular expressions, additional library utilities, and customizing and extending the library.

Despite the depth of material found in this and the next chapters, the standard library is too large for this book to cover exhaustively. You should read this and the following chapters to learn about the standard library and the STL, but keep in mind that they don’t mention every method and data member that the various classes provide, or show you the prototypes of every algorithm. Appendix C provides a summary of all the header files in the standard library. Consult a Standard Library Reference, for examplehttp://www.cppreference.com/ or http://www.cplusplus.com/reference/, for a complete reference of all the provided functionality.

CODING PRINCIPLES

The standard library, and definitely the standard template library, make heavy use of the C++ features called templates and operator overloading.

Use of Templates

Templates are used to allow generic programming. They make it possible to write code that can work with all kinds of objects, even objects unknown to the programmer when writing the code. The obligation of the programmer writing the template code is to specify the requirements of the classes that define these objects; for example, that they have an operator for comparison, or a copy constructor, or whatever is deemed appropriate, and then making sure the code that is written uses only those required capabilities. The obligation of the programmer creating the objects is to supply those operators and methods that the template writer requires.

Unfortunately, many programmers consider templates to be the most difficult part of C++ and, for that reason, tend to avoid them. However, even if you never write your own templates, you need to understand their syntax and capabilities in order to use the STL. Templates are described in detail in Chapter 11. If you skipped that chapter and are not familiar with templates, I suggest you first read Chapter 11 and then come back to learn about the standard library.

Use of Operator Overloading

Operator overloading is another feature used extensively by the C++ standard library, including the STL. Chapter 8 has a whole section devoted to operator overloading. Make sure you read that section and understand it before tackling this and subsequent chapters. In addition, Chapter 14 presents much more detail on the subject of operator overloading, but those details are not required to understand the following chapters.

OVERVIEW OF THE C++ STANDARD LIBRARY

This section introduces the various components of the standard library from a design perspective. You will learn what facilities are available for you to use, but you will not learn the coding details. Those details are covered in other chapters.

Strings

C++ provides a built-in string class, defined in the <string> header. Although you may still use C-style strings of character arrays, the C++ string class is superior in almost every way. It handles the memory management; provides some bounds checking, assignment semantics, and comparisons; and supports manipulations such as concatenation, substring extraction, and substring or character replacement.

NOTE Technically, the C++ string is a typedef name for a char instantiation of the basic_string template. However, you need not worry about these details; you can use string as if it were a bona fide non-template class.

C++ provides support for Unicode and localization. These features allow you to write programs that work with different languages, for example Chinese. Locales, defined in <locale>, allow you to format data such as numbers and dates according to the rules of a certain country or region.

In case you missed it, Chapter 2 provides all the details of the string class functionality, while Chapter 18 discusses Unicode and localization.

Regular Expressions

Regular expressions are available through the <regex> header file. They make it easy to perform so-called pattern-matching, often used in text processing. Pattern-matching allows you to search special patterns in strings and optionally replace those with a new pattern. Regular expressions are discussed in Chapter 18.

I/O Streams

C++ introduces a new model for input and output using streams. The C++ library provides routines for reading and writing built-in types from and to files, console/keyboard, and strings. C++ also provides the facilities for coding your own routines for reading and writing your own objects. The I/O functionality is defined in several header files: <fstream>, <iomanip>, <ios>, <iosfwd>, <iostream>, <istream>, <ostream>, <sstream>, <streambuf>, and <strstream>. Chapter 1 reviews the basics of I/O streams, and Chapter 12 discusses streams in detail.

Smart Pointers

One of the problems faced in robust programming is knowing when to delete an object. There are several failures that can happen. A first problem is not deleting the object at all (failing to free the storage). This is known as memory leaks, where objects accumulate and take up space but are not used. Another problem is where someone deletes the storage but others are still pointing to that storage, resulting in pointers to storage that is either no longer in use or has been reallocated for another purpose. These are known asdangling pointers. One more problem is when one piece of code frees the storage, and another piece of code attempts to free the same storage. This is known as double-freeing. All of these tend to result in program failures of some sort. Some failures are readily detected; others cause the program to produce erroneous results. Most of these errors are difficult to discover and repair.

C++ addresses these problems with smart pointers: unique_ptr, shared_ptr, and weak_ptr. shared_ptr and weak_ptr are thread-safe. They are all defined in the <memory> header. These smart pointers are introduced in Chapter 1 and discussed in more detail in Chapter 22.

Before C++11, the functionality of unique_ptr was handled by a type called auto_ptr, which is now deprecated and should not be used anymore. There was no equivalent to shared_ptr in the earlier standard library, although many third-party libraries (for example, Boost) did provide this capability.

Exceptions

The C++ language supports exceptions, which allow functions or methods to pass errors of various types up to calling functions or methods. The C++ standard library provides a class hierarchy of exceptions that you can use in your program as is, or that you can derive from to create your own exception types. Exception support is defined in a couple of header files: <exception>, <stdexcept>, and <system_error>. Chapter 13 covers the details of exceptions and the standard exception classes.

Mathematical Utilities

The C++ library provides some mathematical utility classes. Although they are templatized so that you can use them with any type, they are not generally considered part of the STL.

The standard library provides a complex number class called complex, defined in <complex>, which provides an abstraction for working with numbers that contain both real and imaginary components.

The compile-time rational arithmetic library provides a ratio class template defined in the <ratio> header file. This ratio class can exactly represent any finite rational number defined by a numerator and denominator. This library is discussed in Chapter 19.

The standard library also contains a class called valarray, defined in <valarray>, which is similar to the vector class but is more optimized for high performance numerical applications. The library provides several related classes to represent the concept of vector slices. From these building blocks, it is possible to build classes to perform matrix mathematics. There is no built-in matrix class; however, there are third-party libraries like Boost that include matrix support.

C++ also provides a standard way to obtain information about numeric limits, such as the maximum possible value for an integer on the current platform. In C, you could access #defines, such as INT_MAX. While those are still available in C++, it’s recommended to use the numeric_limits class template defined in the <limits> header file. Its use is straightforward, as shown in the following code:

cout << "Max int value: " << numeric_limits<int>::max() << endl;

cout << "Min int value: " << numeric_limits<int>::min() << endl;

cout << "Lowest int value: " << numeric_limits<int>::lowest() << endl;

cout << "Max double value: " << numeric_limits<double>::max() << endl;

cout << "Min double value: " << numeric_limits<double>::min() << endl;

cout << "Lowest double value: " << numeric_limits<double>::lowest() << endl;

Time Utilities

C++ includes the Chrono library, defined in the <chrono> header file. This library makes it easy to work with time; for example, to time certain durations or to perform actions based on timing. The Chrono library is discussed in detail in Chapter 19.

Random Numbers

C++ already had support for generating pseudo-random numbers for a long time with the srand() and rand() functions. However, those functions provide only very basic random numbers. For example, you could not change the distribution of the generated random numbers.

Since C++11, a random number library has been added to the standard, which is much more powerful. The new library is defined in <random> and comes with random number engines, random number engine adaptors, and random number distributions. All of these can be used to give you random numbers more suited to your problem domain, such as normal distributions, negative exponential distributions, etc.

Consult Chapter 19 for details on this library.

Initializer Lists

Initializer lists are defined in the <initializer_list> header file. They make it easy to write functions that can accept a variable number of arguments and are discussed in Chapter 10.

Pair and Tuple

The <utility> header defines the pair template, which can store two elements with two different types. This is known as storing heterogeneous elements. All standard template library containers discussed further in this chapter store homogenous elements, meaning that all the elements in a container must have the same type. A pair allows you to store elements of completely unrelated types in one object.

tuple defined in <tuple> is a generalization of pair. It is a sequence with a fixed size that can have heterogeneous elements. The number and type of elements for a tuple instantiation is fixed at compile time. Tuples are discussed in Chapter 19.

Function Objects

A class that implements a function call operator is called a function object. Function objects can, for example, be used as predicates for certain STL algorithms. The <functional> header file defines a number of predefined function objects and supports creating new function objects based on existing ones.

Function objects are discussed in detail in Chapter 17 together with a detailed discussion of the STL algorithms.

Multithreading

All major CPU vendors are selling processors with multiple cores. They are being used for everything from servers to consumer computers and even in smartphones. If you want your software to take advantage of all these cores then you need to write multithreaded code. The standard library provides a couple of basic building blocks for writing such code. Individual threads can be created using the thread class from the <thread> header. In multithreaded code you need to take care that several threads are not reading and writing to the same piece of data at the same time. To prevent this, you can use atomics defined in <atomic>, which give you thread-safe atomic access to a piece of data. Other thread synchronization mechanisms are provided by <condition_variable> and <mutex>.

If you just need to calculate something and get the result back with proper exception handling, it’s easier to use async and future defined in the <future> header than it is to directly use the thread class.

Writing multithreaded code is discussed in detail in Chapter 23.

Type Traits

Type traits are defined in the <type_traits> header file and provide information about types at compile time. They are useful when writing advanced templates and are discussed in Chapter 21.

The Standard Template Library

The standard template library (STL) supports various containers and algorithms. This section briefly introduces those containers and algorithms. Later chapters provide the coding details for using them in your programs.

STL Containers

The STL provides implementations of commonly used data structures such as linked lists and queues. When you use C++, you should not need to write such data structures again. The data structures are implemented using a concept called containers, which store information called elements, in a way that implements the data structure (linked list, queue, etc.) appropriately. Different data structures have different insertion, deletion, and access behavior and performance characteristics. It is important to be familiar with the data structures available so that you can choose the most appropriate one for any given task.

All the containers in the STL are templates, so you can use them to store any type, from built-in types such as int and double to your own classes. Each container instance stores objects of only one type; that is, they are homogeneous collections. If you need non fixed-sized heterogeneous collections, you could create a class that has multiple derived classes, and each derived class could wrap an object of your required type.

NOTE The C++ STL containers are homogenous: they allow elements of only one type in each container.

Note that the C++ standard specifies the interface, but not the implementation, of each container and algorithm. Thus, different vendors are free to provide different implementations. However, the standard also specifies performance requirements as part of the interface, which the implementations must meet.

This section provides an overview of the various containers available in the STL.

vector

The <vector> header file defines vector, which stores a sequence of elements and provides random access to these elements. You can think of a vector as an array of elements that grows dynamically as you insert elements, and provides some bounds checking. Like an array, the elements of a vector are stored in contiguous memory.

NOTE A vector in C++ is a synonym for a dynamic array: an array that grows and shrinks automatically in response to the number of elements it stores.

vectors provide fast element insertion and deletion (amortized constant time) at the end of the vector. Amortized constant time insertion means that most of the time insertions are done in constant time O(1) (Chapter 4 explains big-O notation). However, sometimes the vector needs to grow in size to accommodate new elements, which has a complexity of O(N). On average this results in O(1) complexity or amortized constant time. Details are explained in Chapter 16. A vector has slow (linear time) insertion and deletion anywhere else, because the operation must move all the elements “down” or “up” by one to make room for the new element or to fill the space left by the deleted element. Like arrays, vectors provide fast (constant time) access to any of their elements.

You should use a vector in your programs when you need fast access to the elements, but do not plan to often add or remove elements in the middle. A good rule of thumb is to use a vector whenever you would have used an array. For example, a system-monitoring tool might keep a list of computer systems that it monitors in a vector. Only rarely would new computers be added to the list, or current computers removed from the list. However, users would often want to look up information about a particular computer, so lookup times should be fast.

NOTE Use a vector instead of a C-style array whenever possible.

There is a template specialization available for vector<bool> to store Boolean values in a vector. This specialization optimizes space allocation for the Boolean elements; however, the standard does not specify how an implementation of vector<bool> should optimize space. The difference between the vector<bool> specialization and the bitset discussed further in this chapter is that the bitset container is of fixed size.

list

An STL list is a doubly linked list structure and is defined in <list>. Like an array or vector, it stores a sequence of elements. However, unlike an array or vector, the elements of a list are not necessarily in contiguous memory. Instead, each element in the listspecifies where to find the next and previous elements in the list (usually via pointers), hence the name doubly linked list.

The performance characteristics of a list are the exact opposite of a vector. They provide slow (linear time) element lookup and access, but quick (constant time) insertion and deletion of elements once the relevant position has been found. Thus, you should use a list when you plan to insert and remove many elements but do not require quick lookup.

forward_list

The forward_list, defined in <forward_list>, is a singly linked list, compared to the list container, which is doubly linked. The forward_list supports forward iteration only, and requires less memory than a list. Like lists, forward_lists allow constant time insertion and deletion anywhere once the relevant position has been found, and there is no fast random access to elements.

deque

The name deque is an abbreviation for a double-ended queue, although it behaves more like a vector instead of a queue, which is discussed later. A deque, defined in <deque>, provides quick (constant time) element access. It also provides fast (amortized constant time) insertion and deletion at both ends of the sequence, but it provides slow (linear time) insertion and deletion in the middle of the sequence.

You should use a deque instead of a vector when you need to insert or remove elements from either end of the sequence but still need fast access to all elements. However, this requirement does not apply to many programming problems; in most cases a vector or listshould suffice.

array

The <array> header defines array, which is a replacement for standard C-style arrays. Sometimes you know the exact number of elements in your container up front and you don’t need the flexibility of a vector or a list, which are able to grow dynamically to accommodate new elements. array is perfect for such fixed-sized collections and it does not have the same overhead as vector; it’s basically a thin wrapper around standard C-style arrays. There are a number of advantages in using arrays instead of standard C-style arrays; they always know their own size, and do not automatically get cast to a pointer to avoid certain types of bugs. arrays do not provide insertion or deletion; they have a fixed size. The advantage of having a fixed size is that this allows an array to be allocated on the stack, rather than always demanding heap access as vector does. Access to elements is very fast (constant time), just as with vectors.

NOTE The vector, list, forward_list, deque, and array containers are called sequential containers because they store a sequence of elements.

queue

The name queue comes directly from the definition of the English word queue, which means a line of people or objects. The queue container is defined in <queue> and provides standard first in, first out (or FIFO) semantics. A queue is a container in which you insert elements at one end and take them out at the other end. Both insertion (amortized constant time) and removal (constant time) of elements is quick.

You should use a queue structure when you want to model real-life “first-come, first-served” semantics. For example, consider a bank. As customers arrive at the bank, they get in line. As tellers become available, they serve the next customer in line, thus providing “first-come, first-served” behavior. You could implement a bank simulation by storing customer objects in a queue. As customers arrive at the bank, they are added to the end of the queue. As tellers serve customers, they start with customers at the front of the queue.

priority_queue

A priority_queue, also defined in <queue>, provides queue functionality in which each element has a priority. Elements are removed from the queue in priority order. In the case of priority ties, the order in which elements are removed is undefined. priority_queueinsertion and deletion are generally slower than simple queue insertion and deletion, because the elements must be reordered to support the priority ordering.

You can use priority_queues to model “queues with exceptions.” For example, in the preceding bank simulation, suppose that customers with business accounts take priority over regular customers. Many real-life banks implement this behavior with two separate lines: one for business customers and one for everyone else. Any customers in the business queue are taken before customers in the other line. However, banks could also provide this behavior with a single line in which business customers move to the front of the line ahead of any non-business customers. In your program, you could use a priority_queue in which customers have one of two priorities: business or regular. All business customers would be serviced before all regular customers, but each group would be serviced in first-come, first-served order.

stack

The <stack> header defines the stack class, which provides standard first-in, last-out (FILO) semantics, also known as last-in, first-out (LIFO). Like a queue, elements are inserted and removed from the container. However, in a stack, the most recent element inserted is the first one removed. The name stack derives from a visualization of this structure as a stack of objects in which only the top object is visible. When you add an object to the stack, you hide all the objects underneath it.

The stack container provides fast (constant time) insertion and removal of elements. You should use the stack structure when you want FILO semantics. For example, an error-processing tool might want to store errors on a stack so that the most recent error is the first one available for a human administrator to read. Processing errors in a FILO order is often useful because newer errors sometimes obviate older ones.

NOTE Technically, the queue, priority_queue, and stack containers are container adapters. They are simple interfaces built on top of one of the standard sequential containers vector, list, or deque.

set and multiset

The set template is defined in the <set> header file, and, as the name suggests, it is a set of elements, loosely analogous to the notion of a mathematical set: Each element is unique, and there is at most one instance of the element in the set. One difference between the mathematical concept of set, and set as implemented in the STL, is that in the STL the elements are kept in an order. The reason for the order is that when the client enumerates the elements, they come out in the ordering imposed by the type’s operator< or a user-defined comparator. The set provides logarithmic insertion, deletion, and lookup. This means insertions and deletions are faster than for a vector but slower than for a list. Lookups are faster than for a list, but slower than for a vector.

You should use a set when you need the elements to be in an order, have equal amounts of insertion/deletion and lookups, and want to optimize performance for both as much as possible. For example, an inventory-tracking program in a busy bookstore might want to use a set to store the books. The list of books in stock must be updated whenever books arrive or are sold, so insertion and deletion should be quick. Customers also need the ability to look for a specific book, so the program should provide fast lookup as well.

NOTE Use a set instead of a vector or list if you need order and want equal performance for insertion, deletion, and lookup.

Note that a set does not allow duplicate elements. That is, each element in the set must be unique. If you want to store duplicate elements, you must use a multiset, also defined in the <set> header file.

map and multimap

The <map> header defines the map template, which stores key/value pairs. A map keeps its elements in sorted order, based on the key values, not the object values. In all other respects, it is identical to a set. You should use a map when you want to associate keys and values. For example, in the preceding bookstore example, you might want to store the books in a map where the key is the ISBN number of the book and the value is a Book object containing detailed information for that specific book.

A multimap, also defined in <map>, has the same relation to a map as a multiset does to a set. Specifically, a multimap is a map that allows duplicate keys.

Note that you can use a map as an associative array. That is, you can use it as an array in which the index can be any type; for example, a string.

NOTE The set and map containers are called associative containers because they associate keys and values. This term is confusing when applied to sets, because in sets the keys are themselves the values. Both containers sort their elements, so they are called sorted or ordered associative containers.

Unordered Associative Containers / Hash Tables

The STL supports hash tables, also called unordered associative containers. There are four unordered associative containers:

· unordered_map

· unordered_multimap

· unordered_set

· unordered_multiset

The first two are defined in <unordered_map>, the other two in <unordered_set>. Better names would have been hash_map, hash_set, and so on. Unfortunately, hash tables were not part of the C++ standard library before C++11, which means a lot of third-party libraries implemented hash tables themselves by using names with a prefix hash like hash_map. Because of this, the C++ standard committee decided to use the prefix unordered instead of hash to avoid name clashes.

These unordered associative containers behave similar to their ordered counterparts. An unordered_map is similar to a standard map except that the standard map sorts its elements while the unordered_map doesn’t sort its elements.

Insertion, deletion, and lookup with these unordered associative containers can be done on average in constant time. In a worst-case scenario it will be in linear time. Lookup of elements in an unordered container can be much faster than with a normal map or set, especially when there are lots of elements in the container.

Chapter 16 explains how these unordered associative containers work and why they are also called hash tables.

bitset

C and C++ programmers commonly store a set of flags in a single int or long, using one bit for each flag. They set and access these bits with the bitwise operators: &, |, ^, ~, <<, and >>. The C++ standard library provides a bitset class that abstracts this bit field manipulation, so you shouldn’t need to use the bit manipulation operators anymore.

The <bitset> header file defines the bitset container, but this is not a container in the normal sense, in that it does not implement a specific data structure in which you insert and remove elements; they have a fixed size and don’t support iterators. You can think of them as a sequence of Boolean values that you can read and write. However, unlike the normal way this is handled in C programming, the bitset is not limited to the size of an int or other elementary data types. Thus, you can have a 40-bit bitset, or a 213-bit bitset. The implementation will use as much storage as it needs to implement N bits when you declare your bitset with bitset<N>.

Summary of STL Containers

The following table summarizes the containers provided by the STL. It uses the big-O notation introduced in Chapter 4 to present the performance characteristics on a container of N elements. An N/A entry in the table means that the operation is not part of the container semantics.

CONTAINER CLASS NAME

CONTAINER TYPE

INSERTION PERFORMANCE

DELETION PERFORMANCE

LOOKUP PERFORMANCE

vector

Sequential

Amortized O(1) at end; O(N-p) for an insert at position p

O(1) at end; O(N-p) for a delete at position p

O(1)

When to Use: Need quick lookup. Don’t mind slower insertion/deletion. Whenever you would use a standard C-style array that should dynamically grow/shrink in size.

list

Sequential

O(1) once you are at the position where to insert the element

O(1) once you are at the position where to delete the element

O(N); Statistically O(N/2)

When to Use: Need quick insertion/deletion. Don’t mind slower lookup.

forward_list

Sequential

O(1) once you are at the position where to insert the element

O(1) once you are at the position where to delete the element

O(N); Statistically O(N/2)

When to Use: When you need the benefits of a list but require only forward iteration.

deque

Sequential

Amortized O(1) at beginning or end; O(min(N-p, p)) for an insert at position p

O(1) at beginning or end; O(min(N-p, p)) for a delete at position p

O(1)

When to Use: Not usually needed; use a vector or list instead.

array

Sequential

N/A

N/A

O(1)

When to Use: When you need a fixed-size array to replace a standard C-style array.

queue

Container Adapter

Depends on the underlying container; O(1) for list, amortized O(1) for deque

Depends on the underlying container; O(1) for list and deque

N/A

When to Use: When you want a FIFO structure

priority_queue

Container Adapter

Depends on the underlying container; amortized O(log(N )) for vector and deque

Depends on the underlying container; O(log(N)) for vector and deque

N/A

When to Use: When you want a queue with priority.

stack

Container Adapter

Depends on the underlying container; O(1) for list, amortized O(1) for vector and deque

Depends on the underlying container; O(1) for list, vector and deque

N/A

When to Use: When you want a FILO/LIFO structure.

set
multiset

Sorted Associative

O(log(N))

O(log(N))

O(log(N))

When to Use: When you want a sorted collection of elements with equal lookup, insertion, and deletion times.

map
multimap

Sorted Associative

O(log(N))

O(log(N))

O(log(N))

When to Use: When you want a sorted collection to associate keys with values with equal lookup, insertion, and deletion times.

unordered_map
unordered_multimap

Unordered associative / hash table

Average case O(1); worst case O(N)

Average case O(1); worst case O(N)

Average case O(1); worst case O(N)

When to Use: When you want to associate keys with values with equal lookup, insertion, and deletion times and don’t require the elements to be sorted. Performance can be better than with a normal map but that depends on the elements.

unordered_set
unordered_multiset

Unordered associative / hash table

Average case O(1); worst case O(N)

Average case O(1); worst case O(N)

Average case O(1); worst case O(N)

When to Use: When you want a collection of elements with equal lookup, insertion, and deletion times and don’t require the elements to be sorted. Performance can be better than with a normal set but that depends on the elements.

bitset

Special

N/A

N/A

O(1)

When to Use: When you want a collection of flags.

Note that strings are technically containers as well. They can be thought of as vectors of characters. Thus, some of the algorithms described in the material that follows also work on strings.

NOTE vector should be your default container. In practice, insertion and deletion in a vector is often faster than in a list or forward_list. This is because of how memory and caches work on modern CPUs, and because of the fact that for a list orforward_list you first need to iterate to the position where you want to insert or delete an element. Memory for a list or forward_list might be fragmented, so iteration is slower than for a vector.

STL Algorithms

In addition to containers, the STL provides implementations of many generic algorithms. An algorithm is a strategy for performing a particular task, such as sorting or searching. These algorithms are also implemented as templates, so they work on most of the different container types. Note that the algorithms are not generally part of the containers. The STL takes the approach of separating the data (containers) from the functionality (algorithms). Although this approach seems counter to the spirit of object-oriented programming, it is necessary in order to support generic programming in the STL. The guiding principle of orthogonality maintains that algorithms and containers are independent, with (almost) any algorithm working with (almost) any container.

NOTE Although the algorithms and containers are theoretically independent, some containers provide certain algorithms in the form of class methods because the generic algorithms do not perform well on those particular containers. For example,sets provide their own find() algorithm that is faster than the generic find() algorithm. You should use the container-specific method form of the algorithm, if provided, because it is generally more efficient or appropriate for the container at hand.

Note that the generic algorithms do not work directly on the containers. They use an intermediary called an iterator. Each container in the STL provides an iterator that supports traversing the elements in the container in a sequence. The different iterators for the various containers adhere to standard interfaces, so algorithms can perform their work by using iterators without worrying about the underlying container implementation. The <iterator> header defines a number of helper functions that return specific iterators for containers:

FUNCTION NAME

FUNCTION SYNOPSIS

begin()
end()

Returns a non-const iterator to the first and one past the last element in a sequence.

image

cbegin
()
cend()

Returns a const iterator to the first and one past the last element in a sequence.

image

rbegin
()
rend()

Returns a non-const reverse iterator to the last and one before the first element in a sequence.

image

crbegin
()
crend()

Returns a const reverse iterator to the last and one before the first element in a sequence.

This section gives an overview of what kind of algorithms are available in the STL without giving all the fine points. Consult a Standard Library Reference; for example, http://www.cppreference.com/ or http://www.cplusplus.com/reference/, for the exact prototypes of all the algorithms. The following chapters go deeper in on iterators, algorithms, and containers with coding examples.

NOTE Iterators mediate between algorithms and containers. They provide a standard interface to traverse the elements of a container in sequence, so that any algorithm can work on any container.

There are approximately 90 algorithms in the STL depending on how you count them. The following sections divide these algorithms into different categories. The algorithms are defined in the <algorithm> header file unless otherwise noted. Note that whenever the following algorithms are specified as working on a “sequence” of elements, that sequence is presented to the algorithm via iterators.

NOTE When examining the list of algorithms, remember that the STL is designed with generality in mind, so it adds generality that might never be used, but which, if required, would be essential. You may not need every algorithm, or need to worry about the more obscure parameters that are there for anticipated generality. It is important only to be aware of what’s available in case you ever find it useful.

Non-Modifying Sequence Algorithms

The non-modifying algorithms are those that look at a sequence of elements and return some information about the elements. As “non-modifying” algorithms, they cannot change the values of elements or the order of elements within the sequence. This category contains three types of algorithms. The following tables list and provide brief summaries of the various non-modifying algorithms. With these algorithms, you should rarely need to write a for loop to iterate over a sequence of values.

Search Algorithms

These algorithms do not require the sequence to be sorted.

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

adjacent_find()

Finds the first instance of two consecutive elements that are equal to each other or are equivalent to each other as specified by a predicate.

Linear

find()
find_if()

Finds the first element that matches a value or causes a predicate to return true.

Linear

find_first_of()

Like find, except searches for one of several elements at the same time.

Quadratic

find_if_not()

Finds the first element that causes a predicate to return false.

Linear

search()
find_end()

Finds the first (search()) or last (find_end()) subsequence in a sequence that matches another sequence or whose elements are equivalent, as specified by a predicate.

Quadratic

search_n()

Finds the first instance of n consecutive elements that are equal to a given value or relate to that value according to a predicate.

Linear

Comparison Algorithms

The following comparison algorithms are provided. None of them require the source sequences to be ordered. All of them have a linear worst-case complexity.

ALGORITHM NAME

ALGORITHM SYNOPSIS

equal()

Determines if two sequences are equal by checking if parallel elements are equal or match a predicate.

mismatch()

Returns the first element in each sequence that does not match the element in the same location in the other sequence.

lexicographical_compare()

Compares two sequences to determine their “lexicographical” ordering. Compares each element of the first sequence with its equivalent element in the second. If one element is less than the other, that sequence is lexicographically first. If the elements are equal, compares the next elements in order.

Utility Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

all_of()

Returns true if the predicate returns true for all the elements in the sequence or if the sequence is empty; false otherwise.

any_of()

Returns true if the predicate returns true for at least one element in the sequence; false otherwise.

none_of()

Returns true if the predicate returns false for all the elements in the sequence or if the sequence is empty; false otherwise.

count()
count_if()

Counts the number of elements matching a value or that cause a predicate to return true.

Modifying Sequence Algorithms

The modifying algorithms modify some or all of the elements in a sequence. Some of them modify elements in place, so that the original sequence changes. Others copy the results to a different sequence so that the original sequence is unchanged. All of them have a linear worst-case complexity. The following table summarizes the modifying algorithms:

ALGORITHM NAME

ALGORITHM SYNOPSIS

copy()
copy_backward()

Copies elements from one sequence to another.

copy_if()

Copies elements for which a predicate returns true from one sequence to another.

copy_n()

Copies n elements from one sequence to another.

fill()

Sets all elements in the sequence to a new value.

fill_n()

Sets the first n elements in the sequence to a new value.

generate()

Calls a specified function to generate a new value for each element in the sequence.

generate_n()

Calls a specified function to generate a new value for the first n elements in the sequence.

move()
move_backward()

Moves elements from one sequence to another. This uses efficient move semantics.

remove()
remove_if()
remove_copy()
remove_copy_if()

Removes elements that match a given value or that cause a predicate to return true, either in place or by copying the results to a different sequence.

replace()
replace_if()
replace_copy()
replace_copy_if()

Replaces all elements matching a value or that cause a predicate to return true with a new element, either in place or by copying the results to a different sequence.

reverse()
reverse_copy()

Reverses the order of the elements in the sequence, either in place or by copying the results to a different sequence.

rotate()
rotate_copy()

Swaps the first and second “halves” of the sequence, either in place or by copying the results to a different sequence. The two subsequences to be swapped need not be equal in size.

shuffle()
random_shuffle()

Shuffles the sequence by randomly reordering the elements. It is possible to specify the properties of the random number generator used for shuffling. random_shuffle() is deprecated since C++14.

transform()

Calls a unary function on each element of a sequence or a binary function on parallel elements of two sequences.

unique()
unique_copy()

Removes consecutive duplicates from the sequence, either in place or by copying results to a different sequence.

Operational Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

for_each()

Executes a function on each element in the sequence. This algorithm has a linear complexity and does not require the source sequence to be ordered.

Swap Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

iter_swap()
swap_ranges()

Swaps two elements or sequences of elements.

swap()

Swaps two values, defined in the <utility> header.

Partition Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

is_partitioned()

Returns true if all elements for which a predicate returns true are before all elements for which it returns false.

Linear

partition()

Sorts the sequence such that all elements for which a predicate returns true are before all elements for which it returns false, without preserving the original order of the elements within each partition.

Linear

stable_partition()

Sorts the sequence such that all elements for which a predicate returns true are before all elements for which it returns false, while preserving the original order of the elements within each partition.

Linear Logarithmic

partition_copy()

Copies elements from one sequence to two different sequences. The target sequence is selected based on the result of a predicate, either true or false.

Linear

partition_point()

Returns an iterator such that all elements before this iterator return true for a predicate and all elements after this iterator return false for that predicate.

Logarithmic

Sorting Algorithms

The STL provides several different sorting algorithms with varying performance guarantees.

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

is_sorted()
is_sorted_until()

Checks if a sequence is sorted or which subsequence is sorted.

Linear

nth_element()

Relocates the nth element of the sequence such that the element in the position pointed to by nth is the element that would be in that position if the whole range were sorted, and it rearranges all elements such that all elements preceding the nth element are less than the new nth element, and the ones following it are greater than the new nth element.

Linear

partial_sort()
partial_sort_copy()

Partially sorts the sequence: The first n elements (specified by iterators) are sorted; the rest are not. They are sorted either in place or by copying them to a new sequence.

Linear Logarithmic

sort()
stable_sort()

Sorts elements in place, either preserving the order of duplicate elements or not.

Linear Logarithmic

Binary Search Algorithms

The following binary search algorithms require the sequence to be at least partitioned on the element that is searched for. This could, for example, be achieved by applying std::partition(). A sorted sequence also meets this requirement. All these algorithms have logarithmic complexity.

ALGORITHM NAME

ALGORITHM SYNOPSIS

lower_bound()
upper_bound()
equal_range()

Finds the beginning, (lower_bound()), end (upper_bound()), or both sides (equal_range()) of the range including a specified element.

binary_search()

Finds a value in a sequence.

Set Algorithms

Set algorithms are special modifying algorithms that perform set operations on sequences. They are most appropriate on sequences from set containers, but work on sorted sequences from most containers.

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

inplace_merge()

Merges two sorted sequences in place.

Linear Logarithmic

merge()

Merges two sorted sequences by copying them to a new sequence.

Linear

includes()

Determines if every element from one sequence is in another sequence.

Linear

set_union()
set_intersection()
set_difference()
set_symmetric_difference()

Performs the specified set operation on two sorted sequences, copying results to a third sorted sequence.

Linear

Heap Algorithms

A heap is a standard data structure in which the elements of an array or sequence are ordered in a semi-sorted fashion so that finding the “top” element is quick. Six algorithms allow you to work with heaps.

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

is_heap()

Checks if a range of elements is a heap.

Linear

is_heap_until()

Finds the largest subrange in the given range of elements that is a heap.

Linear

make_heap()

Creates a heap from a range of elements.

Linear

push_heap()
pop_heap()

Adds or removes an element from the heap.

Logarithmic

sort_heap()

Converts the heap into a range of ascending sorted elements.

Linear Logarithmic

Minimum/Maximum Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

min()
max()

Returns the minimum or maximum of two or more values.

minmax()

Returns the minimum and maximum of two or more values as a pair.

min_element()
max_element()

Finds the minimum or maximum element in a sequence.

minmax_element()

Finds the minimum and maximum element in a sequence and returns the result as a pair.

Numerical Processing Algorithms

The <numeric> header provides the following numerical processing algorithms. None of them require the source sequences to be ordered. All of them have a linear complexity.

ALGORITHM NAME

ALGORITHM SYNOPSIS

accumulate()

“Accumulates” the values of all the elements in a sequence. The default behavior is to sum the elements, but the caller can supply a different binary function instead.

adjacent_difference()

Generates a new sequence in which each element is the difference (or other binary operation) of the parallel element, and its predecessor, in the source sequence.

inner_product()

Similar to accumulate(), but works on two sequences. Calls a binary function (multiplication by default) on parallel elements in the sequences, accumulating the result using another binary function (addition by default). If the sequences represent mathematical vectors, the algorithm calculates the dot product of the vectors.

iota()

Fills a sequence with successively incrementing values starting with a given value.

partial_sum()

Generates a new sequence in which each element is the sum (or other binary operation) of the parallel element, and all preceding elements, in the source sequence.

Permutation Algorithms

ALGORITHM NAME

ALGORITHM SYNOPSIS

COMPLEXITY

is_permutation()

Returns true if the elements in one range are a permutation of the elements in another range.

Quadratic

next_permutation()
prev_permutation()

Modifies the sequence by transforming it into its lexicographical “next” or “previous” permutation. Successive calls to one or the other will permute the sequence into all possible permutations of its elements if you start with a properly sorted sequence. Returns false if no more permutations exist.

Linear

Choosing an Algorithm

The number and capabilities of the algorithms might overwhelm you at first. It can also be difficult to see how to apply them in the beginning. However, now that you have an idea of the available options, you are better able to tackle your program designs. The next chapters cover the details of how to use these algorithms in your code.

What’s Missing from the STL

The STL is powerful, but it’s not perfect. Here is a list of omissions and unsupported functionality:

· The STL does not guarantee any thread safety for accessing containers simultaneously from multiple threads.

· The STL does not provide any generic tree or graph structures. Although maps and sets are generally implemented as balanced binary trees, the STL does not expose this implementation in the interface. If you need a tree or graph structure for something like writing a parser, you will need to implement your own or find an implementation in another library.

It is important to keep in mind that the STL is extensible. You can write your own containers or algorithms that will work with existing algorithms or containers. So, if the STL doesn’t provide exactly what you need, consider writing your desired code such that it works with the STL. Chapter 20 covers the topic of customizing and extending the STL.

SUMMARY

This chapter provided an overview of the C++ standard library, which is the most important library that you will use in your code. It subsumes the C library and includes additional facilities for strings, I/O, error handling, and other tasks. It also includes generic containers and algorithms, which are together referred to as the standard template library (STL). The next chapters describe the standard template library in more detail.