Fundamental Programming Concepts - From Mathematics to Generic Programming (2015)

From Mathematics to Generic Programming (2015)

10. Fundamental Programming Concepts

All humans naturally desire to know.
Aristotle, Metaphysics I, 1

In this chapter we will introduce some of the important ideas associated with generic programming, including concepts and iterators. We’ll also consider some common programming tasks that rely on them. But we’ll start by looking at the origins of the notion of abstraction.

10.1 Aristotle and Abstraction

The School of Athens, a famous painting by Italian Renaissance painter Raphael, depicts many ancient Greek philosophers (see detail of painting on the next page). At the center are Plato and Aristotle, the two most important philosophers of the ancient world. Plato is pointing upward, while his student Aristotle holds his hand out over the ground. According to popular interpretation, Plato is pointing to the heavens, indicating that we should contemplate the eternal, while Aristotle is indicating that we need to study the world. In fact, it could be said that Plato invented mathematics and Aristotle invented the study of everything else, especially science. Aristotle’s works cover everything from aesthetics to zoology.

Image


Aristotle (384 BC–322 BC)

Image

Aristotle came from Stageira, a city in the far north of Greece. We know very little about his early life, but we know that at some point he decided to move to Athens in search of wisdom. He studied (and at some point, probably taught) at Plato’s Academy for about 20 years. Around the time of Plato’s death—perhaps disappointed because he was not appointed Plato’s successor—he left Athens.

In 343 BC, King Philip of Macedon appointed Aristotle to be a tutor for his son Alexander and Alexander’s companions. We don’t know much about Aristotle’s relationship with the prince, but in later years when Alexander became king and began the Asian conquests that earned him the nickname “the Great,” he sent Aristotle exotic plant and animal specimens for the philosopher’s collection.

Around 335 BC, Aristotle returned to Athens and created his own great school, the Lyceum. During the next 12 years, he produced the most astonishing collection of knowledge ever assembled. While his teacher Plato had focused on studying eternal truths, Aristotle wanted to understand the world as it really was. For example, when Plato wanted to write about politics, he described what the ideal society would be. When Aristotle wrote about politics, he asked his students to visit each of the important Greek city-states and then report on their constitutions. Aristotle’s approach was to observe everything, describe it, and come up with explanations for what he saw.

Aristotle taught, and wrote about, nearly every subject imaginable. According to several important writers of the period, Aristotle wrote beautifully; unfortunately the books he intended for publication, such as his dialogues, have all been lost. What we have instead are terse treatises that were probably lecture notes. Nevertheless, many of his works, such as Nicomachean Ethics, Politics, and Metaphysics are still essential reading today. And regardless of their style, his collected works constitute the first encyclopedic treatment of knowledge.

Although Aristotle’s scientific descriptions include factual errors, he was the first person to systematically describe the scientific world, with observations on topics as detailed as how the octopus reproduces.

Aristotle left Athens around 322 BC and died soon afterward. While other philosophical traditions (e.g., Stoicism, Platonism) persisted in ancient times long after their founders’ deaths, Aristotle’s did not. Greek philosophy became increasingly introspective and lost interest in Aristotle’s idea of studying observable reality. The Lyceum rapidly declined in importance, and few scholars in later Greek and Roman times considered themselves Aristotelians. Even when his work was rediscovered in the Middle Ages, medieval scholars slavishly studied his writings rather than following his methodology of going out and observing the world. Aristotle’s great legacy is his empirical approach, which is the foundational principle of all modern science. Furthermore, the organization of knowledge reflected in the departments of modern universities is a direct descendent of the taxonomy that Aristotle proposed.


Aristotle’s writings were preserved toward the end of the first millennium AD by Arab philosophers. In the 12th century, when Christian kingdoms recaptured much of Spain from the Islamic state known as Al-Andalus, they found a library in the Spanish city of Toledo containing a great number of books in Arabic, including translations of Aristotle’s works. Eventually the books were translated from the original Greek into Latin, and an Aristotelian renaissance spread through Europe. Aristotle became known as simply “the Philosopher,” and his works became a part of generally accepted knowledge. The great Andalusian philosopher Ibn Rushd (known as Averroes to Europeans) wrote widely read commentaries on Aristotle, reconciling his philosophy with the teachings of Islam; he became known simply as “the Commentator.” In the 13th century, Christian scholars such as Thomas Aquinas and Duns Scotus similarly showed that Aristotelianism was compatible with Christianity; this was often described as “baptizing” Aristotle. As a result, Aristotle’s works were part of the required study at European universities for literally hundreds of years.

Among Aristotle’s most important works is the Organon, a collection of six treatises on various aspects of logic that would define the field for the next 2600 years.1 In Categories, the first part of the Organon, Aristotle introduced the notion of abstraction. He wrote about the distinction between an individual, a species, and a genus. While today most people think of these as biological distinctions, for Aristotle they applied to everything. A species incorporates all the “essential” properties of a type of thing. A genus may contain many species, each identified by its differentia—the things that differentiate it from other species in the genus.

1 Unlike Aristotle’s other works, a Latin version of the Organon was available to Europeans much earlier; Boethius provided the translation in the early 6th century.

It is Aristotle’s idea of genus that inspired the term generic programming—a way to think about programming that focuses on the level of genera (the plural of genus) rather than species.

10.2 Values and Types

Now we will see how some of the ideas we’ve been discussing fit in computer programming. First, we need a few definitions:

Definition 10.1. A datum is a sequence of bits.

01000001 is an example of a datum.

Definition 10.2. A value is a datum together with its interpretation.

A datum without an interpretation has no meaning. The datum 01000001 might have the interpretation of the integer 65, or the character “A,” or something else altogether. Every value must be associated with a datum in memory; there is no way to refer to disembodied values in a language like C++ or Java.

Definition 10.3. A value type is a set of values sharing a common interpretation.

Definition 10.4. An object is a collection of bits in memory that contain a value of a given value type.

There is nothing in the definition that says that all the bits of an object must be contiguous. In fact, it’s quite common for parts of an object to be located at different places in memory; these are called remote parts.

An object is immutable if the value never changes, and mutable otherwise. An object is unrestricted if it can contain any value of its value type.

Definition 10.5. An object type is a uniform method of storing and retrieving values of a given value type from a particular object when given its address.

What we call “types” in programming languages are object types. C++, Java, and other programming languages do not provide mechanisms for defining value types.2 Every type resides in memory and is an object type.

2 The iterator trait value_type in C++, despite its name, does not actually return a value type in the sense described here. Rather, it returns the object type of the value pointed to by the iterator.

10.3 Concepts

The essence of generic programming lies in the idea of concepts. A concept is a way of describing a family of related object types. The relationship between concept and type is exactly the relationship between theory and model in mathematics, and between genus and species in the scientific taxonomy introduced by Aristotle.

Image

Here are some examples of concepts and some of their types in C++:

Integral:3 int8_t, uint8_t, int16_t, ...

3 This refers specifically to the list of built-in C++ integral types. There is a broader concept Integer that includes all of these plus other representations of integers such as infinite-precision integers.

UnsignedIntegral: uint8_t, uint16_t, ...

SignedIntegral: int8_t, int16_t, ...

While concepts exist in many languages implicitly, very few languages provide an explicit way to talk about them.4

4 There have been proposals to include concepts in C++; the work is still in progress. There are also concept-like features in some functional programming languages such as Haskell.

Many programming languages provide a mechanism to specify the interface of a type to be implemented later: abstract classes in C++, interfaces in Java, and so on. However, these mechanisms completely specify the interface, including strict requirements on the types of arguments and return values. In contrast, concepts allow interfaces to be specified in terms of families of related types. For example, in both Java and C++, you can specify an interface containing a function size() returning a value of type int32. In the world of concepts, you can have an interface with a function size() that returns a value of any integral type—uint8, int16, int64, etc.

A concept can be viewed as a set of requirements on types, or as a predicate5 that tests whether types meet those requirements. The requirements concern

5 A predicate is a function that returns true or false.

• The operations the types must provide

• Their semantics

• Their time/space complexity

A type is said to satisfy a concept if it meets these requirements.

When first encountering concepts, programmers often wonder why the third requirement, for space/time complexity, is included. Isn’t complexity just an implementation detail? To answer this question, consider a real-world example. Suppose you defined the abstract data type stack, but you implemented it as an array, in such a way that every time you pushed something onto the array, you had to move every existing element to the next position to make room. Instead of pushing on the stack being a fast operation (constant time), it’s now a slow operation (linear time). This violates a programmer’s assumption of how stacks should behave. In a sense, a stack that doesn’t have fast push and pop is not really a stack. Therefore these very basic complexity constraints are part of what it means to satisfy a concept.

* * *

Two useful ideas related to concepts are type functions and type attributes. A type function is a function that, given a type, returns an affiliated type. For example, it would be nice to have type functions like this:

• value_type(Sequence)

• coefficient_type(Polynomial)

• ith_element_type(Tuple, size_t)

Unfortunately, mainstream programming languages do not contain type functions, even though they would be easy to implement. (After all, the compiler already knows things like the type of elements in a sequence.)

A type attribute is a function that, given a type, returns a value representing one of its attributes. For example:

• sizeof

• alignment_of

• Number of members in a struct

• Name of the type

Some languages provide some type attributes, like sizeof() in C and C++.

* * *

Let’s look at some very general concepts. The first one is Regular,6 which we introduced back in Chapter 7. Roughly speaking, a type is regular if it supports these operations:

6 By convention, we write concept names with initial capitals, and display them with a Sans Serif typeface.

• Copy construction

• Assignment

• Equality

• Destruction

Having a copy constructor implies having a default constructor, since T a(b) should be equivalent to T a; a = b;. To describe the semantics of Regular, we’ll express the requirements as axioms:

Image

The first axiom says that if you copy construct a from b, then anything that was equal to b will now also be equal to a. The second axiom says that if you assign b to a, then anything that was equal to b will now also be equal to a. The third axiom uses the notion of a regular function (not to be confused with a regular type), which is one that produces equal results given equal inputs. It’s the responsibility of the programmer to specify which functions are supposed to be regular; only then can other programmers (or, in the future, the compiler) rely on the fact that these functions will preserve equality.

The complexity requirements on Regular are that each operation is no worse than linear in the area of the object, where area is defined as all space occupied by the object, both its header and its remote parts, both its data and its connectors.7

7 For a more formal treatment of the concept Regular, see Elements of Programming Section 1.5.

The concept Regular is universal—it’s not specific to any programming language. A type in any language is regular if it satisfies the requirements.

A related concept is Semiregular, which is just like Regular except that equality is not explicitly defined. This is needed in a few situations where it is very difficult to implement an equality predicate. Even in these situations, equality is assumed to be implicitly defined, so that axioms that control copying and assignment are still valid. After all, as we saw earlier, the meaning of assigning a to b is that b’s value will be equal to a’s value afterward.

10.4 Iterators

An iterator is a concept used to express where we are in a sequence. In fact, iterators were originally going to be called “coordinates” or “positions”; they may be viewed as a generalization of pointers. In some programming languages, what they call iterators are heavyweight bundles of functionality, but the concept of an iterator just expresses this very simple notion of position.

To be an iterator, a type must support three operations:

• Regular type operations

• Successor

• Dereference

One way to think of an iterator is “something that lets you do linear search in linear time.” The essence of an iterator is the notion of successor. Indeed, iterators come to us directly from Peano’s axioms; essentially, the concept Iterator is “a theory with successor.” However, our iterator concepts will be less strict, because we don’t require all of Peano’s axioms. For example, in Peano arithmetic, every number has a successor, while with iterators, this is not always the case—sometimes we get to the end of our data. Peano also tells us that if successors are equal, the predecessors must be equal, and that we can’t have loops. These requirements are also not the case for programmers; we’re allowed to have data structures that link back to earlier elements and form loops. In fact, this is sometimes exactly what we need to do a computational task efficiently.

The second iterator operation, dereferencing, is a way to get from an iterator to its value. Dereferencing has a time complexity requirement; it’s assumed to be “fast,” which means that there is not a faster way of getting to data than through the iterator. Iterators are sometimes bigger than traditional pointers, in situations where they need to store some additional state to enable fast navigation. Iterators may also support special values indicating that we are past the end of the object, as well as singular values like the null pointer that cannot be dereferenced. It’s okay that dereferencing is a partial function (i.e., that it isn’t defined for all values); after all, mathematicians have no trouble saying what division is, even though division by zero is not defined.

Dereferencing and successor are closely connected,8 and this relationship imposes the following restrictions:

8 See Chapter 6 of Elements of Programming for more about the relationship between dereferencing and successor.

• Dereferencing is defined on an iterator if and only if successor is defined.

• If you are not at the end of the range, you can dereference.

Why do we need equality as a requirement for iterators? In other words, why do we need iterators to be regular rather than semiregular? Because we need to be able to see when one iterator reaches another.

10.5 Iterator Categories, Operations, and Traits

There are several kinds of iterators, which we call iterator categories. Here are the most important:

Input iterators support one-directional traversal, but only once, as is found in single-pass algorithms. The canonical model of an input iterator is the position in an input stream. Bytes are coming over the wire and we can process them one at a time, but once they are processed, they are gone. In particular, with input iterators i == j does not imply ++i == ++j; for example, if you’ve already consumed a character from an input stream, you can’t consume the same character again with a different iterator. Keep in mind that just because an algorithm only requiresinput iterators does not mean it is limited to operating on input streams.

Forward iterators also support only one-directional traversal, but this traversal can be repeated as needed, as in multi-pass algorithms. The canonical model of a forward iterator is the position in a singly linked list.9

9 We assume that link structure of the list is not modified as it is traversed.

Bidirectional iterators support bidirectional traversal, repeated as needed (i.e., they also can be used in multi-pass algorithms). The canonical model of a bidirectional iterator is the position in a doubly linked list. Bidirectional iterators have an invertible successor function: if an element x has a successor y, then y has a predecessor.

Random-access iterators support random-access algorithms; that is, they allow access to any element in constant time (both far and fast). The canonical model is the position in an array.

In addition, there is another common iterator category that behaves differently from the others:

Output iterators support alternating successor (++) and dereference (*) operations, but the results of dereferencing an output iterator can appear only on the left-hand side of an assignment operator, and they provide no equality function. The canonical model of an output iterator is the position in an output stream. We can’t define equality because we can’t even get to the elements once they’ve been output.

While the iterators described so far are the only ones included in C++, other useful iterator concepts also exist:

Linked iterators work in situations where the successor function is mutable (for example, a linked list where the link structure is modified).

Segmented iterators are for cases where the data is stored in noncontiguous segments, each containing contiguous sequences. std::deque, a data structure that is implemented as a segmented array, would immediately benefit; instead of needing each successor operation to check whether the end of the segment has been reached, a “top level” iterator could find the next segment and know its bounds, while the “bottom level” iterator could iterate through that segment.

Iterators like these can easily be implemented. Just because a concept is not built into the language does not mean it’s not useful. In general, STL should be viewed as a set of well-chosen examples, not an exhaustive collection of all useful concepts, data structures and algorithms.

* * *

A simple but important thing we may want to do is find the distance between two iterators. For an input iterator, we might write our distance() function like this:

template <InputIterator I>
DifferenceType<I> distance(I f, I l, std::input_iterator_tag) {
// precondition: valid_range(f, l)
DifferenceType<I> n(0);
while (f != l) {
++f;
++n;
}
return n;
}

There are three notable things about this code: the use of the type function DifferenceType, the use of the iterator tag argument, and the precondition. We’ll discuss all of these soon, but before we do, let’s compare this to a different implementation—one that’s optimized for random access iterators:

template <RandomAccessIterator I>
DifferenceType<I> distance(I f, I l,
std::random_access_iterator_tag) {
// precondition: valid_range(f, l)
return l - f;
}

Since we have random access, we don’t have to repeatedly increment (and count) from one iterator to the other; we can just use a constant time operation—subtraction—to find the distance.

The difference type of an iterator is an integral type that is large enough to encode the largest possible range. For example, if our iterators were pointers, the difference type in C++ could be ptrdiff_t. But in general we don’t know in advance which type the iterator will be, so we need a type function to get the difference type. Although C++ does not have a general mechanism for type functions, STL iterators have a special set of attributes known as iterator traits, one of which gives us the difference type. The complete set of iterator traits is

• value_type

• reference

• pointer

• difference_type

• iterator_category

We’ve mentioned value_type before; it returns the type of the values pointed to by the iterator. The reference and pointer traits are rarely used in current architectures,10 but the others are very important.

10 Earlier versions of the Intel processor architecture included different types for shorter and longer pointers, so it was important to know which to use for a given iterator. Today, if the value type of an iterator is T, the pointer iterator trait would normally be T*.

Since the syntax for accessing iterator traits is rather verbose, we’ll implement our own type function for accessing difference_type, with the using construct of C++11. (See Appendix C for more information about using.)

template <InputIterator I>
using DifferenceType =
typename std::iterator_traits<I>::difference_type;

This gives us the DifferenceType type function used in the earlier code.

The iterator trait iterator_category returns a tag type representing the kind of iterator we’re dealing with. Objects of these tag types contain no data. As we did for DifferenceType, we define the following type function:

template <InputIterator I>
using IteratorCategory =
typename std::iterator_traits<I>::iterator_category;

Now we can return to the use of the iterator tag argument in the distance functions. The iterator tags shown in the examples (input_iterator_tag and random_access_iterator_tag) are possible values of the iterator category trait, so by including them as arguments, we are distinguishing the type signature of the two function implementations. (We will see more examples of this in Chapter 11.) This allows us to perform category dispatch on the distance function; that is, we can write a general form of the function for any iterator category, and the fastest one will be invoked:

template <InputIterator I>
DifferenceType<I> distance(I f, I l) {
return distance(f, l, IteratorCategory<I>());
}

Note that the third argument is actually a constructor call creating an instance of the appropriate type, because we cannot pass types to functions. When the client calls distance(), it uses the two-argument version shown here. That function then invokes the implementation that matches the iterator category. This dispatch happens at compile time and the general function is inline, so there is literally no performance penalty for choosing the right version of the function.

The use of tag types as arguments to distinguish versions of the function may seem redundant, since we already specified different concepts in the templates. However, recall that our use of concepts serves only as documentation for the programmer; current C++ compilers don’t know anything about concepts. Once concepts are added to the language, the arcane iterator category tag mechanism will no longer be needed.

10.6 Ranges

A range is a way of specifying a contiguous sequence of elements. Ranges can be either semi-open or closed;11 a closed range [i, j] includes items i and j, while a semi-open range [i, j) includes i but ends just before j. It turns out that semi-open ranges are the most convenient for defining interfaces. This is because algorithms that operate on sequences of n elements need to be able to refer to n + 1 positions. For example, there are n + 1 places to insert a new item: before the first element, between any two elements, or after the last element. Also, semi-open ranges, unlike closed ranges, can describe an empty range. Furthermore, a semi-open empty range can be specified at any position; it provides more information than a simple “nil” or empty list.

11 In mathematics, there are also open ranges, but they are less useful in programming, so we do not include them here.

A range can be specified in one of two ways: a bounded range has two iterators (one pointing to the beginning and one pointing just past the end), while a counted range has an iterator pointing to the beginning and an integer n indicating how many items are included. This gives us four kinds of ranges altogether:

Image

(A closed counted range must have n > 0.) As we shall see, there are different situations where bounded or counted ranges are preferable.

While mathematical texts index sequences from 1, computer scientists start from 0, and we will use the latter convention for our ranges. Interestingly, although 0-based indexing in computer science was initially used as a way to indicate the offset in memory, this convention turns out to be more natural regardless of implementation, since it means that for a sequence with n elements, the indices are in the range [0,n) and any iteration is bounded by the length.

* * *

Now we can return to the third notable feature of our distance functions: the valid_range precondition. It would be nice if we could have a valid_range function that returned true if the range specified by the two iterators was valid and false otherwise, but unfortunately, it’s not possible to implement such a function. For example, if two iterators each represent cells in a linked list, we have no way of knowing if there’s a path from one to the other. But even if we’re dealing with simple pointers, we still cannot compute valid_range: there is no way in C or C++ to determine if two pointers point to a single contiguous block of memory; there might be gaps in the middle.

So we can’t write a valid_range function, but we can still use it as a precondition. Instead of guaranteeing the correct behavior in code, we’ll use axioms that, if satisfied, ensure that our distance function will behave as intended. Specifically, we postulate the following two axioms:

Image

The first axiom says that if it’s a container, the range from begin() to end() is valid. The second axiom says that if [x,y) is a nonempty valid range, then the range [successor(x),y) is also valid. All STL-style containers, as well as C++ arrays, must obey these axioms. This allows us to prove the algorithms correct. For example, if you go back to our original distance function for input iterators in Section 10.5, you’ll see that the second axiom ensures that if we start with a valid range, we’ll still have one each time through the loop.

* * *

In addition to the successor (++) and distance operations, it’s useful to have a way to move an iterator by several positions at once. We call this function advance. As before, we’ll implement two versions, one for input iterators:

template <InputIterator I>
void advance(I& x, DifferenceType<I> n, std::input_iterator_tag) {
while (n) {
--n;
++x;
}
}

and another for random access iterators:

template <RandomAccessIterator I>
void advance(I& x, DifferenceType<I> n,
std::random_access_iterator_tag) {
x += n;
}

We’ll also provide with a top-level function for doing the dispatch:

template <InputIterator I>
void advance(I& x, DifferenceType<I> n) {
advance(x, n, IteratorCategory<I>());
}

10.7 Linear Search

Linear search is a fundamental programming task that all programmers should understand. The simplest idea of linear search is to scan a linear list until we find a specific element. But we will generalize that to a function that scans the list until it finds an element that satisfies a given predicate. So in addition to being able to search for a specific value, we could find, for example, the first odd element, or the first element that has no vowels, or whatever else we like. Of course, there might not be such an element, so we need some way to indicate that no item is found. We’ll call our function find_if—“find it if it’s there”:12

12 This name originated in the programming language Common Lisp.

template <InputIterator I, Predicate P>
I find_if(I f, I l, P p) {
while (f != l && !p(*f)) ++f;
return f;
}

This function relies on equality, dereference, and successor. If no item that satisfies the predicate exists, the returned value of f will be the same as l, the iterator that points past the end of the range. The calling function uses this comparison to determine whether a matching item has been found. C and C++ guarantee that a pointer is valid one position past the end of an array. However, such a pointer should never be dereferenced. All STL containers provide similar guarantees for their iterators.

This function has an implicit semantic precondition: the value type of the iterator and the argument type of the predicate must be the same; otherwise, there is no way to apply the predicate to the items in the range.

Here’s a variation of our linear search function for the input iterator case. Although we could have overloaded the name, we’ve deliberately added _n to emphasize that this version uses a counted range:

template <InputIterator I, Predicate P>
std::pair<I, DifferenceType<I>>
find_if_n(I f, DifferenceType<I> n, P p) {
while (n && !p(*f)) { ++f; --n; }
return {f, n};
}

Why do we return a pair? Wouldn’t it be sufficient to return the iterator that points to the found element, as we did in the previous version? No. In the previous version, the caller had the “last” iterator to compare to; here, it does not. So the second returned value tells the caller whether we’re at the end, in which case the item was not found and the returned iterator cannot be dereferenced. But just as importantly, if we do find a matching item, this allows us to restart the search where we left off. Without this, there would be no way to search a range for anything but the first occurrence of a desired item.

This illustrates an important point: just as there can be bugs in code, so there can also be bugs in the interface. We’ll see an example of one in the next section.

10.8 Binary Search

If we have a sorted sequence, we can search it much more efficiently by using another essential algorithm, binary search. This function is easy to describe, but hard to implement correctly and even harder to specify (design an interface for) correctly.


Origins of Binary Search

The idea of binary search originated in the Intermediate Value Theorem (IVT), also known as the Bolzano-Cauchy Theorem:

If f is a continuous function in an interval [a, b] such that f(a) < f(b), then ∀u Image [f(a), f(b)] there is c Image [a, b] such that u = f(c).

The proof of the theorem consists of doing binary search. Suppose, as an example, we have a continuous function such that f(a) is –3 and f(b) is 5. The IVT tells us that for a particular point in the image of the function—let’s say, 0—there is a point c in the domain such that f(c) = 0. How can we find that point? We can start by finding the point x1 that is half the distance between a and b, and computing f(x1). If it equals 0, we’re done; we’ve found c. If it’s greater than 0, we repeat with a point x2 that’s half the distance from a to x1. If it’s less, we take half the distance from x1 to b. As we keep repeating the process, we asymptotically converge to c.

Simon Stevin had a similar idea in 1594, when he devised a version of the IVT for polynomials. But since Stevin was interested in doing everything with decimals, he actually divided the interval into tenths rather than halves, and examined all of them until he found the tenth that contained the desired value. It was Lagrange who first described the binary approach for polynomials in 1795. Bolzano and Cauchy generalized the IVT in the early 19th century, and it is their version used by mathematicians today.


Binary search was first discussed as a programming technique in 1946 by physicist and computing pioneer John Mauchly, co-creator of the ENIAC (the first electronic general-purpose computer). However, many details were left unspecified. The first “correct” algorithm for performing binary search was published in 1960 by D. H. Lehmer, a mathematician who had worked on the ENIAC years earlier. However, Lehmer’s version did not use the correct interface, and this error was repeated for several decades afterward.

An example of the erroneous interface remains today in the UNIX bsearch() function. According to the POSIX13 standard:

13 POSIX is the set of standards for UNIX-like operating systems. Linux, for example, is a POSIX-compliant OS.

The bsearch() function shall return a pointer to a matching member of the array, or a null pointer if no match is found. If two or more members compare equal, which member is returned is unspecified.14

14 http://www.unix.com/man-page/POSIX/3posix/bsearch/

There are two fundamental flaws with this interface. The first concerns the return of a null pointer to indicate that the item is not found. Often you are doing the search because you want to insert an item if it isn’t there, at the place it would have been if it were there. With this interface, you have to start from scratch to find your insert position, this time using linear search! Furthermore, there are many applications where you actually want to find the closest or next value to where the missing item would be; the item you’re searching for may simply be a prefix of the kinds of items you hope to find.

The second flaw concerns the situation where there are multiple matches. The matches may be keys of items that you need to retrieve. So how do you obtain the entire range of equal items, if you don’t know which one you’re on? You have to do linear search both backward and forward to find the ends of the matching sequence.

* * *

The right way to implement binary search begins with the idea of a partition point. Assume that we have a sequence of items [f, l) that is arranged such that a certain predicate is true of the items in the range [f, m) and false for [m, l).15 Then the partition point is the position m. Formally, if we have a function that computes the partition point, then its precondition is

Image

15 In retrospect, it would have been better to have the false items first, since the Boolean value false sorts before true; unfortunately, the “wrong” order is now part of the C++ language standard.

(i.e., the elements are already partitioned as described earlier). Its postcondition is that it returns the value m from the precondition. Note that f in this expression refers to the first element in the range, not to a function.

For a counted range, the partition point algorithm looks like this:

template <ForwardIterator I, Predicate P>
I partition_point_n(I f, DifferenceType<I> n, P p) {
while (n) {
I middle(f);
DifferenceType<I> half(n >> 1);
advance(middle, half);
if (!p(*middle)) {
n = half;
} else {
f = ++middle;
n = n - (half + 1);
}
}
return f;
}

This is an extremely important algorithm, and it’s worth spending some time to make sure you understand it. It uses a binary-search-style strategy just like the Intermediate Value Theorem. Recall that the goal is to return the first “bad” element—that is, the first element for which the predicate is false. The outer loop continues until n is zero. Inside the loop, we position the iterator middle to a point halfway between f and f + n. If our predicate is false for the element at that position, we set n to half of its previous value and repeat. Otherwise, we know the predicate is true, so we set our starting point f to the next value after the middle, adjust n to reflect the number of items left, and repeat. When we are done, the return value is the partition point: the point after the last true value.

Notice that our function uses advance to move the iterator. Since we don’t know the iterator type, we can’t assume that addition is allowed. However, if we have a random access iterator, the advance function we wrote earlier will call the fast implementation. (If we don’t have a random access iterator, we might have to make as many as n moves, but in either case we’ll never have to do more than log n comparisons.)

If we are given a bounded range instead, we simply compute the distance and invoke the counted range version:

template <ForwardIterator I, Predicate P>
I partition_point(I f, I l, P p) {
return partition_point_n(f, distance(f, l), p);
}

Now let’s return to the general binary search problem. To solve it, we’re going to make use of the following lemma:

Lemma 10.1 (Binary Search Lemma): For any sorted range [i,j) and a value a (the item you’re searching for), there are two iterators, lower bound bl and upper bound bu such that

Image

where vk is the value at position k.

These bounds always exist, though in the special case where there is no matching item, bl = bu. It may help to picture the data in the range:

Image

Exercise 10.1. Prove the Binary Search Lemma.

Now we can use our partition point function to perform binary search. If we have total ordering, any sorted range is partitioned for any value a according to the predicate x < a. STL actually provides a few functions that perform binary search, depending on our task. If we want to find thefirst position where the item is found, we use the lower_bound function. Using the features in C++11, we can write lower_bound like this:

template <ForwardIterator I>
I lower_bound(I f, I l, ValueType<I> a) {
return partition_point(f, l,
[=](ValueType<I> x) { return x < a; });
}

The last line defines a new anonymous function (also known as a lambda expression16) that returns true if its argument is less than a, then passes that function as the predicate used by partition_point. The lower_bound function returns the position of the item a, or the position it would be in if it were there. ValueType is just a shorthand type function for accessing the appropriate iterator trait, like the one we wrote earlier for DifferenceType:

template <InputIterator I>
using ValueType = typename std::iterator_traits<I>::value_type;

16 See Appendix C for details on how to use lambda expressions.

In the case where the returned position is not l, we still need to know whether we found the item. To do this, the caller needs to check whether the dereferenced return value is equal to a.

If instead we want to find the last position where the item is found, we use upper_bound. The code is almost the same, except that we define the predicate to check if the value is less than or equal to a, rather than strictly less than a:

template <ForwardIterator I>
I upper_bound(I f, I l, ValueType<I> a) {
return partition_point(f, l,
[=](ValueType<I> x) {return x <= a;});
}

Some readers may be wondering which function is the “real” binary search. The answer is that it depends on the task. If you want to find the position of the first matching element, then lower_bound is the “real” binary search. If you want to find the position of the last matching element, then it’s upper_bound. If you want to know the entire range of matching elements, STL provides a third function, equal_range. And if all you care about is whether there was a match, there is a function binary_search—but keep in mind that all it’s doing is callinglower_bound and testing whether the dereferenced return value is equal to the item.

The additional functionality of equal_range clearly benefits from the STL convention of using semi-open ranges. Even when the element is not present, the function returns an empty range at the position where it could be inserted.

10.9 Thoughts on the Chapter

We began this chapter by seeing how Aristotle’s levels of abstraction (individual, species, genus) correspond to the programming notions of instance, type, and concept. It is the notion of concept that allows a generic program to work in a variety of settings.

One of your central goals as a programmer should be to identify existing concepts in your application. You will often develop new algorithms, occasionally develop a new data structure, and only rarely define a new concept. In that rare situation, a lot of work is needed to ensure that it is a true concept and not just a collection of unrelated requirements. To restate Occam’s Razor, one should not introduce new concepts without necessity.

We then introduced the concept of iterators, and saw the role they play in some fundamental algorithms. By using compile-time type dispatch on different kinds of iterators, we can ensure that the most efficient implementation gets executed in a given situation.

Finally, we saw the importance of writing not only correct code, but also correct interfaces. An incorrect interface can severely limit the utility of a function; a correct interface allows it to be used in a variety of situations without loss of efficiency.