Abstraction in Mathematics - From Mathematics to Generic Programming (2011)

From Mathematics to Generic Programming (2011)

6. Abstraction in Mathematics

Mathematicians do not study objects, but the relations between objects;
to them it is a matter of indifference if these objects are replaced by others,
provided that the relations do not change. Matter does not
engage their attention, they are interested in form alone.

Poincaré, Science and Hypothesis

The history of mathematics is filled with discoveries of new abstractions: finding ways to solve a more general problem. For example, we saw in Chapter 5 how Euler generalized Fermat’s Little Theorem so it would work with composite numbers as well as primes. Eventually, mathematicians realized that they could generalize beyond numbers, and derive results about abstract entities called algebraic structures—collections of objects that follow certain rules. This led to the development of an entirely new branch of mathematics, abstract algebra. In this chapter, we’ll introduce the first examples of these abstract entities, and prove some of their properties. As we did in the previous chapter, we’re going to put programming aside while we build the foundations we need to derive a generic algorithm in Chapter 7.

6.1 Groups

The first and most important of these algebraic structures, discovered by French mathematician Évariste Galois in 1832, is called a group.

Definition 6.1. A group is a set on which the following are defined:

operations: x Image y, x–1

constant: e

and on which the following axioms hold:


The constant e is the identity element (also sometimes written id), which is often written as 1 in multiplicative contexts. The operation x–1 is the inverse operation; applying the operation to an item and its inverse results in the identity element, as the last axiom shows. The group operation is binary, which simply means that it takes two arguments (it has nothing to do with the binary representation of numbers used in computers). The symbol Image (sometimes written *) can represent any binary operation, as long as it follows the axioms.

We often treat the group operation as multiplication, and even refer to “multiplying” two elements of a group, although what we really mean is applying the group operation, whatever it might be. Just as with multiplication, the symbol for the operation is often dropped; that is, x Image y may be written xy, and x Image x = xx = x2.

The group operation is not necessarily commutative (commutativity means that ∀x, y: x Image y = y Image x). When we want to require commutativity, we need to specify a particular kind of group:

Definition 6.2. An abelian group is a group whose operation is commutative.

One kind of abelian group is the additive group:

Definition 6.3. An additive group is an abelian group where the group operation is addition.

Additive groups are the exception to the naming conventions given earlier. For an additive group, the symbol + is used to represent its operation and 0 its identity element. Even though the name “additive group” says nothing about commutativity, it is by convention assumed to be commutative.

Groups are closed under their operation. This means that if you take any two elements of the group and apply the group operation, the result will itself be a member of the group. Similarly, they are closed under the inverse function: if you take the inverse of any element of the group, the result is still an element of the group.

Some examples of groups follow:

Additive group of integers: the elements are integers and the operation is addition.

Multiplicative group of nonzero remainders modulo 7: the elements are the numbers 1 through 6 and the operation is multiplication modulo 7.

Group of rearrangements of a deck of cards: the elements are permutations of the deck and the operation is composition of these permutations.

Multiplicative group of invertible matrices (those with nonzero determinants) with real coefficients: the elements are matrices and the operation is matrix multiplication.

Group of rotations of the plane: the elements are different rotations around the origin and the operation is composition of these rotations.

Note that integers do not form a multiplicative group, because the multiplicative inverse of most integers is not an integer. In other words, integers are not closed under multiplicative inverse.

Exercise 6.1. How many integers have multiplicative inverses that are integers? What are they?

Let’s look at one of the examples in a bit more detail. In Chapter 5, we looked at the multiplication table for integers modulo 7:


The unique values in the table—the set {1, 2, 3, 4, 5, 6}—are also called “nonzero remainders modulo 7,” and as we noted earlier, these form a multiplicative group. What does that mean? Since it’s a multiplicative group, the group operation is multiplication, and its identity element is 1. We can see from the first row and column of the table that 1 is the identity, because the product of any element x and 1 is x.

Since groups are closed under their operation, if we multiply any two members of the group, we get another member of the group. For example:


Associativity and commutativity of modular multiplication follows from associativity and commutativity of integer multiplication. The commutativity or abelianness of the group is evident by observing that the multiplication table is symmetric with respect to the main diagonal.

Since groups are closed under inverse, if we take the inverse of any member of the group, we get another member of the group. (Recall that the inverse of an element x is the element that produces 1 when multiplied by x. From the multiplication table, you can see the pairs of inverses by looking at the rows and columns whose cells contain 1s.) For example:

2–1 = 4 mod 7

4–1 = 2 mod 7

5–1 = 3 mod 7

Évariste Galois (1811–1832)


The concept of groups started with the work of Évariste Galois, a young French college dropout involved in a revolutionary movement, and the most romantic figure in the history of mathematics.

In the early 19th century, a romantic spirit spread through Europe; young people idolized the English poet Byron, who died fighting for Greek independence, and others who were willing to give their lives for a cause. They remembered Napoleon not as a tyrant, but as a young hero who abolished feudalism throughout Europe.

Paris in the early 1830s was aflame with revolutionary activity. Galois, who was a bohemian hothead, joined the revolutionary movement. As a romantic rebel, Galois did not follow the conventional path through a university education. After failing to be admitted to one school and being expelled from another, he studied mathematics on his own, becoming an expert on Lagrange’s theory of polynomials. He served brief prison sentences for various protest activities, such as marching through the streets in the uniform of a banned national guard unit while carrying several loaded weapons—but kept doing mathematics while in prison.

At age 20, Galois, defending the honor of a woman whom he apparently barely knew, issued a challenge (or was challenged) to a duel. The night before the duel, certain of his impending death, he wrote a long letter to a friend describing his mathematical ideas. This manuscript contained the seeds of the theory of groups, fields, and their automorphisms (mappings onto themselves). These ideas laid the foundations for a major new field of mathematics, abstract algebra. According to mathematician Hermann Weyl, “This letter, if judged by the novelty and profundity of ideas it contains, is perhaps the most substantial piece of writing in the whole literature of mankind.”

The next day, Galois fought the duel and died as a result of his wounds. It is ironic that while he only played at being a revolutionary in politics, he was a true revolutionary in mathematics.

6.2 Monoids and Semigroups

In some situations, we are interested in algebraic structures that have fewer requirements than groups. (We’ll see some of these applications in the next chapter.) For example, there are times when we don’t need to require an inverse operation, but we want to maintain the other properties of a group. This is called a monoid. More formally:

Definition 6.4. A monoid is a set on which the following are defined:

operation : x Image y

constant : e

and on which the following axioms hold:


This definition is literally the same as the one for groups, except we’ve left out the inverse operation and the axiom of cancellation that uses it. As with groups, we can define particular kinds of monoids by specifying the operation, such as an additive monoid (where the operation is addition) and a multiplicative monoid (where the operation is multiplication).

Some examples of monoids follow:

Monoid of finite strings (free monoid): the elements are strings, the operation is concatenation, and the identity element is the empty string.

Multiplicative monoid of integers: the elements are integers, the operation is multiplication, and the identity element is 1.

We can relax the requirements even further by dropping the identity element. This is called a semigroup:

Definition 6.5. A semigroup is a set on which the following is defined:

operation : x Image y

and on which the following axiom holds:


Again, all we’ve done is taken the previous definition and removed something—in this case, the requirement that there be an identity element, and the axiom that uses it. And as we did with groups and monoids, we can define additive semigroups and multiplicative semigroups as semigroups that use those operations.

Some examples of semigroups follow:

Additive semigroup of positive integers: the elements are positive integers and the operation is addition.

Multiplicative semigroup of even integers: the elements are even integers and the operation is multiplication.

As we mentioned earlier, mathematicians write repeated applications of the semigroup, monoid, or group operation with the same conventions as ordinary multiplication. For example:

x Image x Image x = xxx = x3

More formally, we define raising a semigroup to a power as follows:


Exercise 6.2. Why can’t we define power for semigroups starting with n = 0?

Equation 6.1 shows the semigroup operation happening on the left side of the power (i.e., it says xxn–1, not xn–1x). What if we wanted to write the expansion the other way? We can do that, too, as we will now prove:

Lemma 6.1: For n ≥ 2, xxn–1 = xn–1 x.

Proof. We prove this by induction.1

Basis: n = 2. It’s obviously true, because

xx1 = xx = x1x

1 For a refresher on this proof technique, see Appendix B.2.

As the induction hypothesis, we assume the statement is true for n = k – 1:

xx(k – 1)–1 = xxk–2 = xk–2x = x(k–1)–1x

and then derive the result for n = k:



Even though a semigroup guarantees only associativity and not commutativity of its operation, it turns out that powers of a given semigroup element always commute, which we can also prove by generalizing this result. This is perhaps the most important theorem on semigroups.

Theorem 6.1 (Commutativity of Powers): xnxm = xmxn = xn+m

Proof. Proof by induction on m.

Basis: m = 1:


Inductive step: Assume true for m = k. Show for m = k + 1:


We have shown that xnxm = xn+m. Therefore, it is also true that xmxn = xm+n. Since integer addition is commutative, xn+m = xm+n, so xnxm = xmxn.


A semigroup is the weakest interesting algebraic structure. The only requirement left to relax is the associativity axiom. An algebraic structure called magma drops even that axiom, but it turns out not to be very useful. Since there are no axioms left, no theorems can be proved.

6.3 Some Theorems about Groups

Now let’s return to groups and look at some of their properties.

An important observation is that all groups are transformation groups. In other words, every element a of the group G defines a transformation of G onto itself:

x → ax

For example, with the additive group of integers, if we choose a = 5, then this acts as a “+5” operation, transforming the set of elements x to the set x+ 5. These transformations are one-to-one because of our invertibility axiom:

a–1(ax)→ x

In our example, we can undo our +5 transformation by applying the inverse, –5.

Theorem 6.2: A group transformation is a one-to-one correspondence.2

2 A one-to-one correspondence between two sets is a mapping between them that is both one-to-one and onto.

This is equivalent to saying that for any finite set S of elements of group G and any element a of G, a set of elements aS has the same number of elements as S.

Proof. If S = {s1,..., sn}, then aS = {as1, ..., asn}. We know that the set aS can’t contain more unique elements than S, but could it contain fewer? (That would be the case if two of the elements in S were mapped to the same element in aS.) Suppose two elements of aS were the same: asi = asj. Then


So if two results of the transformation ask are equal, then their inputs sk must be equal. Equivalently (by the contrapositive of the previous statement), if the inputs are not equal, then the results of the transformation must not be equal. Since we started with n distinct arguments, we will have n distinct results. In other words, the set aS has the same number of elements as S.


Here are a few more simple results about groups:

Theorem 6.3: There is a unique inverse for every element.

ab = e Image b = a–1

Proof. Suppose ab = e. Then we can multiply both sides by a–1 on the left, like this:



Theorem 6.4: The inverse of a product is the reversed product of the inverses.

(ab)–1 = b–1a–1

Proof. The two expressions are equal if and only if multiplying one by the inverse of the other yields the identity element. We’ll use the inverse of (ab)–1, which by definition is (ab), and multiply it by b–1a–1:



Theorem 6.5: The power of an inverse is the inverse of the power.

(x–1)n = (xn)–1

Proof by induction. Basis: n = 1

(x−1)1 = x−1 = (x1)−1

Inductive step: Assume true for n = k − 1; that is, (x−1)k−1 = (xk−1)−1. Then prove for n = k.

We want to show that (x–1)k = (xk)–1; that is, the inverse of xk is (x–1)k. If that’s true, then when we multiply them together, we should get the identity element. Using the definition of power and the commutativity of powers theorem, we’ll rewrite xk as xk – 1x and (x–1)k as x–1(x–1)k–1, regroup the terms to get some to cancel, and then substitute using our inductive assumption:



Therefore (xn)–1 = (x–1)n.

Exercise 6.3 (very easy). Prove that any group has at least one element.

Definition 6.6. If a group has n > 0 elements, n is called the group’s order. If a group has infinitely many elements, its order is infinite.

There is also the notion of the order of an element in a group:

Definition 6.7. An element a has an order n > 0 if an = e and for any 0 < k < n,ake. (In other words, the order of a is the smallest power of a that produces e.) If such n does not exist, a has an infinite order.

Exercise 6.4 (very easy). What is the order of e? Prove that e is the only element of such order.

* * *

We now come to an important theorem about groups:

Theorem 6.6: Every element of a finite group has finite order.

Proof. If n is an order of the group, then for any element a, {a, a2, a3,..., an+1} has at least one repetition ai and aj. Let us assume that 1 ≤ i < j ≤ n + 1, ai is the first repeated element and aj is its first repetition. Then



and j – i > 0 is the order of a.

This proof uses a version of the pigeonhole principle. (To learn more about the pigeonhole principle and how to use it, see Appendix B.3.) The result guarantees that this simple algorithm for computing the order of an element will terminate: just keep multiplying by itself until you get e.

Exercise 6.5. Prove that if a is an element of order n, then a–1 = an–1.

6.4 Subgroups and Cyclic Groups

Definition 6.8. A subset H of a group G is called a subgroup of G if


In other words, to be a subgroup, H must be a subset and a group. Associativity of the operation on G implies its associativity on H, so we do not need to explicitly state the associativity requirement in the definition; by the same reasoning, we do not need to explicitly restate the identity and cancellation axioms.

For example, the additive group of even numbers is a subgroup of the additive group of integers; so is the additive group of numbers divisible by 5.

Some groups have many subgroups, but almost all groups have at least two: the group itself and the group consisting of just the element e. These two subgroups are called trivial subgroups. (The only group that doesn’t have at least two subgroups is the group that consists of just the identity element.)

Let’s return again to the multiplicative group {1, 2, 3, 4, 5, 6} of nonzero remainders modulo 7 and its multiplication table:


Our group has four multiplicative subgroups:

{1}, {1, 6}, {1, 2, 4}, {1, 2, 3, 4, 5, 6}

How can we tell? To be a subgroup, each one first needs to be a subset of the original group. Each of these is obviously a subset of {1,2, 3, 4, 5, 6}, since all of their elements are contained in the larger set. Next, each subset also contains 1 (the identity element) and is closed under its operation (multiplication modulo 7) and its inverse operation.

For example, consider the set {1, 2, 4}: if we multiply any element of the set by itself or any other member of the set any number of times (mod 7), we will still get a result in the set.

Exercise 6.6. Find orders of every element of:

• The multiplicative group of remainders mod 7

• The multiplicative group of remainders mod 11

* * *

The simplest kind of groups are cyclic groups.

Definition 6.9. A finite group is called cyclic if it has an element a such that for any element b, there is an integer n where

b = an

In other words, every element in the group can be generated by raising one particular element to different powers. Such an element is called a generator of the group; a group may have multiple generators. The additive group of remainders modulo n is an example of a cyclic group.

In our previous example of remainders modulo 7, we can tell that the generators are 3 and 5, because they are not in any nontrivial subgroup of the original group.

Exercise 6.7. Prove that any subgroup of a cyclic group is cyclic.

Exercise 6.8. Prove that a cyclic group is abelian.

Lemma 6.2: Powers of a given element in a finite group form a subgroup.

In other words, every element of a finite group is contained in a cyclic subgroup generated by this element.

Proof. For a set to be a subgroup, it must be a nonempty subset and it must be a group. To be a subset, it must be closed under the group operation. We know it’s closed under the operation because the product of two powers is a power. To be a group, its operation needs to be associative, it must contain the identity element, and it must have an inverse operation. We know the operation is associative, because it’s the same operation from the original group. We know the set of powers of a given element has an identity element, since we showed earlier (Theorem 6.6) that every element of a finite group has finite order.3 And we know that they have an inverse, because for every power ak, an–k is its inverse, where n is the order of a.


3 Recall that part of the definition of an element a having a finite order is that an = e.

6.5 Lagrange’s Theorem

One of the remarkable things about abstract algebra is that we can prove results for structures such as groups without knowing anything about either the specific items in the group or the operation. To see an important example of this, we will start by proving a few simple results about cosets.

Definition 6.10. If G is a group and HG is a subgroup of G, then for any a Image G the (left) coset of a by H is a set

aH = {g Image G | ∃h Image H: g = ah}

In other words, a coset aH is a set of all elements in G obtainable by multiplying elements of H by a.

As an example, consider the additive group4 of integers Image and its subgroup, integers divisible by 4, 4Image. (The use of Image as a symbol for integers comes from the German word Zahlen, which means “numbers.”) It has four distinct cosets: 4n, 4n + 1, 4n + 2, and 4n + 3. Adding other integers will result only in values that are already in one of these cosets; for example, the coset 4n + 5 will contain the same elements as the coset 4n + 1. (The left and right cosets are the same, since integer addition is commutative.)

4 Remember, in an additive group, the role of group “multiplication” is played by addition. So the coset aH consists of elements of G obtainable by adding a to elements of H.

Lemma 6.3 (Size of Cosets): In a finite group G, for any of its subgroups H, the number of elements in a coset aH is the same as the number of elements in the subgroup H.

Proof. We already proved the one-to-one correspondence of transformations aS when S is a subset of G. Since a subgroup is by definition also a subset, then we know the mapping from H to aH is also a one-to-one correspondence. If there is a one-to-one correspondence between two finite sets, they are the same size.


Lemma 6.4 (Complete Coverage by Cosets): Every element a of a group G belongs to some coset of subgroup H.

Proof. a Image aH. That is, every element a belongs to the coset aH generated by itself, because H, being a subgroup, contains the identity element.


Lemma 6.5 (Cosets are either disjoint or identical): If two cosets aH and bH in a group G have a common element c, then aH = bH.

Proof. Suppose the common element c is aha in one coset and bhb in the other coset.

aha = bhb

Multiplying both sides on the right by Image, we get


The term on the right is b times something from H (we know it’s from H because hb is from H and Image is from H, and since H is a subgroup, it is closed under multiplication). Now let’s multiply both sides on the right by x, an arbitrary element from H:


We know by definition that ax is in the coset aH. We also know that the term on the right is b times something from H, so it’s also in the coset bH. Since we can do this for any x in H, aHbH. We can then repeat the process from the beginning, this time using Image instead of Image, to show that bHaH. So bH = aH.


With these results, we can state an important theorem in group theory, which illustrates the power of abstract reasoning. If you want to learn one theorem in group theory, this is the one. While it’s very simple to state, it provides the foundation for the theory of finite groups.

Theorem 6.7 (Lagrange’s Theorem): The order of a subgroup H of a finite group G divides the order of the group.


1. The group G is covered by cosets of H. (Proved earlier, Lemma 6.4)

2. Different cosets are disjoint. (Proved earlier, Lemma 6.5)

3. They are of the same size n, where n is the order of H. (Proved earlier, Lemma 6.3. Specifically, the result says that |aH| = |H| for any a, so every coset has size |H| and therefore the same size as every other coset.)

Therefore the order of G is nm, where m is the number of distinct cosets, which means that the order of G is a multiple of the order of H. In other words, the order of H divides the order of G.


As an example, suppose a group G has two distinct cosets of its subgroup H. Then every element of G must be covered by one or the other (but not both) of those cosets, so the order of H must be |G|/2.

Interestingly, the converse of Lagrange’s Theorem is not true: in a group of order n, not every divisor of n will have a subgroup of corresponding order.

Joseph-Louis Lagrange (1736–1813)


The leading mathematician in Europe at the end of the 18th century was Joseph-Louis Lagrange, who was a successor of Leonhard Euler both in terms of intellectual leadership and as holder of a position as director of mathematics at the Prussian Academy of Sciences in Berlin.

Lagrange was born Giuseppe Luigi Lagrancia in Turin, in the northern Italian region of Piedmont. As his original name suggests, he grew up speaking Italian, although his family claimed French ancestry. Lagrange discovered mathematics largely on his own while a student in Turin, and within a few years became an instructor and began publishing his work.

Sometime around age 20, he began corresponding with Euler, who was in Berlin at the time. Impressed by Lagrange’s work, Euler acted as a mentor to the younger mathematician, encouraging him and promoting his discoveries. Euler began lobbying to bring Lagrange to Berlin, but by the time this plan came to fruition in 1766, Euler had moved back to Russia. With Euler gone, Lagrange was appointed to his mentor’s former position, and soon established himself as the second most prominent mathematician in Europe.

Lagrange spent the next 20 years in Berlin, and did his most important work during this period, contributing to several areas of mathematics and physics. His book Analytical Mechanics, perhaps one of the 10 most important books in the history of mathematics, described a top-down approach to solving mechanical problems that was more general than using Newtonian mechanics. Much of modern physics relies on Lagrange’s work. He also did extensive work on polynomial equations, realizing that coefficients could be expressed as functions of a polynomial’s roots, and laid the groundwork for a later breakthrough by Galois. And in number theory, Lagrange figured out when continued fractions will be periodic, complementing Euler’s earlier work on the subject.

After the death of his patron and friend, King Friedrich II of Prussia, Lagrange was approached by the French ambassador (acting on orders from King Louis XVI) to convince the great mathematician to move to France. Lagrange agreed, and would live in Paris from 1786 until his death.

Despite his prominence, Lagrange was a shy, unpretentious man with few friends and little social life. He was prone to bouts of depression, and made little progress on his work for years at a time, particularly during his first years in France. However, the French Revolution revived his interest—although he often feared he would need to flee the country due to the revolutionaries’ expulsion of most foreigners. He became involved in efforts to come up with a new system of weights and measures, and was one of a committee of five prominent scientists who voted to approve what we now call the metric system. Lagrange also began teaching again, although by many accounts he was not popular with students, who had trouble understanding both his ideas and his Italian accent.

In his later years, Lagrange was favored by the new emperor Napoleon Bonaparte—himself quite an accomplished mathematician. Napoleon recognized Lagrange’s genius, and made him a Count of the Empire.

* * *

Now we’ll prove a couple of corollaries to Lagrange’s Theorem:

Corollary 6.7.1: The order of any element in a finite group divides the order of the group.

Proof. The powers of an element of G form a subgroup of G. Since the order of an element is the order of the subgroup, and since the order of the subgroup must divide the order of the group, then the order of the element must divide the order of the group.


(Reminder: The order of an element is equal to the order of the cyclic group of its powers.)

Corollary 6.7.2: Given a group G of order n, if a is an element of G, then an = e.

Proof. If a has an order m, then m divides n (by the previous corollary), so n = qm. am = e (by definition of order of an element). Therefore (am)q = e and an = e.


Note that this doesn’t say that the order of a is n; it could be smaller.

Lagrange’s Theorem lets us easily prove some important results from Chapter 5 in much simpler fashion than we did the first time:

Fermat’s Little Theorem: If p is prime, ap–1 – 1 is divisible by p for any 0 < a < p.

Proof. Let us take the multiplicative group of remainders modulo p. It contains p – 1 nonzero remainders. Since p – 1 is the order of the group, it follows immediately from Corollary 6.7.2 that

ap–1 = e

Since the identity element for a multiplicative group is 1 (specifically, 1 mod p in our group of remainders), we have

ap–1 = 1 mod p

ap–1 – 1 = 0 mod p

which is what it means to be divisible by p.


Euler’s Theorem: For 0 < a < n, where a and n are coprime, aφ(n) – 1 is divisible by n.

Proof. Let us take the multiplicative group of invertible remainders modulo n. Since φ(n) is by definition the number of coprimes, and every coprime is invertible, φ(n) gives us the order of the group. It follows immediately from Corollary 6.7.2 that

aφ(n) = e

aφ(n) = 1 mod n

or equivalently

aφ(n) – 1 = 0 mod n


The logic is exactly the same as the previous proof.

Exercise 6.9 (very easy). What are subgroups of a group of order 101?

Exercise 6.10. Prove that every group of prime order is cyclic.

6.6 Theories and Models

Groups, monoids, and semigroups are examples of what mathematicians call theories. People use the word “theory” in many ways, often meaning something like “conjecture”—that is, an unproven explanation. But to mathematicians, “theory” has a very specific meaning, which does not include this sense of being unproven.

Definition 6.11. A theory is a set of true propositions.

From now on, when we use the word “theory,” we’ll mean this specific mathematical sense. Here are some important facts about theories:

• A theory can be generated by a set of axioms plus a set of inference rules.

• A theory is finitely axiomatizable if it can be generated from a finite set of axioms.

• A set of axioms is independent if removing one will decrease the set of true propositions.

• A theory is complete if for any proposition, either that proposition or its negation is in the theory.

• A theory is consistent if for no proposition it contains both that proposition and its negation.

Let’s look at a specific example: the notion of groups that we’ve been discussing throughout the chapter. A group is a theory in the sense we’ve just defined—specifically, a theory that, given operations x Image y and x–1 and the identity element e, has these axioms:

x Image (y Image z) = (x Image y) Image z

x Image e = e Image x = x

x Image x–1 = x–1 Image x = e

Starting with these, we can derive any number of true propositions (theorems), such as

x Image y = x Image y = e

(x Image y)–1 = y–1 Image x–1

We aren’t enumerating all the true propositions that constitute the theory of groups. Rather, we’re generating the propositions by deriving them from the axioms and from previously proven propositions. For example, we can prove the first theorem by multiplying both sides of the equation by x–1. Like basis vectors in linear algebra, axioms form a basis for the theory. Also like the linear algebra case, we can have more than one basis for the same theory.

* * *

Closely associated with the notion of a theory is that of a model. Again, the mathematical meaning is quite different from the everyday meaning:

Definition 6.12. A set of elements where all the operations in the theory are defined and all the propositions in the theory are true is called its model.

In a sense, a model is a particular implementation of a theory. A model gives you the specific set of elements; a theory does not. Just as there can be multiple implementations of, say, an algorithm, so there can be multiple models of a theory. For example, the additive group of integers and the multiplicative group of nonzero remainders modulo 7 are both models of the theory of abelian groups.

The more5 propositions there are in a theory, the fewer different models there are. If we generate propositions from axioms and inference rules, then fewer axioms means fewer propositions, and therefore more models. This makes sense intuitively: axioms and propositions are constraints on a theory; the more of them you have, the harder it is to satisfy all of them, so the fewer models there will be that do so.

5 The notion of “more” that we’re interested in here is that of additional. If the set of propositions for theory A contains all the propositions of theory B, plus some additional ones, then we say A has more, even if both have a countably infinite number of propositions.

Conversely, the more models there are for a theory, the fewer propositions there are. If there are more ways to do something, there must be fewer constraints on how you can do it.

Definition 6.13. Two models are isomorphic if there is a one-to-one correspondence between them that preserves their operations. This means we can apply the mapping (or its inverse) before or after the operation and we’ll get the same result.


For example, we can map natural numbers to even natural numbers with the mapping “multiply by 2,” and use addition as our operation. If we add two natural numbers and then apply the mapping (i.e., multiply by 2), we get the same result as if we first multiply by 2 and then add the numbers:


An isomorphism of a model with itself is called an automorphism.

Definition 6.14. A (consistent) theory is called categorical or univalent if all of its models are isomorphic.6

6 This is the original definition by Oswald Veblen. Modern logicians talk about κ-categorical theories: all models of the cardinality κ are isomorphic. Full treatment of modern model theory is well outside the scope of this book.

An inconsistent theory is one that has no models—there’s no way to satisfy all the propositions without a contradiction.

Categorical Theories versus STL

For a long time, people believed that only categorical theories were good for programming. When the C++ Standard Template Library (STL) was first proposed, some computer scientists opposed it on the grounds that many of its fundamental concepts, such as Iterator, were underspecified. In fact, it is this underspecification that gives the library its generality. While linked lists and arrays are not computationally isomorphic, many STL algorithms are defined on input iterators and work on both data structures. If you can get by with fewer axioms, you allow for a wider range of implementations.

6.7 Examples of Categorical and Non-categorical Theories

Let’s look at an example of a categorical theory with two isomorphic models: cyclic groups of order 4. The first model is Image, the additive group of remainders modulo 4 (consisting of {0, 1, 2,3}); the second model is (Image), the multiplicative group of nonzero remainders modulo 5 (consisting of {1, 2, 3, 4}). These groups have the following “multiplication” tables:


Even though the numbers are different, these two models are isomorphic; we could map elements of one to elements of the other. In theory, there are 4! = 12 possible mappings—0 in the first model could map to 1, 2, 3, or 4 in the second, then 1 could map to the three remaining choices, and so on. But in this case, the actual number of possibilities is much smaller.

We observe that in the first model, the values 1 and 3 are generators of the group—we could start with either of them, raise it to a power using the group operation, and get all the other elements. In the second model, 2 and 3 are generators. This helps us narrow the choices: a generator from one group has to map to a generator from the other group, which in this case gives us two different mappings. For example, we can say, “the role of 1 in the first model is played by 2 in the second model, and 3 in the first model corresponds to 3 in the second.”

What about the other two values? We know from the multiplication tables that 0 in the first model is the identity element, a role played by 1 in the second model. Finally, we know that 2 in the first model maps to 4 in the second model, because in both cases it is the only non-identity element that gives identity when applied to itself.

So we have two possible mappings:


How do we know these mappings produce our second model? One way is to see if we can use them to transform the Image multiplication table into the (Image) multiplication table. Let’s try it using the second mapping.

First, we replace the values from the Image table with the mapped values. That gives us:


Then we permute the rows and columns so the headers are in the right order. We’ll start by swapping the last two columns and the last two rows (shaded in the preceding table):


Finally, we’ll swap the middle two rows and columns:


This is exactly the multiplication table for (Image) we showed on p. 105, which is what we wanted.

* * *

Now let’s look at an example of a non-categorical theory: all groups of order 4. While there is only one non-isomorphic group of order 1, 2, or 3, there are two non-isomorphic groups of order 4: the cyclic group Image, which we just described, and a group called the Klein group. There are two important models of the Klein group: the multiplicative group of units modulo 8 (consisting of {1, 3, 5, 7}) and the group of isometries transforming a rectangle into itself (the identity transform, vertical symmetry, horizontal symmetry, and 180° rotation).

These two kinds of groups have the following multiplication tables. Since we don’t know the individual elements of a theory, this time we’ll write them using e to mean the identity element and a,b, and c to mean their other elements:


In the table on the left, addition is our operation and we can think of “e” as being 0 (the additive identity) and a, b, and c as representing the integers 1, 2, and 3. So, for example, a Image b = 1 + 2 = 3 = c; therefore the value at row a and column b is c.

Are these two groups really different (i.e., not isomorphic), or is there a way to transform one into the other, as we did before? In other words, is there a distinguishing proposition for groups of order 4? Yes, the proposition

x Image G : x2 = e

is true for the Klein group but false for Image. We can see this by looking at the diagonal of the multiplication table. Another way to see that they are different is to notice that the cyclic group contains two generators, while the Klein group contains none.

6.8 Thoughts on the Chapter

In this chapter we introduced the idea of algebraic structures, abstract sets of elements that obey certain properties. We looked at groups, the most important of these structures, and their weaker cousins monoids and semigroups. These are summarized in the following table, which we’ll be adding to later:


Each row of the table includes all the properties of the previous row, with one or more additions. We can view the relationships between the structures like this:


For example, the diagram shows that a monoid is a semigroup that also has an identity element (and the identity axiom).

A few additional structures we talked about are most easily defined in terms of others:


We also saw how we could derive properties of groups (like the one stated by Lagrange’s Theorem) without knowing anything about the particular elements being manipulated. In other words, we saw how to derive results about theories without specifying a particular model. Now we’re ready to put these algebraic structures to practical use.