More Advanced Data Structures - THE RUBY WAY, Third Edition (2015)

THE RUBY WAY, Third Edition (2015)

Chapter 9. More Advanced Data Structures

A graphic representation of data abstracted from the banks of every computer in the human system. Unthinkable complexity. Lines of light ranged in the nonspace of the mind, clusters and constellations of data. Like city lights, receding.

—William Gibson

There are, of course, more complex and interesting data structures than arrays, hashes, and their cousins. Some of the ones we’ll look at here have direct or indirect support in Ruby; others are “roll-your-own” for the most part. Fortunately, Ruby simplifies the construction of custom data structures.

The mathematical set can be dealt with by means of arrays, as we’ve already seen. But recent versions of Ruby have a Set class that serves this purpose well.

Stacks and queues are two common data structures in computer science. I have outlined a few of their general uses here. For more detail, there are many first-year computer science books with extensive explanations.

Trees are useful from the perspective of sorting, searching, and simply representing hierarchical data. We cover binary trees here, with a few notes about multiway trees.

The generalization of a tree is a graph, which is simply a collection of nodes joined by edges that may have weights or directions associated with them. These are useful in many different areas of problem solving, such as networking and knowledge engineering.

Since sets are the easiest topic, we’ll look at them first.

9.1 Working with Sets

We’ve already seen how certain methods of the Array class let it serve as an acceptable representation of a mathematical set. But for a little more rigor and a little tighter coding, Ruby has a Set class that hides more of the detail from the programmer.

A simple require makes the Set class available:

require 'set'

This also adds a to_set method to Enumerable so that any enumerable object can be converted to a set.

Creating a new set is easy. The [] method works much as it does for hashes. The new method takes an optional enumerable object and an optional block. If the block is specified, it is used as a kind of “preprocessor” for the list (like a map operation):

s1 = Set[3,4,5] # {3,4,5} in math notation
arr = [3,4,5]
s2 = Set.new(arr) # same
s3 = Set.new(arr) {|x| x.to_s } # set of strings, not numbers

9.1.1 Simple Set Operations

Union is accomplished by the union method (aliases are | and +):

x = Set[1,2,3]
y = Set[3,4,5]

a = x.union(y) # Set[1,2,3,4,5]
b = x | y # same
c = x + y # same

Intersection is done by intersection (aliased as &):

x = Set[1,2,3]
y = Set[3,4,5]
a = x.intersection(y) # Set[3]
b = x & y # same

The binary set operators don’t have to have a set on the right side. Any enumerable will work, producing a set as a result.

The unary minus is set difference, as we saw in the array discussion (refer to Section 8.1.9, “Using Arrays as Mathematical Sets”):

diff = Set[1,2,3] - Set[3,4,5] # Set[1,2]

Membership is tested with member? or include?, as with arrays. Remember the operands are “backwards” from mathematics.

Set[1,2,3].include?(2) # true
Set[1,2,3].include?(4) # false
Set[1,2,3].member?(4) # false

We can test for the null or empty set with empty?, as we would an array. The clear method will empty a set regardless of its current contents.

s = Set[1,2,3,4,5,6]
s.empty? # false
s.clear
s.empty? # true

We can test the relationship of two sets: Is the receiver a subset of the other set? Is it a proper subset? Is it a superset?

x = Set[3,4,5]
y = Set[3,4]

x.subset?(y) # false
y.subset?(x) # true
y.proper_subset?(x) # true
x.subset?(x) # true
x.proper_subset?(x) # false
x.superset?(y) # true

The add method (alias <<) adds a single item to a set, normally returning its own value; add? returns nil if the item was already there. The merge method is useful for adding several items at once. All these potentially modify the receiver, of course. The replace method acts as it does for a string or array.

Finally, two sets can be tested for equality in an intuitive way:

Set[3,4,5] == Set[5,4,3] # true

9.1.2 More Advanced Set Operations

It’s possible, of course, to iterate over a set, but (as with hashes) do not expect a sensible ordering because sets are inherently unordered, and Ruby does not guarantee a sequence. (You may even get consistent, unsurprising results at times, but it is unwise to depend on that fact.)

s = Set[1,2,3,4,5]
puts s.each.first # may output any set member

The classify method is like a multiway partition method; in other words, it is the rough equivalent of the Enumerable method called group_by.

files = Set.new(Dir["*"])
hash = files.classify do |f|
if File.size(f) <= 10_000
:small
elsif File.size(f) <= 10_000_000
:medium
else
:large
end
end

big_files = hash[:large] # big_files is a Set

The divide method is similar, but it calls the block to determine “commonality” of items, and it results in a set of sets.

If the arity of the block is 1, it will perform calls of the form block.call(a) == block.call(b) to determine whether a and b belong together in a subset. If the arity is 2, it will perform calls of the form block.call(a,b)to determine whether these two items belong together.

For example, the following block (with arity 1) divides the set into two sets, one containing the even numbers and one containing the odd ones:

require 'set'
numbers = Set[1,2,3,4,5,6,7,8,9,0]
set = numbers.divide{|i| i % 2}
p set # #<Set: {#<Set: {1, 3, 5, 7, 9}>, #<Set: {2, 4, 6, 8, 0}>}>

Here’s another contrived example. Twin primes are prime numbers that differ by 2 (such as 11 and 13); singleton primes are the others (such as 23). The following example separates these two groups, putting pairs of twin primes in the same set with each other. This example uses a block with arity 2:

primes = Set[2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31]
set = primes.divide{|i,j| (i-j).abs == 2}
# set is #<Set: { #<Set: {2}>, #<Set: {3, 5, 7}>, #<Set: {11, 13}>,
# #<Set: {17, 19}>, #<Set: {23}>, #<Set: {29, 31}> }>

As I said in the previous section, it’s important to realize that the Set class doesn’t always insist that a parameter or operand has to be another set. (Refer to the discussion of duck typing in Chapter 1, “Ruby in Review,” if this confuses you.) In fact, most of these methods will take any enumerable object as an operand. Consider this a feature.

Other incidental methods can be applied to sets (including all methods in Enumerable). I choose not to cover things such as flatten here; for more information, consult ruby-doc.org or any other Ruby API reference.

9.2 Working with Stacks and Queues

Stacks and queues are the first entities we have discussed that are not strictly built in to Ruby. By this we mean that Ruby does not have Stack and Queue classes as it does Array and Hash classes (except for the Queue class in thread.rb, which we’ll examine later).

And yet, in a way, they are built in to Ruby after all. In fact, the Array class implements all the functionality we need to treat an array as a stack or a queue. We’ll see this in detail shortly.

Although it is possible to use an Array as a stack or a queue, the Array class in Ruby is not thread-safe. This means that any array is vulnerable to multiple threads modifying it simultaneously and thus corrupting its data. When sharing objects between threads, be sure to use the Queueclass, immutable data objects, or some other thread-safe data structure.

A stack is a last-in, first-out (LIFO) data structure. The traditional everyday example is a stack of cafeteria trays on its spring-loaded platform; trays are added at the top and also taken away from the top.

A limited set of operations can be performed on a stack. These include push and pop (to add and remove items) at the very least; usually there is a way to test for an empty stack, and there may be a way to examine the top element without removing it. A stack implementation never provides a way to examine an item in the middle of the stack.

You might ask how an array can implement a stack given that array elements may be accessed randomly, and stack elements may not. The answer is simple. A stack sits at a higher level of abstraction than an array; it is a stack only so long as you treat it as one. The moment you access an element illegally, it ceases to be a stack.

Of course, you can easily define a Stack class so that elements can only be accessed legally. We will show how this is done.

It is worth noting that many algorithms that use a stack also have elegant recursive solutions. The reason for this becomes clear with a moment’s reflection. Function or method calls result in data being pushed onto the system stack, and this data is popped upon return. Thus, a recursive algorithm simply trades an explicit user-defined stack for the implicit system-level stack. Which is better? That depends on how you value readability, efficiency, and other considerations.

A queue is a first-in, first-out (FIFO) data structure. It is analogous to a group of people standing in line at (for example) a movie theater. Newcomers go to the end of the line, while those who have waited the longest are the next served. In most areas of programming, queues are probably used less often than stacks.

Queues are useful in more real-time environments where entities are processed as they are presented to the system. They are useful in producer-consumer situations (especially where threads or multitasking are involved). A printer queue is a good example; print jobs are added to one end of the queue, and they “stand in line” until they are removed at the other end.

The two basic queue operations are usually called enqueue and dequeue in the literature. The corresponding instance methods in the Array class are called unpush and shift, respectively.

Note that unshift could serve as a companion for shift in implementing a stack, not a queue, because unshift adds to the same end from which shift removes. Various combinations of these methods could implement stacks and queues, but we will not concern ourselves with all the variations.

That ends our introductory discussion of stacks and queues. Now let’s look at some examples.

9.2.1 Implementing a Stricter Stack

We promised earlier to show how a stack could be made “idiot proof” against illegal access. We may as well do that now. Here is a simple class that has an internal array and manages access to that array. (There are other ways of doing this—by delegating, for example—but what we show here is simple and works fine.)

class Stack

def initialize
@store = []
end

def push(x)
@store.push x
end

def pop
@store.pop
end

def peek
@store.last
end

def empty?
@store.empty?
end

end

We have added one more operation that is not defined for arrays; peek simply examines the top of the stack and returns a result without disturbing the stack.

Some of the rest of our examples assume this class definition.

9.2.2 Detecting Unbalanced Punctuation in Expressions

Because of the nature of grouped expressions such as parentheses and brackets, their validity can be checked using a stack. For every level of nesting in the expression, the stack will grow one level higher; when we find closing symbols, we can pop the corresponding symbol off the stack. If the symbol does not correspond as expected, or if there are symbols left on the stack at the end, we know the expression is not well formed.

def paren_match(str)
stack = Stack.new
lsym = "{[(<"
rsym = "}])>"
str.each_char do |sym|
if lsym.include? sym
stack.push(sym)
elsif rsym.include? sym
top = stack.peek
if lsym.index(top) != rsym.index(sym)
return false
else
stack.pop
end
# Ignore non-grouped characters...
end
end
# Ensure stack is empty...
return stack.empty?
end

str1 = "(((a+b))*((c-d)-(e*f))"
str2 = "[[(a-(b-c))], [[x,y]]]"

paren_match str1 # false
paren_match str2 # true

The nested nature of this problem makes it natural for a stack-oriented solution. A slightly more complex example would be the detection of unbalanced tags in HTML and XML. The tokens are multiple characters rather than single characters, but the structure and logic of the problem remain exactly the same. Some other common stack-oriented problems are conversion of infix expressions to postfix form (or vice versa), evaluation of a postfix expression (as is done in Java VMs and many other interpreters), and in general any problem that has a recursive solution. In the next section, we’ll take a short look at the relationship between stacks and recursion.

9.2.3 Understanding Stacks and Recursion

As an example of the isomorphism between stack-oriented algorithms and recursive algorithms, we will take a look at the classic “Towers of Hanoi” problem.

According to legend, there is an ancient temple somewhere in the Far East, where monks have the sole task of moving disks from one pole to another while obeying certain rules about the moves they can make. There were originally 64 disks on the first pole; when they finish, the world will come to an end.

As an aside, we like to dispel myths when we can. It seems that in reality, this puzzle originated with the French mathematician Edouard Lucas in 1883 and has no actual basis in eastern culture. What’s more, Lucas himself named the puzzle the “Tower of Hanoi” (in the singular).

So if you were worried about the world ending... don’t worry on that account. And anyway, 64 disks would take 264–1 moves. A few minutes with a calculator reveals that those monks would be busy for millions of years.

But on to the rules of the game. (Probably every first-year computer science student in the world has seen this puzzle.) We have a pole with a certain number of disks stacked on it; call this the source pole. We want to move all of these to the destination pole, using a third (auxiliary) pole as an intermediate resting place. The catch is that you can only move one disk at a time, and you cannot ever place a larger disk onto a smaller one.

The following example uses a stack to solve the problem. We use only three disks because 64 would occupy a computer for centuries:

def towers(list)
while !list.empty?
n, src, dst, aux = list.pop
if n == 1
puts "Move disk from #{src} to #{dst}"
else
list.push [n-1, aux, dst, src]
list.push [1, src, dst, aux]
list.push [n-1, src, aux, dst]
end
end
end


list = []
list.push([3, "a", "c", "b"])

towers(list)

The output produced is shown here:

Move disk from a to c
Move disk from a to b
Move disk from c to b
Move disk from a to c
Move disk from b to a
Move disk from b to c
Move disk from a to c

Of course, the classic solution to this problem is recursive. And as we already pointed out, the close relationship between the two algorithms is no surprise because recursion implies the use of an invisible system-level stack.

def towers(n, src, dst, aux)
if n==1
puts "Move disk from #{src} to #{dst}"
else
towers(n-1, src, aux, dst)
towers(1, src, dst, aux)
towers(n-1, aux, dst, src)
end
end

towers(3, "a", "c", "b")

The output produced here is the same. And it may interest you to know that we tried commenting out the output statements and comparing the runtimes of these two methods. Don’t tell anyone, but the recursive version is five times as fast.

9.2.4 Implementing a Stricter Queue

We define a queue here in much the same way we defined a stack earlier. If you want to protect yourself from accessing such a data structure in an illegal way, we recommend this practice.

class Queue

def initialize
@store = []
end

def enqueue(x)
@store << x
end

def dequeue
@store.shift
end

def peek
@store.first
end

def length
@store.length
end

def empty?
@store.empty?
end

end

As mentioned, the Queue class in the thread library is needed in threaded code. It is accompanied by a SizedQueue variant that is also thread-safe.

Those queues use the method names enq and deq rather than the longer names shown in the preceding example. They also allow the names push and pop, which seems misleading to me. The data structure is FIFO, not LIFO; that is, it is a true queue and not a stack.

If you need a thread-safe Stack class, I recommend you take the Queue class as a starting point. It should be a relatively quick and easy class to create.

9.3 Working with Trees

I think that I shall never see
A poem as lovely as a tree...

—“Trees,” [Alfred] Joyce Kilmer

Trees in computer science are a relatively intuitive concept (except that they are usually drawn with the “root” at the top and the “leaves” at the bottom). This is because we are familiar with so many kinds of hierarchical data in everyday life—from the family tree, to the corporate organization chart, to the directory structures on our hard drives.

The terminology of trees is rich but easy to understand. Any item in a tree is a node ; the first or topmost node is the root. A node may have descendants that are below it, and the immediate descendants are called children. Conversely, a node may also have a parent (only one) andancestors. A node with no child nodes is called a leaf. A subtree consists of a node and all its descendants. To travel through a tree (for example, to print it out) is called traversing the tree.

We will look mostly at binary trees, though in practice a node can have any number of children. We will discuss how to create a tree, populate it, and traverse it; and we will look at a few real-life tasks that use trees.

We should mention here that in many languages, such as C and Pascal, trees are implemented using true address pointers. But in Ruby (as in Java, for instance), we don’t use pointers; object references work just as well or better.

9.3.1 Implementing a Binary Tree

There is more than one way to implement a binary tree in Ruby. For example, we could use an array to store the values. Here, we use a more traditional approach, coding much as we would in C, except that pointers are replaced with object references.

What is required to describe a binary tree? Well, each node needs an attribute of some kind for storing a piece of data. Each node also needs a pair of attributes for referring to the left and right subtrees under that node.

We also need a way to insert into the tree and a way of getting information out of the tree. A pair of methods will serve these purposes.

The first tree we’ll look at implements these methods in a slightly unorthodox way. Then we will expand on the Tree class in later examples.

A tree is in a sense defined by its insertion algorithm and by how it is traversed. In this first example, shown in Listing 9.1, we define an insert method that inserts in a breadth-first fashion—that is, top to bottom and left to right. This guarantees that the tree grows in depth relatively slowly and that the tree is always balanced. Corresponding to the insert method, the traverse iterator will iterate over the tree in the same breadth-first order.

Listing 9.1 Breadth-First Insertion and Traversal in a Tree


class Tree

attr_accessor :left
attr_accessor :right
attr_accessor :data

def initialize(x=nil)
@left = nil
@right = nil
@data = x
end

def insert(x)
list = []
if @data == nil
@data = x
elsif @left == nil
@left = Tree.new(x)
elsif @right == nil
@right = Tree.new(x)
else
list << @left
list << @right
loop do
node = list.shift
if node.left == nil
node.insert(x)
break
else
list << node.left
end
if node.right == nil
node.insert(x)
break
else
list << node.right
end
end
end
end

def traverse()
list = []
yield @data
list << @left if @left != nil
list << @right if @right != nil
loop do
break if list.empty?
node = list.shift
yield node.data
list << node.left if node.left != nil
list << node.right if node.right != nil
end
end

end


items = [1, 2, 3, 4, 5, 6, 7]

tree = Tree.new

items.each {|x| tree.insert(x)}

tree.traverse {|x| print "#{x} "}
puts

# Prints "1 2 3 4 5 6 7 "


This kind of tree, as defined by its insertion and traversal algorithms, is not especially interesting. It does, however, serve as an introduction and something on which we can build.

9.3.2 Sorting Using a Binary Tree

For random data, a binary tree is a good way to sort. (Although in the case of already sorted data, it degenerates into a simple linked list.) The reason, of course, is that with each comparison, we are eliminating half the remaining alternatives as to where we should place a new node.

Although it might be fairly rare to implement this nowadays, it can’t hurt to know how to do it. The code in Listing 9.2 builds on the previous example.

Listing 9.2 Sorting with a Binary Tree


class Tree

# Assumes definitions from
# previous example...

def insert(x)
if @data == nil
@data = x
elsif x <= @data
if @left == nil
@left = Tree.new x
else
@left.insert x
end
else
if @right == nil
@right = Tree.new x
else
@right.insert x
end
end
end

def inorder()
@left.inorder {|y| yield y} if @left != nil
yield @data
@right.inorder {|y| yield y} if @right != nil
end

def preorder()
yield @data
@left.preorder {|y| yield y} if @left != nil
@right.preorder {|y| yield y} if @right != nil
end

def postorder()
@left.postorder {|y| yield y} if @left != nil
@right.postorder {|y| yield y} if @right != nil
yield @data
end

end

items = [50, 20, 80, 10, 30, 70, 90, 5, 14,
28, 41, 66, 75, 88, 96]

tree = Tree.new

items.each {|x| tree.insert(x)}

tree.inorder {|x| print x, " "}
puts
tree.preorder {|x| print x, " "}
puts
tree.postorder {|x| print x, " "}
puts


# Output:
# 5 10 14 20 28 30 41 50 66 70 75 80 88 90 96
# 50 20 10 5 14 30 28 41 80 70 66 75 90 88 96
# 5 14 10 28 41 30 20 66 75 70 88 96 90 80 50


9.3.3 Using a Binary Tree as a Lookup Table

Suppose we have a tree already sorted. Traditionally this has made a good lookup table; for example, a balanced tree of a million items would take no more than 20 comparisons (the depth of the tree or log base 2 of the number of nodes) to find a specific node. For this to be useful, we assume that the data for each node is not just a single value but has a key value and other information associated with it.

In most if not all situations, a hash or even an external database table will be preferable. But we can implement such a search very simply:

class Tree

# Assumes definitions
# from previous example...

def search(x)
if self.data == x
return self
elsif x < self.data
return left ? left.search(x) : nil
else
return right ? right.search(x) : nil
end
end

end

keys = [50, 20, 80, 10, 30, 70, 90, 5, 14,
28, 41, 66, 75, 88, 96]

tree = Tree.new

keys.each {|x| tree.insert(x)}

s1 = tree.search(75) # Returns a reference to the node
# containing 75...

s2 = tree.search(100) # Returns nil (not found)

9.3.4 Converting a Tree to a String or Array

The same old tricks that allow us to traverse a tree will allow us to convert it to a string or array if we want. Here we assume an in-order traversal, though any other kind could be used:

class Tree

# Assumes definitions from
# previous example...

def to_s
str = "["
str << left.to_s << ", " if left
str << data.inspect
str << ", " << right.to_s if right
str << "]"
end

def to_a
array = []
array << left.to_a if left
array << data
array << right.to_a if right
array
end

end

items = %w[bongo grimace monoid jewel plover nexus synergy]

tree = Tree.new
items.each {|x| tree.insert x}
tree.to_a * ", "
# "bongo, grimace, jewel, monoid, nexus, plover, synergy"
tree.to_a
# ["bongo", ["grimace", [["jewel"], "monoid", [["nexus"],
# "plover", ["synergy"]]]]]

Note that the resulting array is as deeply nested as the depth of the tree from which it came. You can, of course, use flatten to produce a non-nested array.

9.4 Working with Graphs

A graph is a collection of nodes that interconnect with each other arbitrarily. (A tree is a special case of a graph.) We will not delve deeply into the subject of graphs because the theory and terminology can have a steep learning curve. Before long, we would find ourselves wandering out of the field of computer science entirely and into the province of mathematicians.

Yet, graphs do have many practical applications. Consider any ordinary highway map with highways connecting cities, or consider a circuit diagram. These are both best represented as graphs. A computer network can be thought of in terms of graph theory, whether it is a LAN of a dozen systems or the Internet itself with its countless millions of nodes.

When we say graph, we usually mean an undirected graph. In naive terms, this is a graph in which the connecting lines don't have arrows; two nodes are either connected or they are not. By contrast, a directed graph or digraph can have “one-way streets”; just because node x is connected to node y doesn’t mean that the reverse is true. (A node is also commonly called a vertex.) Finally, a weighted graph has connections (or edges) that have weights associated with them; these weights may express, for instance, the “distance” between two nodes. We won’t go beyond these basic kinds of graphs; the interested reader can refer to numerous references in computer science and mathematics.

In Ruby, as in most languages, a graph can be represented in multiple ways—for example, as a true network of interconnected objects or as a matrix storing the set of edges in the graph. We will look at both of these as we review a few practical examples of manipulating graphs.

9.4.1 Implementing a Graph as an Adjacency Matrix

The example here builds on two previous examples. In Listing 9.3, we implement an undirected graph as an adjacency matrix, using the ZeroArray class (see Section 8.1.26, “Establishing a Default Value for New Array Elements”) to make sure that new elements are zero. Also, we inherit from the TriMatrix (see Section 8.1.7, “Using Specialized Indexing Functions”) to get a lower triangular matrix form.

Listing 9.3 Adjacency Matrix


class LowerMatrix < TriMatrix

def initialize
@store = ZeroArray.new
end

end


class Graph

def initialize(*edges)
@store = LowerMatrix.new
@max = 0
edges.each do |e|
e[0], e[1] = e[1], e[0] if e[1] > e[0]
@store[e[0], e[1]] = 1
@max = [@max, e[0], e[1]].max
end
end

def [](x, y)
if x > y
@store[x, y]
elsif x < y
@store[y, x]
else
0
end
end

def []=(x, y, v)
if x > y
@store[x, y]=v
elsif x < y
@store[y, x]=v
else
0
end
end

def edge?(x, y)
x, y = y, x if x < y
@store[x, y]==1
end

def add(x, y)
@store[x,y] = 1
end


def remove(x, y)
x, y = y, x if x < y
@store[x, y] = 0
if (degree @max) == 0
@max -= 1
end
end

def vmax
@max
end

def degree(x)
(0..@max).inject(0){|sum, i| sum + self[x,i] }
end

def each_vertex
(0..@max).each {|v| yield v }
end

def each_edge
(0..@max).each do |v0|
(0..v0-1).each do |v1|
yield v0, v1 if edge?(v0, v1)
end
end
end

end


mygraph = Graph.new([1,0],[0,3],[2,1],[3,1],[3,2])

# Print the degrees of all the vertices: 2 3 2 3
mygraph.each_vertex {|v| puts mygraph.degree(v)}

# Print the list of edges
mygraph.each_edge do |a,b|
puts "(#{a},#{b})"
end

# Remove a single edge
mygraph.remove 1,3

# Print the degrees of all the vertices: 2 2 2 2
mygraph.each_vertex {|v| puts mygraph.degree(v) }


Note that in the kind of graph we are implementing here, a node cannot be connected to itself, and two nodes can be connected by only one edge.

We provide a way to specify edges initially by passing pairs into the constructor. We also provide a way to add and remove edges and detect the presence of edges. The vmax method returns the highest-numbered vertex in the graph. The degree method finds the degree of the specified vertex—that is, the number of edges that connect to it.

Finally, we provide two iterators: each_vertex and each_edge. These iterate over edges and vertices, respectively.

9.4.2 Determining Whether a Graph Is Fully Connected

Not all graphs are fully connected. That is, sometimes “you can’t get there from here”; there may be vertices that are unreachable from other vertices no matter what path you try. Connectivity is an important property of a graph to be able to assess, telling whether the graph is “of one piece.” If it is, every node is ultimately reachable from every other node.

We won’t explain the algorithm in detail here because it is covered in any introductory discrete math text; however, we do offer the Ruby method in Listing 9.4.

Listing 9.4 Determining Whether a Graph Is Fully Connected


class Graph

def connected?
x = vmax
k = [x]
l = [x]
for i in 0..@max
l << i if self[x,i]==1
end
while !k.empty?
y = k.shift
# Now find all edges (y,z)
self.each_edge do |a,b|
if a==y || b==y
z = a==y ? b : a
if !l.include? z
l << z
k << z
end
end
end
end
if l.size < @max
false
else
true
end
end

end


mygraph = Graph.new([0,1], [1,2], [2,3], [3,0], [1,3])

puts mygraph.connected? # true
puts mygraph.euler_path? # true

mygraph.remove 1,2
mygraph.remove 0,3
mygraph.remove 1,3

puts mygraph.connected? # false
puts mygraph.euler_path? # false


I’ve referenced a method here (euler_path?) that you haven’t seen yet. It is defined in Section 9.4.4, “Determining Whether a Graph Has an Euler Path.”

A refinement of this algorithm could be used to determine the set of all connected components (or cliques) in a graph that is not overall fully connected. I won’t cover this here.

9.4.3 Determining Whether a Graph Has an Euler Circuit

There is no branch of mathematics, however abstract, which may not some day be applied to phenomena of the real world.

—Nikolai Lobachevsky

Sometimes we want to know whether a graph has an Euler circuit. This term (pronounced oiler circuit) comes from the mathematician Leonhard Euler, who essentially founded the field of topology by dealing with a particular instance of this question. (A graph of this nature is sometimes called a unicursive graph because it can be drawn without lifting the pen from the paper or retracing.)

In the German town of Königsberg, there was an island in the middle of the river (near where the river split into two parts). Seven bridges crisscrossed at various places between opposite shores and the island. The townspeople wondered whether it was possible to make a walking tour of the city in such a way that you would cross each bridge exactly once and return to your starting place. In 1735, Euler proved that it was impossible. This, then, is not just a classic problem, but the original graph theory problem.

And, as with many things in life, when you discover the answer, it is easy. It turns out that for a graph to have an Euler circuit, it must possess only vertices with even degree. Here we add a little method to check that property:

class Graph

def euler_circuit?
return false if !connected?
(0..@max).each do |i|
return false if degree(i) % 2 != 0
end
true
end

end

mygraph = Graph.new([1,0],[0,3],[2,1],[3,1],[3,2])
mygraph.euler_circuit? # false

mygraph.remove 1,3
mygraph.euler_circuit? # true

9.4.4 Determining Whether a Graph Has an Euler Path

An Euler path is not quite the same as an Euler circuit. The word circuit implies that you must return to your starting point; with a path, we are really only concerned with visiting each edge exactly once. The following code fragment illustrates the difference:

class Graph

def euler_path?
return false if !connected?
odd=0
each_vertex do |x|
if degree(x) % 2 == 1
odd += 1
end
end
odd <= 2
end

end

mygraph = Graph.new([0,1],[1,2],[1,3],[2,3],[3,0])
mygraph.euler_circuit? # false
mygraph.euler_path? # true

9.4.5 Graph Tools in Ruby

A few graphing libraries are available for Rubyists. The most popular are intended to create graphs on websites using JavaScript, but several can also create images. The most popular pure-Ruby graphing gems are gruff, gnuplot, and rubyvis. Another option is to use the ruby-graphviz gem to create a document, and then use GraphViz to render those documents as images or printable Postscript.

In short, there may be a need for more tools of this sort. If you need more than what is already available, I urge you to write your own—or better, join an existing project. If working with graphs becomes easy enough, it may become one of those techniques we wonder how we did without.

9.5 Conclusion

We’ve taken a look here at the Set class in Ruby as well as a few examples of “home-grown” data structures. Where more advanced data structures are concerned, we’ve seen examples of inheriting from an existing class and examples of limited delegation by encapsulating an instance of another class. We’ve seen ways to store data creatively, ways to use various data structures, and how to create iterators for these classes.

We’ve looked at stacks and queues in general, and how they might be used in problem solving. We’ve also taken a cursory look at trees and graphs.

In the next chapter, we are again looking at the manipulation of data. But where we have so far been concerned with objects stored in memory, we will now be looking at secondary storage—working with files (and I/O in general), databases, and persistent objects.