Lists - Introducing Elixir: Getting Started in Functional Programming (2014)

Introducing Elixir: Getting Started in Functional Programming (2014)

Chapter 6. Lists

Elixir is great at handling lists, long series of similar (or not) values. List processing makes it easy to see the value of recursion and offers opportunities to get a lot of work done for very little effort.

List Basics

An Elixir list is an ordered set of elements. Generally you will process a list in order, from the first item (the head) to the last item, though there are times when you may want to grab a particular item from the list. Elixir also provides built-in functions for manipulating lists when you don’t want to go through the entire sequence.

Elixir syntax encloses lists in square brackets and separates elements with commas. A list of numbers might look like the following:

[1,2,4,8,16,32]

The elements can be of any type, including numbers, atoms, tuples, strings, and other lists. When you’re starting out, it’s definitely easiest to work with lists that contain only a single type of element, rather than mixing all the possibilities, but Elixir itself has no such constraint. There is also no limit on the number of items a list can contain, though eventually you may find practical limits of memory.

You can pattern match with lists just as you can with other Elixir data structures:

iex(1)> [1,x,4,y] = [1,2,4,8]

[1,2,4,8]

iex(2)> x

2

iex(3)> y

8

NOTE

Your code will usually make more sense if you use tuples to handle data structures containing various kinds of data in a known sequence, and lists to handle structures containing less varied data in unknown quantities. Tuples are expected to come in a certain order and can also contain lists, so if you have a data structure that’s mostly known except for an expanding part or two, including a list inside of a tuple can be a workable solution.

Lists can contain lists, and sometimes this can produce surprising results. If, for example, you want to add a list to a list, you may end up with more levels of list than you planned:

iex(4)> insert = [2,4,8]

[2,4,8]

iex(5)> full = [1,insert,16,32]

[1,[2,4,8],16,32]

You can fix that (if you want to) with the List.flatten/1 function:

iex(6)> neat = List.flatten(full)

[1,2,4,8,16,32]

This also means that if you want to append lists, you need to decide whether you’re creating a list of lists or a single list containing the contents of the component lists. To create a list of lists, you just put lists into lists:

iex(7)> a = [1,2,4]

[1,2,4]

iex(8)> b = [8,16,32]

[8,16,32]

iex(9)> list_of_lists = [a,b]

[[1,2,4],[8,16,32]]

To create a single list from multiple lists, you can use the Enum.concat/2 function or the equivalent ++ operator:

iex(10)> combined = Enum.concat(a, b)

[1,2,4,8,16,32]

iex(11)> combined2 = a ++ b

[1,2,4,8,16,32]

Both produce the same result: a combined and flattened list.

NOTE

The ++ operator is right associative, which can change the order of the resulting list when you append multiple lists.

If you have a set of lists you’d like combined, you can use the Enum.concat/1 function, which takes a list of lists as its argument and returns a single list containing their contents:

iex(12)> c = [64,128,256]

[64,128,256]

iex(13)> combined3 = Enum.concat([a,b,c])

[1,2,4,8,16,32,64,128,256]

Splitting Lists into Heads and Tails

Lists are a convenient way to hold piles of similar data, but their great strength in Elixir is the way they make it easy to do recursion. Lists are a natural fit for the counting-down style of logic explored in Chapter 4: you can run through a list until you run out of items. In many languages, running through a list means finding out how many items it contains and going through them sequentially. Elixir takes a different approach, letting you process the first item in a list, the head, while extracting the rest of the list, the tail, so that you can pass it to another call recursively.

To extract the head and the tail, you use pattern matching, with a special form of the list syntax on the left:

[head | tail] = [1,2,4]

The two variables separated by a vertical bar (|), or cons, for list constructor, will be bound to the head and tail of the list on the right. In the console, Elixir will just report the contents of the right side of the expression, not the fragments created by the pattern match, but if you work through a list you can see the results:

iex(1)> list = [1, 2, 4]

[1,2,4]

iex(2)> [h1 | t1] = list

[1,2,4]

iex(3)> h1

1

iex(4)> t1

[2,4]

iex(5)> [h2 | t2] = t1

[2,4]

iex(6)> h2

2

iex(7)> t2

[4]

iex(8)> [h3 | t3] = t2

[4]

iex(9)> h3

4

iex(10)> t3

[]

iex(11)> [h4 | t4] = t3

** (MatchError) no match of right hand side value: []

:erl_eval.expr/3

Line 2 copies the initial list into two smaller pieces. h1 will contain the first item of the list, whereas t1 will contain a list that has everything except the first element. Line 5 repeats the process on the smaller list, breaking t1 into a h2 and a t2. This time t2 is still a list, as shown on line 7, but contains only one item. Line 8 breaks that single-item list again, putting the value into h3 and an empty list into t3.

What happens when you try to split an empty list, as shown on line 11? Elixir reports an error, "no match…". This fortunately does not mean that recursion on lists is doomed to produce errors. That lack of a match will naturally stop the recursive process, which is probably what you want.

NOTE

Head and tail work only moving forward through a list. If order matters and you really need to go through a list backwards, you’ll need to use the Enum.reverse function and then walk through the reversed list.

Processing List Content

The head-and-tail notation was built for recursive processing. Actually making that work typically follows a pattern in which a list arrives as an argument and is then passed to another (usually private) function with an accumulator argument. A simple case might perform a calculation on the contents of the list. Example 6-1, in ch06/ex1-product, shows this pattern in use, multiplying the values of a list together.

Example 6-1. Calculating the product of values in a list

defmodule Overall do

def product([]) do

0

end

def product(list) do

product(list, 1)

end

def product([], accumulated_product) do

accumulated_product

end

def product([head | tail], accumulated_product) do

product(tail, head * accumulated_product)

end

end

In this module, the product/1 function is the gateway, passing the list (if the list has content) plus an accumulator to product/2, which does the real work. If you wanted to test the arriving list to make sure it meets your expectations, it would probably make the most sense to do that work in product/1, and let product/2 focus on recursive processing.

NOTE

Is the product of an empty list really zero? It might make more sense for an empty list to fail and produce a crash. Elixir’s “let it crash” philosophy is, as you’ll see later, pretty calm about such things. In the long run, you’ll have to decide which cases are better left to crash and which shouldn’t.

The product/2 function has two clauses. The first matches the empty list and will get called at the end of the recursive process when there are no more entries to process, or if the list arrives empty. It returns its second argument, the accumulator.

The second clause does more work if the arriving list is not empty. First, the pattern match ([head|tail]) splits off the first value the list from the rest of the list. Next, it calls product/2 again, with the remaining (if any) portion of the list and a new accumulator that is multiplied by the value of the first entry in the list. The result will be the product of the values included in the list:

iex(1)> c("overall.ex")

[Overall]

iex(2)> Overall.product([1,2,3,5])

30

That went smoothly, but what happened? After product/1 called product/2, it made five iterations over the list, concluding with an empty list, as shown in Table 6-1.

Table 6-1. Recursive processing of a simple list in product/2

Arriving list

Arriving product

Head

Tail

[1,2,3,5]

1

1

[2,3,5]

[2,3,5]

1 (1*1)

2

[3,5]

[3,5]

2 (1*2)

3

[5]

[5]

6 (2*3)

5

[]

[]

30 (6*5)

None

None

The last arriving accumulated_product, 30, will be handled by the clause for the empty list and reported as the return value for product/2. When product/1 receives that value, it will also report 30 as its return value and exit.

NOTE

Because strings in single quotes are lists, you can do strange things like enter Overall.product('funny'). product/1 will interpret the character values as numbers and return 17472569400.

Creating Lists with Heads and Tails

While there are times you want to calculate a single value from a list, much list processing involves modifying lists or converting a list into another list. Because you can’t actually change a list, modifying or converting a list means creating a new list. To do that, you use the same vertical bar head/tail syntax, but on the right side of the pattern match instead of the left. You can try this out in the console, though it’s more useful in a module:

iex(5)> x = [1|[2,3]]

[1,2,3]

Elixir interprets [1|[2,3]] as creating a list. If the value to the right of the vertical bar is a list, it gets appended to the head as a list. In this case, the result is a neat list of numbers. There are a few other forms you should be aware of:

iex(2)> y = [1,2 | [3]]

[1,2,3]

iex(3)> z = [1,2 | 3]

[1,2|3]

In line 2, there isn’t a list wrapped around the now two items in the head, but the constructor still blends the head and the tail together seamlessly. (If you do wrap them in square brackets, the list constructor assumes that you want a list as the first item in the list, so [[1,2] | [3]] will produce [[1,2],3].)

However, line 3 demonstrates what happens if you don’t wrap the tail in square brackets — you get a list, called an improper list, that still contains a constructor, with a strange tail. Until you’ve learned your way quite thoroughly around Elixir, you definitely should avoid this, as it will create runtime errors if you try to process it as a normal list. Eventually you may find rare reasons to do this or encounter code that uses it.

More typically, you’ll use list constructors to build lists inside recursive functions. Example 6-2, which you can find in ch06/ex2-drop, starts from a set of tuples representing planemos and distances. With the help of the drop module from Example 3-8, it creates a list of velocities for the corresponding falls.

Example 6-2. Calculating a series of drop velocities, with an error

defmodule ListDrop do

def falls(list) do

falls(list, [])

end

def falls([], results) do

results

end

def falls([head|tail], results) do

falls(tail, [Drop.fall_velocity(head) | results])

end

end

Much of this is familiar from Example 6-1, except that the results variable gets a list instead of a number, and the last line of falls/2 creates a list instead of a single value. If you run it, however, you’ll see one minor problem:

iex(1)> c("drop.ex")

[Drop]

iex(2)> c("listdrop.ex")

[ListDrop]

iex(3)> ListDrop.falls([{:earth, 20}, {:moon, 20}, {:mars, 20}])

[12.181953866272849,8.0,19.79898987322333]

The resulting velocities are reversed: the Earth has more gravity than Mars, and objects should fall faster on Earth. What happened? That last key line in falls/2 is reading a list from the beginning to the end and creating a list from the end to the beginning. That puts the values in the wrong order. Fortunately, as Example 6-3 (in ch06/ex3-drop) demonstrates, this is easy to fix. You need to call Enum.reverse/1 in the clause of the falls/2 function that handles the empty list.

Example 6-3. Calculating a series of drop velocities, with the error fixed

defmodule ListDrop do

def falls(list) do

falls(list, [])

end

def falls([], results) do

Enum.reverse(results)

end

def falls([head|tail], results) do

falls(tail, [Drop.fall_velocity(head) | results])

end

end

Now it works:

iex(4)> c("listdrop.ex")

listdrop.ex:1: redefining module ListDrop

[ListDrop]

iex(5)> ListDrop.falls([{:earth, 20}, {:moon, 20}, {:mars, 20}])

[19.79898987322333,8.0,12.181953866272849]

NOTE

You could instead have put the Enum.reverse/1 call in the falls/1 gateway function. Either way is fine, though I prefer to have falls/2 return a finished result.

Mixing Lists and Tuples

As you get deeper into Elixir and pass around more complex data structures, you may find that you’re processing lists full of tuples, or that it would be more convenient to rearrange two lists into a single list of tuples or vice versa. The List module includes easy solutions to these kinds of transformations and searches.

The simplest tools are the List.zip/1 and List.unzip/1 functions. They can turn two lists of the same size into a list of tuples or a list of tuples into a list of two lists:

iex(1)> list1 = [:earth, :moon, :mars]

[:earth,:moon,:mars]

iex(2)> list2 = [9.8, 1.6, 3.71]

[9.8,1.6,3.71]

iex(3)> planemos = List.zip([list1, list2])

[earth: 9.8, moon: 1.6, mars: 3.71]

iex(4)> separate_lists = List.unzip(planemos)

[[:earth,:moon,:mars],[9.8,1.6,3.71]]

The two lists, list1 and list2, have different contents but the same number of items. The List.zip/1 function returns a list containing a tuple for each of the items in the original lists. The List.unzip/1 function takes that list of two-component tuples and splits it out into a list containing two lists.

Building a List of Lists

While simple recursion isn’t too complicated, list processing has a way of turning into lists of lists in various stages. Pascal’s triangle, a classic mathematical tool, is relatively simple to create but demonstrates more intricate work with lists. It starts with a 1 at the top, and then each new row is composed of the sum of the two numbers above it:

1

1 1

1 2 1

1 3 3 1

1 4 6 4 1

...

If those numbers seem familiar, it’s probably because they’re the binomial coefficents that appear when you put (x+y) to a power. That’s just the beginning of this mathematical marvel!

Pascal’s triangle is easily calculated with Elixir in a number of ways. You can apply the list techniques already discussed in this chapter by treating each row as a list, and the triangle as a list of lists. The code will be seeded with the first row — the top 1 — represented as [0,1,0]. The extra zeros make the addition much simpler.

NOTE

This is not intended to be an efficient, elegant, or maximally compact implementation. At this point, a naive implementation likely explains more about lists.

For a first step, Example 6-4, found in ch06/ex4-pascal, calculates rows individually. This is a simple recursive process, walking over the old list and adding its contents to create a new list.

Example 6-4. Calculating a row

defmodule Pascal do

def add_row(initial) do

add_row(initial, 0, [])

end

def add_row([], 0, final) do

[0 | final]

end

def add_row([h | t], last, new) do

add_row(t, h, [last + h | new])

end

end

The add_row/1 function sets things up, sending the current row a 0 to get the math started and an empty list you can think of as “where the results go,” though it is really an accumulator. The add_row/3 function has two clauses. The first checks to see if the list being added is empty. If it is, then the function reports back the final row, adding a 0 at the front.

Most of the work gets done in the second clause of add_row/3. When it receives its arguments, the [h | t] pattern match splits the head of the list into the h value (a number) and the tail into t (a list, which may be empty if that was the last number). It also gets values for the lastnumber processed and the current new list being built.

It then makes a recursive call to add_row/3. In that new call, the tail of the old list, t, is the new list to process, the h value becomes the last number processed, and the third argument, the list, opens with the actual addition being performed, which is then combined with the rest of the newlist being built.

NOTE

Because the lists in the triangle are symmetrical, there is no need to use Enum.reverse/1 to flip them. You can, of course, if you want to.

You can test this easily from the console, but remember that your test lists need to be wrapped in zeros:

iex(1)> c("pascal.ex")

[Pascal]

iex(2)> Pascal.add_row([0,1,0])

[0,1,1,0]

iex(3)> Pascal.add_row([0, 1, 1, 0])

[0,1,2,1,0]

iex(4)> Pascal.add_row([0, 1, 2, 1, 0])

[0,1,3,3,1,0]

Now that you can create a new row from an old one, you need to be able to create a set of rows from the top of the triangle, as shown in Example 6-5, which you can find in ch06/ex4-pascal. The add_row/3 function effectively counted down to the end of the list, but triangle/3 will need to count up to a given number of rows. The triangle/1 function sets things up, defining the initial row, setting the counter to 1 (because that initial row is the first row), and passing on the number of rows to be created.

The triangle/3 function has two clauses. The first, the stop clause, halts the recursion when enough rows have been created and reverses the list. (The individual rows may be symmetrical, but the triangle itself is not.) The second clause does the actual work of generating new rows. It gets the previous row generated from the list, and then it passes that to the add_row/1 function, which will return a new row. Then it calls itself with the new list, an incremented count, and the rows value the stop clause needs.

Example 6-5. Calculating the whole triangle with both functions

defmodule Pascal do

def triangle(rows) do

triangle([[0,1,0]], 1, rows)

end

def triangle(list, count, rows) whencount >= rows do

Enum.reverse(list)

end

def triangle(list, count, rows) do

[previous | _] = list

triangle([add_row(previous) | list], count + 1, rows)

end

def add_row(initial) do

add_row(initial, 0, [])

end

def add_row([], 0, final) do

[0 | final]

end

def add_row([h | t], last, new) do

add_row(t, h, [last + h | new])

end

end

Happily, this works:

iex(5)> c("pascal.ex")

pascal.ex:1: redefining module Pascal

[Pascal]

iex(6)> Pascal.triangle(4)

[[0,1,0],[0,1,1,0],[0,1,2,1,0],[0,1,3,3,1,0]]

iex(7)> Pascal.triangle(6)

[[0,1,0],[0,1,1,0],[0,1,2,1,0],[0,1,3,3,1,0],[0,1,4,6,4,1,0],[0,1,5,10,10,5,1,0]]

Pascal’s triangle may be a slightly neater set of lists than most you will process, but this kind of layered list processing is a very common tactic for processing and generating lists of data.