Atoms, Tuples, and Pattern Matching - Introducing Elixir: Getting Started in Functional Programming (2014)

Chapter 3. Atoms, Tuples, and Pattern Matching

Elixir programs are at heart a set of message requests and tools for processing them. Elixir provides tools that simplify the efficient handling of those messages, letting you create code that is readable (to programmers at least) while still running efficiently when you need speed.

Atoms

Atoms are a key component of Elixir. Technically they’re just another type of data, but it’s hard to overstate their impact on Elixir programming style.

Usually, atoms are bits of text that start with a colon, like :ok or :earth or :Today. They can also contain underscores (_) and at symbols (@), like :this_is_a_short_sentence or :me@home. If you want more freedom to use spaces, you can start with the colon, and then put them in single quotes, like :'Today is a good day'. Generally, the one-word lowercase form is easier to read.

Atoms have a value — it’s the same as their text:

iex(1)> :test

:test

That’s not very exciting in itself. What makes atoms exciting is the way that they can combine with other types and Elixir’s pattern-matching techniques to build simple but powerful logical structures.
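As a small illustration of atoms as values, the sketch below compares a few atoms and turns two of them back into text. The module name AtomSketch and its function names are hypothetical, chosen only for this example. The atom with spaces is written in the double-quoted form, on the assumption that it names the same atom as the single-quoted form shown above.

```elixir
defmodule AtomSketch do
  # Two atoms are equal only when their names are identical.
  def same?(first, second) do
    first == second
  end

  # An atom's text can be recovered as a string.
  def text(atom) do
    Atom.to_string(atom)
  end
end

IO.inspect(AtomSketch.same?(:earth, :earth))
IO.inspect(AtomSketch.same?(:earth, :moon))
IO.inspect(AtomSketch.text(:me@home))
IO.inspect(AtomSketch.text(:"Today is a good day"))
```

The comparison works on the atom itself, not on a copy of its text, which is part of why atoms are convenient as labels.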

Pattern Matching with Atoms

Elixir used pattern matching to make the examples in Chapter 2 work, but it was very simple. The name of the function was the one key piece that varied, and as long as you provided a numeric argument, Elixir knew what you meant. Elixir’s pattern matching offers much more sophisticated possibilities, however, allowing you to match on arguments as well as on function names.

For example, suppose you want to calculate the velocity of falling objects not just on Earth, where the acceleration due to gravity is 9.8 meters per second squared, but on Earth’s moon, where it is 1.6 meters per second squared, and on Mars, where it is 3.71 meters per second squared. Example 3-1, which you can find in ch03/ex1-atoms, shows one way to build code that supports this.

Example 3-1. Pattern matching on atoms as well as function names

defmodule Drop do

  def fall_velocity(:earth, distance) do
    :math.sqrt(2 * 9.8 * distance)
  end

  def fall_velocity(:moon, distance) do
    :math.sqrt(2 * 1.6 * distance)
  end

  def fall_velocity(:mars, distance) do
    :math.sqrt(2 * 3.71 * distance)
  end

end

It looks like the fall_velocity function gets defined three times here, and it certainly provides three processing paths for the same function. However, because Elixir will choose which function to call by pattern matching, they aren’t duplicate definitions. As in English, these pieces are called clauses. All of the clauses for a given function name must be grouped together in the module.

Once you have this, you can calculate velocities for objects falling a given distance on Earth, the Earth’s moon, and Mars:

iex(2)> c("drop.ex")

[Drop]

iex(3)> Drop.fall_velocity(:earth,20)

19.79898987322333

iex(4)> Drop.fall_velocity(:moon,20)

8.0

iex(5)> Drop.fall_velocity(:mars,20)

12.181953866272849

You’ll quickly find that atoms are a critical component for writing readable Elixir code.

NOTE

If you want to do a pattern match against a value stored in a variable, you’ll need to put a ^ in front of the variable name.
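A minimal sketch of the ^ (pin) operator follows. The module PinSketch and the function same_planemo? are hypothetical names used only for illustration; match?/2 is a built-in Kernel macro that reports whether a pattern matches a value.

```elixir
defmodule PinSketch do
  # Returns true only when actual equals the value already held by expected.
  def same_planemo?(expected, actual) do
    # ^expected pins the variable: its current value becomes the pattern,
    # rather than the name being bound to a new value.
    match?(^expected, actual)
  end
end

IO.inspect(PinSketch.same_planemo?(:earth, :earth))
IO.inspect(PinSketch.same_planemo?(:earth, :mars))
```

Without the ^, a bare variable name in a pattern matches anything and is simply bound to whatever value arrives.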

Atomic Booleans

Elixir uses the values true and false to represent the boolean logic values of the same names. Although underneath these are atoms, :true and :false, they are common enough that you don’t need to use the colons. Elixir will return these values if you ask it to compare something:

iex(1)> 3<2

false

iex(2)> 3>2

true

iex(3)> 10 == 10

true

iex(4)> :true == true

true

iex(5)> :false == false

true

Elixir also has special operators that work on these atoms (and on comparisons that resolve to these atoms):

iex(1)> true and true

true

iex(2)> true and false

false

iex(3)> true or false

true

iex(4)> false or false

false

iex(5)> not true

false

The and and or operators both take two arguments. For and, the result is true if and only if the two arguments are true. For or, the result is true if at least one of the arguments is true. If you’re comparing expressions more complicated than true and false, it’s wise to put them in parentheses.

Elixir will automatically take shortcuts on its logic. If it finds, for example, that the first argument in an and is false, it won’t evaluate the second argument and will return false. If the first argument in an or is true, it won’t evaluate the second argument and will return true.
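One way to see the shortcut behavior is to make the second argument something that would fail if it were ever evaluated. In this sketch, the module ShortCircuitSketch and its explode function are hypothetical; explode raises an error whenever it runs.

```elixir
defmodule ShortCircuitSketch do
  # Hypothetical helper: raises if it is ever evaluated.
  def explode do
    raise "the second argument was evaluated"
  end

  # When first is false, `and` returns false without calling explode.
  def and_demo(first) do
    first and explode()
  end

  # When first is true, `or` returns true without calling explode.
  def or_demo(first) do
    first or explode()
  end
end

IO.inspect(ShortCircuitSketch.and_demo(false))
IO.inspect(ShortCircuitSketch.or_demo(true))
```

Calling and_demo(true) or or_demo(false) would reach explode and raise, because in those cases the second argument decides the result.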

The not operator is simpler, taking just one argument. It turns true into false and false into true. Unlike the other boolean operators, which go between their arguments, not goes before its single argument.

If you try to use these operators with any other atoms, you’ll get an argument error:

iex(6)> not :bonkers

** (ArgumentError) argument error

:erlang.not(:bonkers)

NOTE

Like true and false, Elixir lets you write the atom :nil as nil. There are other atoms that often have an accepted meaning, like :ok and :error, but those are more conventions than a formal part of the language and don’t get special treatment. Their colons are required.
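The following lines are a quick check of the claim that true, false, and nil are ordinary atoms underneath. They use only the built-in is_atom/1 and is_boolean/1 functions.

```elixir
# true, false, and nil are atoms written without the colon.
IO.inspect(is_atom(true))
IO.inspect(is_atom(false))
IO.inspect(is_atom(nil))

# Only true and false count as booleans; :ok is just an atom.
IO.inspect(is_boolean(false))
IO.inspect(is_boolean(:ok))
```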

Guards

The fall_velocity calculations work fairly well, but there’s still one glitch: if the function gets a negative value for distance, the square-root (sqrt) function in the calculation will be unhappy:

iex(5)> Drop.fall_velocity(:earth,-20)

** (ArithmeticError) bad argument in arithmetic expression

(stdlib) :math.sqrt(-392.0)

drop.ex:4: Drop.fall_velocity/2

Since you can’t dig a hole 20 meters down, release an object, and marvel as it accelerates to the surface, this isn’t a terrible result. However, it might be more elegant to at least produce a different kind of error.

In Elixir, you can specify which data a given function will accept with guards. Guards, indicated by the when keyword, let you fine-tune the pattern matching based on the content of arguments, not just their shape. Guards have to stay simple, can use only a very few built-in functions, and are limited by a requirement that they evaluate only data without any side effects, but they can still transform your code.

NOTE

You can find a list of functions that can safely be used in guards in Appendix A.

Guards evaluate their expressions to true or false, as previously described, and the first one with a true result wins. That means that you can write when true for a guard that always gets called if it is reached, or block out some code you don’t want to call (for now) with when false.

In this simple case, you can keep negative numbers away from the square-root function by adding guards to the fall_velocity clauses, as shown in Example 3-2, which you can find at ch03/ex2-guards.

Example 3-2. Adding guards to the function clauses

defmodule Drop do

  def fall_velocity(:earth, distance) when distance >= 0 do
    :math.sqrt(2 * 9.8 * distance)
  end

  def fall_velocity(:moon, distance) when distance >= 0 do
    :math.sqrt(2 * 1.6 * distance)
  end

  def fall_velocity(:mars, distance) when distance >= 0 do
    :math.sqrt(2 * 3.71 * distance)
  end

end

The when expression describes a condition or set of conditions in the function head. In this case, the condition is simple: the distance must be greater than or equal to zero. In Elixir, greater than or equal to is written >=, and less than or equal to is written <=, just as they’re described in English. If you compile that code and ask for the result of a positive distance, the result is the same. Ask for a negative distance, and the result is different:

iex(1)> c("drop.ex")

[Drop]

iex(2)> Drop.fall_velocity(:earth,20)

19.79898987322333

iex(3)> Drop.fall_velocity(:earth,-20)

** (FunctionClauseError) no function clause matching in Drop.fall_velocity/2

drop.ex:3: Drop.fall_velocity(:earth, -20)

Because of the guard, Elixir doesn’t find a function clause that works with a negative argument. The error message may not seem like a major improvement, but as you add layers of code, “not handled” may be a more appealing response than “broke my formula.”

A clearer, though still simple, use of guards might be code that returns an absolute value. Yes, Elixir has a built-in function, abs, for this, but Example 3-3 makes clear how this works.

Example 3-3. Calculating absolute value with guards

defmodule MathDemo do

  def absolute_value(number) when number < 0 do
    -number
  end

  def absolute_value(number) when number == 0 do
    0
  end

  def absolute_value(number) when number > 0 do
    number
  end

end

When MathDemo.absolute_value is called with a negative (less than zero) argument, Elixir calls the first clause, which returns the negation of that negative argument, making it positive. When the argument equals (==) zero, Elixir calls the second clause, returning 0. Finally, when the argument is positive, Elixir calls the third clause, just returning the number. (The first two clauses have processed everything that isn’t positive, so the guard on the last clause is unnecessary and will go away in Example 3-4.)

iex(1)> c("mathDemo.ex")

[MathDemo]

iex(2)> MathDemo.absolute_value(-20)

20

iex(3)> MathDemo.absolute_value(0)

0

iex(4)> MathDemo.absolute_value(20)

20

This may seem like an unwieldy way to calculate. Don’t worry — Elixir has simpler logic switches you can use inside of functions. However, guards are critically important to choosing among function clauses, which will be especially useful as you start to work with recursion in Chapter 4.

Elixir runs through the function clauses in the order you list them and stops at the first one that matches. If you find your information is heading to the wrong clause, you may want to reorder your clauses or fine-tune your guard conditions.
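The effect of clause order can be sketched with a function whose last clause accepts anything. The module ClauseOrderSketch and the function describe are hypothetical names. The specific clauses come first, so the catch-all only sees values the earlier clauses rejected.

```elixir
defmodule ClauseOrderSketch do
  # Tried first: only negative numbers pass this guard.
  def describe(number) when number < 0 do
    :negative
  end

  # Tried second: matches exactly 0.
  def describe(0) do
    :zero
  end

  # Tried last: accepts whatever the earlier clauses did not.
  def describe(_number) do
    :positive
  end
end

IO.inspect(ClauseOrderSketch.describe(-5))
IO.inspect(ClauseOrderSketch.describe(0))
IO.inspect(ClauseOrderSketch.describe(5))
```

If the catch-all clause were moved to the top, it would match every call and the other two clauses would never run.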

Also, when your guard clause is testing for just one value, you can easily switch to using pattern matching instead of a guard. This absolute_value function in Example 3-4 does the same thing as the one in Example 3-3.

Example 3-4. Calculating absolute value with guards and pattern matching

defmodule MathDemo do

  def absolute_value(number) when number < 0 do
    -number
  end

  def absolute_value(0) do
    0
  end

  def absolute_value(number) do
    number
  end

end

In this case, it’s up to you whether you prefer the simpler form or preserving a parallel approach.

NOTE

You can also have multiple comparisons in a single guard. If you join them with the or operator, the guard succeeds if any of the comparisons succeeds. If you join them with the and operator, they all have to succeed for the guard to succeed.
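A short sketch of guards that combine comparisons follows. The module CompoundGuardSketch, its function names, and the upper limit of 1000 are all assumptions made for illustration and do not come from the examples above.

```elixir
defmodule CompoundGuardSketch do
  # and: both comparisons must succeed for this clause to be chosen.
  def reasonable_distance?(distance) when distance >= 0 and distance <= 1000 do
    true
  end

  def reasonable_distance?(_distance) do
    false
  end

  # or: any one of the comparisons is enough.
  def supported?(planemo) when planemo == :earth or planemo == :moon or planemo == :mars do
    true
  end

  def supported?(_planemo) do
    false
  end
end

IO.inspect(CompoundGuardSketch.reasonable_distance?(20))
IO.inspect(CompoundGuardSketch.reasonable_distance?(-20))
IO.inspect(CompoundGuardSketch.supported?(:moon))
IO.inspect(CompoundGuardSketch.supported?(:venus))
```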

Underscoring That You Don’t Care

Guards let you specify more precise handling of incoming arguments. Sometimes you may actually want handling that is less precise, though. Not every argument is essential to every operation, especially when you start passing around complex data structures. You could create variables for arguments and then never use them, but you’ll get warnings from the compiler (which suspects you must have made a mistake), and you may confuse other people using your code who are surprised to find your code cares about only half of the arguments they sent.

You might, for example, decide that you’re not concerned with what planemo (for planetary mass object, including planets, dwarf planets, and moons) a user of your velocity function specifies, and you’re just going to use Earth’s value for gravity. Then, you might write something like Example 3-5, from ch03/ex3-underscore.

Example 3-5. Declaring a variable and then ignoring it

defmodule Drop do

  def fall_velocity(planemo, distance) when distance >= 0 do
    :math.sqrt(2 * 9.8 * distance)
  end

end

This will compile, but you’ll get a warning, and if you try to use it for, say, Mars, you’ll get the wrong answer:

iex(1)> c("drop.ex")

drop.ex:3: variable planemo is unused

[Drop]

iex(2)> Drop.fall_velocity(:mars,20)

19.79898987322333

On Mars, that should be more like 12 than 19, so the compiler was right to scold you.

Other times, though, you really only care about some of the arguments. In these cases, you can use a simple underscore (_). The underscore accomplishes two things: it tells the compiler not to bother you, and it tells anyone reading your code that you’re not going to be using that argument. In fact, Elixir won’t let you. You can try to assign values to the underscore, but Elixir won’t give them back to you. It considers the underscore permanently unbound:

iex(3)> _ = 20

20

iex(4)> _

** (CompileError) iex:4 unbound variable _

:erl_eval.exprs/2

If you really wanted your code to be Earth-centric and ignore any suggestions of other planemos, you could instead write something like Example 3-6.

Example 3-6. Deliberately ignoring an argument with an underscore

defmodule Drop2 do

  def fall_velocity(_, distance) when distance >= 0 do
    :math.sqrt(2 * 9.8 * distance)
  end

end

This time there will be no compiler warning, and anyone who looks at the code will know that the first argument is useless:

iex(4)> c("drop2.ex")

[Drop2]

iex(5)> Drop2.fall_velocity(:you_dont_care,20)

19.79898987322333

You can use underscore multiple times to ignore multiple arguments. It matches anything for the pattern match and never binds, so there’s never a conflict.
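Ignoring more than one argument might look like the sketch below. The module IgnoreSketch and the function middle are hypothetical; the point is only that each underscore matches independently and binds nothing.

```elixir
defmodule IgnoreSketch do
  # Only the second argument matters; the first and third are ignored.
  def middle(_, value, _) do
    value
  end
end

# The two ignored arguments can hold completely different values.
IO.inspect(IgnoreSketch.middle(:earth, 20, "anything"))
```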

NOTE

You can also start variables with underscores (like _planemo), and the compiler won’t warn if you never use those variables. Those variables do get bound, and you can reference them later in your code if you change your mind. However, if you use the same variable name more than once in a set of arguments, even if the variable name starts with an underscore, Elixir treats every occurrence as the same variable, so the clause matches only when those arguments hold the same value.

Adding Structure: Tuples

Elixir’s tuples let you combine multiple items into a single composite datatype. This makes it easier to pass messages between components, letting you create your own complex datatypes as needed. Tuples can contain any kind of Elixir data, including numbers, atoms, other tuples, and the lists and strings you’ll encounter in later chapters.

Tuples themselves are simple, a group of items surrounded by curly braces:

iex(1)> {:earth, 20}

{:earth,20}

Tuples might contain one item, or they might contain 100. Two to five seem typical (and useful, and readable). Often (but not always) an atom at the beginning of the tuple indicates what it’s really for, providing an informal identifier of the complex information structure stored in the tuple.

Elixir includes built-in functions that give you access to the contents of a tuple on an item-by-item basis. You can retrieve the values of items with the elem function, set values in a new tuple with the put_elem function, and find out how many items are in a tuple with the tuple_size function. Elixir (unlike Erlang) counts from zero, so the first item in a tuple is referenced as 0, the second as 1, and so on:

iex(2)> tuple={:earth,20}

{:earth,20}

iex(3)> elem(tuple,1)

20

iex(4)> newTuple=put_elem(tuple,1,40)

{:earth,40}

iex(5)> tuple_size(newTuple)

2

If you can stick with pattern matching tuples, however, you’ll likely create more readable code.
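To compare the two styles side by side, this sketch reads the distance out of a two-item tuple once with elem and once with a pattern in the function head. The module TupleAccessSketch and its function names are hypothetical.

```elixir
defmodule TupleAccessSketch do
  # Positional access: the reader has to know that index 1 is the distance.
  def distance_with_elem(tuple) do
    elem(tuple, 1)
  end

  # Pattern matching: the function head names each part of the tuple.
  def distance_with_match({_planemo, distance}) do
    distance
  end
end

IO.inspect(TupleAccessSketch.distance_with_elem({:earth, 20}))
IO.inspect(TupleAccessSketch.distance_with_match({:earth, 20}))
```

Both calls return the same value, but the second version documents the expected shape of the tuple and rejects tuples of any other size.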

Pattern Matching with Tuples

Tuples make it easy to package multiple arguments into a single container and let the receiving function decide what to do with them. Pattern matching on tuples looks much like pattern matching on atoms, except that there is, of course, a pair of curly braces around each set of arguments, as Example 3-7, which you’ll find in ch03/ex4-tuples, demonstrates.

Example 3-7. Encapsulating arguments in a tuple

defmodule Drop do

  def fall_velocity({:earth, distance}) when distance >= 0 do
    :math.sqrt(2 * 9.8 * distance)
  end

  def fall_velocity({:moon, distance}) when distance >= 0 do
    :math.sqrt(2 * 1.6 * distance)
  end

  def fall_velocity({:mars, distance}) when distance >= 0 do
    :math.sqrt(2 * 3.71 * distance)
  end

end

The arity changes: this version is fall_velocity/1 instead of fall_velocity/2 because the tuple counts as only one argument. The tuple version works much like the atom version but requires the extra curly braces when you call the function as well:

iex(1)> c("drop.ex")

[Drop]

iex(2)> Drop.fall_velocity({:earth,20})

19.79898987322333

iex(3)> Drop.fall_velocity({:moon,20})

8.0

iex(4)> Drop.fall_velocity({:mars,20})

12.181953866272849

Why would you use this form when it requires a bit of extra typing? Using tuples opens more possibilities. Other code could package different things into tuples — more arguments, different atoms, even functions created with fn(). Passing a single tuple rather than a pile of arguments gives Elixir much of its flexibility, especially when you get to passing messages between different processes.
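As one hedged illustration of that flexibility, a single-argument function can accept tuples of different shapes. The module TupleShapesSketch and the three-item {:custom, gravity, distance} tuple are hypothetical, invented here to show the idea rather than taken from the book’s examples.

```elixir
defmodule TupleShapesSketch do
  # Two-item tuple: a known planemo and a distance.
  def fall_velocity({:earth, distance}) when distance >= 0 do
    :math.sqrt(2 * 9.8 * distance)
  end

  # Three-item tuple (hypothetical shape): the caller supplies gravity directly.
  def fall_velocity({:custom, gravity, distance}) when distance >= 0 do
    :math.sqrt(2 * gravity * distance)
  end
end

IO.inspect(TupleShapesSketch.fall_velocity({:earth, 20}))
IO.inspect(TupleShapesSketch.fall_velocity({:custom, 2.0, 4}))
```

Both clauses still belong to fall_velocity/1, because each tuple counts as one argument no matter how many items it holds.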

Processing Tuples

There are many ways to process tuples, not just the simple pattern matching shown in Example 3-7. If you receive the tuple as a single variable, you can do many different things with it. A simple place to start is using the tuple as a pass-through to a private version of the function. That part of Example 3-8 may look familiar, as it’s the same as the fall_velocity/2 function in Example 3-2. (You can find this at ch03/ex5-tuplesMore.)

Example 3-8. Encapsulating arguments in a tuple and passing them to a private function

defmodule Drop do

  def fall_velocity({planemo, distance}) when distance >= 0 do
    fall_velocity(planemo, distance)
  end

  defp fall_velocity(:earth, distance) do
    :math.sqrt(2 * 9.8 * distance)
  end

  defp fall_velocity(:moon, distance) do
    :math.sqrt(2 * 1.6 * distance)
  end

  defp fall_velocity(:mars, distance) do
    :math.sqrt(2 * 3.71 * distance)
  end

end

The use of defp for the private versions means that only fall_velocity/1, the tuple version, is public. The fall_velocity/2 function is available within the module, however. It’s not especially necessary here, but this “make one version public, keep another version with different arity private” is common in situations where you want to make a function accessible but don’t necessarily want its inner workings directly available.

If you call this function — the tuple version, so curly braces are necessary — fall_velocity/1 calls the private fall_velocity/2, which returns the proper value to fall_velocity/1, which will return it to you. The results should look familiar:

iex(1)> c("drop.ex")

[Drop]

iex(2)> Drop.fall_velocity({:earth,20})

19.79898987322333

iex(3)> Drop.fall_velocity({:moon,20})

8.0

iex(4)> Drop.fall_velocity({:mars,20})

12.181953866272849
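The public and private split described above can be checked from outside the module. This sketch uses a cut-down, hypothetical module named PrivateSketch with only the Earth clause, and the built-in function_exported?/3, which reports whether a loaded module exposes a function with a given name and arity.

```elixir
defmodule PrivateSketch do
  # Public entry point: arity 1, takes the tuple.
  def fall_velocity({planemo, distance}) when distance >= 0 do
    fall_velocity(planemo, distance)
  end

  # Private helper: arity 2, callable only from inside this module.
  defp fall_velocity(:earth, distance) do
    :math.sqrt(2 * 9.8 * distance)
  end
end

IO.inspect(PrivateSketch.fall_velocity({:earth, 20}))
IO.inspect(function_exported?(PrivateSketch, :fall_velocity, 1))
IO.inspect(function_exported?(PrivateSketch, :fall_velocity, 2))
```

The first check should report that the tuple version is exported and the second that the two-argument version is not.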

There are a few different ways to extract the data from the tuple. You could reference the components of the tuple by number using the built-in Kernel function elem/2, which takes a tuple and a numeric position as its arguments. The first component of a tuple can be reached at position 0, the second at position 1, and so on:

def fall_velocity(where) do
  fall_velocity(elem(where,0), elem(where,1))
end

You could also break things up a bit and do pattern matching after getting the variable:

def fall_velocity(where) do
  {planemo, distance} = where
  fall_velocity(planemo,distance)
end

The result of that last line will be the value the function returns.

The pattern matching is a little different. The function accepted a tuple as its argument and assigned it to the variable where. (If where is not a tuple, the function will fail with an error.) Extracting the contents of that tuple, since we know its structure, can be done with a pattern match inside the function. The planemo and distance variables will be bound to the values contained in the where tuple and can then be used in the call to fall_velocity/2.
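Put together as a complete module, the match-inside-the-function approach might look like the following sketch. The module name DestructureSketch is hypothetical, and the private clauses simply repeat the calculations used earlier in the chapter.

```elixir
defmodule DestructureSketch do
  # Accepts any value as where, then pattern matches inside the body.
  def fall_velocity(where) do
    # Fails with a MatchError unless where is a two-item tuple.
    {planemo, distance} = where
    fall_velocity(planemo, distance)
  end

  defp fall_velocity(:earth, distance) do
    :math.sqrt(2 * 9.8 * distance)
  end

  defp fall_velocity(:moon, distance) do
    :math.sqrt(2 * 1.6 * distance)
  end

  defp fall_velocity(:mars, distance) do
    :math.sqrt(2 * 3.71 * distance)
  end
end

IO.inspect(DestructureSketch.fall_velocity({:moon, 20}))
```

One difference from matching in the function head is where a bad argument fails: here the function is entered first and the match inside it raises the error, rather than no clause being selected at all.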