Programming Rust (2016)

Chapter 2. A Tour of Rust

In this chapter we’ll look at several short programs to see how Rust’s syntax, types, and semantics fit together to support safe, concurrent, and efficient code. We’ll walk through the process of downloading and installing Rust; show some simple mathematical code; try out a web server based on a third-party library; and use multiple threads to speed up the process of traversing a directory tree.

Downloading and installing Rust

The best way to install Rust, as of this writing, is to visit the language’s web site, https://www.rust-lang.org, and follow the instructions for downloading and installing the language. The site provides pre-built packages for Linux, Macintosh OSX, and Windows. Install the package in the usual way for your system; on Linux, you may need to unpack a tar file and follow the instructions in the README.md file.

Once you’ve completed the installation, you should have three new commands available at your command line:

$ cargo --version

cargo 0.4.0-nightly (553b363 2015-08-03) (built 2015-08-02)

$ rustc --version

rustc 1.3.0 (9a92aaf19 2015-09-15)

$ rustdoc --version

rustdoc 1.3.0 (9a92aaf19 2015-09-15)

In the command-line interactions in this book, the $ character at the beginning of a line is the command prompt; on Windows, the command line would be C:\> or something similar. The command appears to the right, and the program’s output appears on the following lines.

Here we’ve run the three commands we installed, asking each to report which version it is. Taking each command in turn:

§ cargo is Rust’s compilation manager, package manager, and general-purpose tool. You can use Cargo to start a new project; build and run your program; and manage any external libraries your code depends on.

§ rustc is the Rust compiler. Usually we let Cargo invoke the compiler for us, but sometimes it’s useful to run it directly.

§ rustdoc is the Rust documentation tool. If you write documentation in comments of the appropriate form in your program’s source code, rustdoc can build nicely formatted HTML from them. Like rustc, we usually let Cargo run rustdoc for us.

As a convenience, Cargo can create a new Rust package for us, with some standard metadata arranged appropriately:

$ cargo new --bin hello

This command creates a new package directory named hello, and the --bin flag directs Cargo to prepare this as an executable, not a library. Looking inside the package’s top level directory:

$ cd hello

$ ls -la

total 24

drwxrwxr-x. 4 jimb jimb 4096 Sep 22 21:09 .

drwx------. 62 jimb jimb 4096 Sep 22 21:09 ..

-rw-rw-r--. 1 jimb jimb 88 Sep 22 21:09 Cargo.toml

drwxrwxr-x. 6 jimb jimb 4096 Sep 22 21:09 .git

-rw-rw-r--. 1 jimb jimb 7 Sep 22 21:09 .gitignore

drwxrwxr-x. 2 jimb jimb 4096 Sep 22 21:09 src

We can see that Cargo has created a file Cargo.toml to hold metadata for the package. At the moment this file doesn’t contain much:

[package]

name = "hello"

version = "0.1.0"

authors = ["Jim Blandy <jimb@red-bean.com>"]

If our program ever acquires dependencies on other libraries, we can record them in this file, and Cargo will take care of downloading, building, and updating those libraries for us. We’ll cover the Cargo.toml file in detail in [Link to Come].

Cargo has set up our package for use with the git version control system, creating a .git metadata subdirectory, and a .gitignore file. Cargo also supports the Mercurial version control system (sometimes known as hg). By passing the flag --vcs none to the cargo command, you can request that no version control be prepared.

The src subdirectory contains the actual Rust code:

$ cd src

$ ls -l

total 4

-rw-rw-r--. 1 jimb jimb 45 Sep 22 21:09 main.rs

It seems that Cargo has started the program on our behalf. The main.rs file contains the text:

fn main() {

println!("Hello, world!");

}

In Rust, you don’t even need to write your own “Hello, World!” program. We can invoke the cargo run command from any directory in the package to build and run our program:

$ cargo run

Compiling hello v0.1.0 (file:///home/jimb/rust/hello)

Running `/home/jimb/rust/hello/target/debug/hello`

Hello, world!

Here, Cargo has invoked the Rust compiler, rustc, and then run the executable it produced. Cargo places the executable in the target subdirectory at the top of the package:

$ ls -l ../target/debug

total 580

drwxrwxr-x. 2 jimb jimb 4096 Sep 22 21:37 build

drwxrwxr-x. 2 jimb jimb 4096 Sep 22 21:37 deps

drwxrwxr-x. 2 jimb jimb 4096 Sep 22 21:37 examples

-rwxrwxr-x. 1 jimb jimb 576632 Sep 22 21:37 hello

drwxrwxr-x. 2 jimb jimb 4096 Sep 22 21:37 native

$ ../target/debug/hello

Hello, world!

When we’re through, Cargo can clean up the generated files for us:

$ cargo clean

$ ../target/debug/hello

bash: ../target/debug/hello: No such file or directory

A simple function

Rust’s syntax is deliberately unoriginal. If you are familiar with C, C++, Java, or JavaScript, you can probably find your way through the general structure of a Rust program. Here is a function that computes the greatest common divisor of two integers, using Euclid’s algorithm:

fn gcd(mut n: u64, mut m: u64) -> u64 {

assert!(n != 0 && m != 0);

while m != 0 {

if m < n {

let t = m; m = n; n = t;

}

m = m % n;

}

The fn keyword introduces a function. Here, we’re defining a function named gcd, which takes two parameters n and m, each of which is of type u64, a unsigned 64-bit integer. The function’s return type appears after the ->: our function returns a u64 value. Four-space indentation is standard Rust style.

Rust’s machine integer type names reflect their size and signedness: i32 is a signed 32-bit integer; u8 is an unsigned eight-bit integer (used for ‘byte’ values), and so on. The isize and usize types hold pointer-sized signed and unsigned integers, 32 bits long on 32-bit platforms, and 64 bits long on 64-bit platforms. Rust also has two floating-point types, f32 and f64, which are the IEEE single- and double-precision floating-point types.

Normally, once a variable’s value has been established, it can’t be changed, but placing the mut keyword (short for “mutable”) before the parameters n and m allows our function body to assign to them. In practice, most variables don’t get assigned to; requiring the mut keyword on those that do flags them for careful attention from the reader.

The function’s body starts with a call to the assert! macro, verifying that neither argument is zero. The ! character marks this as a macro invocation, not a function call. Like the assert macro in C and C++, Rust’s assert! checks that its argument is true, and if it is not, crashes the program with a helpful message including the source location of the failing check; this kind of controlled crash is called a “panic”. Unlike C and C++, in which assertions can be skipped, Rust always checks assertions regardless of how the program was compiled.

The heart of our function is a while loop containing an if statement and an assignment. Unlike C and C++, Rust does not require parenthesis around the conditional expressions, but it does require curly braces around the statements they control.

A let statement declares a local variable, like t in our function. We don’t need to write out t’s type, as long as Rust can infer it from how the variable is used. In our function, the only type that works for t is u64, matching m and n. Rust only infers types within function bodies: you must write out the types of function parameters and return values, as we’ve done here. If we had wanted to spell out t’s type, we could have written:

let t: u64 = m; ...

Rust has a return statement, but we didn’t need one to return our value here. In Rust, a block surrounded by curly braces can be an expression; its value is that of the last expression it contains. The body of our function is such a block, and its last expression is n, so that’s our return value. Likewise, if is an expression whose value is that of the branch that was taken. Rust has no need for a separate ?: conditional operator as in C; one just writes the if-else structure right into the expression.

Writing and running unit tests

Rust has simple support for testing built into the language. To test our gcd function, we can write:

#[test]

fn test_gcd() {

assert_eq!(gcd(2 * 5 * 11 * 17,

3 * 7 * 13 * 19),

1);

assert_eq!(gcd(2 * 3 * 5 * 11 * 17,

3 * 7 * 11 * 13 * 19),

3 * 11);

}

Here we define a function named test_gcd which calls gcd and checks that it returns correct values. The #[test] atop the definition marks test_gcd as a test function, to be skipped in normal compilations, but included and called automatically if we run our program with the cargo test command. Let’s assume we’ve edited our gcd and test_gcd definitions into the hello package we created at the beginning of the chapter. If our current directory is somewhere within the package’s subtree, we can run the tests as follows:

$ cargo test

Compiling hello v0.1.0 (file:///home/jimb/rust/hello)

Running /home/jimb/rust/hello/target/debug/hello-2375a82d9e9673d7

running 1 test

test test_gcd ... ok

test result: ok. 1 passed; 0 failed; 0 ignored; 0 measured

We can have test functions scattered throughout our source tree, placed next to the code they exercise, and cargo test will automatically gather them up and run them all.

The #[test] marker is an example of an attribute. Attributes are an open-ended system for marking functions and other declarations with extra information, somewhat like attributes in C# or annotations in Java. They’re used to control compiler warnings and code style checks, include code conditionally (like #if in C and C++), tell Rust how to interact with code written in other languages, and much else. We’ll see more examples of attributes as we go.

Handling command-line arguments

If we want our program to take a series of numbers as command-line arguments and print their greatest common divisor, we can replace the main function with the following:

use std::io::Write;

use std::str::FromStr;

fn main() {

let mut numbers = Vec::new();

for arg in std::env::args().skip(1) {

numbers.push(u64::from_str(&arg)

.expect("error parsing argument"));

}

if numbers.len() == 0 {

writeln!(std::io::stderr(), "Usage: gcd NUMBER ...").unwrap();

std::process::exit(1);

}

let mut d = numbers[0];

for m in &numbers[1..] {

d = gcd(d, *m);

}

println!("The greatest common divisor of {:?} is {}",

numbers, d);

}

This is a large block of code, so let’s take it piece by piece:

use std::io::Write;

use std::str::FromStr;

The use declarations bring the two traits Write and FromStr into scope. We’ll cover traits in detail in [Link to Come], but for now, we’ll simply say that a trait is a collection of methods a type can implement. Any type that implements the Write has a write_fmt method, which thewriteln! macro uses. A type that implements FromStr has a from_str associated function; we’ll use this method on u64 to parse our command-line arguments.

fn main() {

Our main function doesn’t return a value, so we can simply omit the -> and type that would normally follow the parameter list.

let mut numbers = Vec::new();

We declare a mutable local variable numbers, and initialize it to an empty vector. Vec is Rust’s growable vector type, analogous to C++’s std::vector. Our variable’s type is Vec<u64>, but as before, we don’t need to write it out: Rust will infer it for us. Since we intend to push numbers onto the end of this vector as we parse our command-line arguments, we use the mut keyword to make the vector mutable.

for arg in std::env::args().skip(1) {

Here we use a for loop to process our command-line arguments, setting the variable arg to each argument in turn, and evaluating the loop body.

The std::env::args function returns an iterator, a value that produces each argument on demand, and indicates when we’re done. Iterators are ubiquitous in Rust; the standard library includes other iterators that produce the elements of a vector, the lines of a file, messages received on a communications channel, and almost anything else that makes sense to loop over.

Beyond their use with for loops, iterators include a broad selection of methods you can use directly. For example, the first value produced by the iterator returned by std::env::args is always the name of the program being run. We want to skip that, so we call the iterator’s skip method to produce a new iterator that omits that first value.

numbers.push(u64::from_str(&arg)

.expect("error parsing argument"));

Here we call u64::from_str to attempt to parse our command-line argument arg as an unsigned 64-bit integer. Rather than a method we’re invoking on some u64 value we have at hand, u64::from_str is a function associated with the u64 type, akin to a static method in C++ or Java. The from_str function doesn’t return a u64 directly, but rather a Result value that indicates whether the parse succeeded or failed. Each Result is one of two variants:

§ a value written Ok(v), indicating that the parse succeeded and v is the value produced, or

§ a value written Err(e), indicating that the parse failed and e is an error value explaining why.

Almost every function in Rust’s standard library that might encounter an error returns a Result value, carrying values of appropriate types in its Ok and Err variants. Functions that perform input or output or otherwise interact with the operating system all return Result types whose Okvariants carry successful results—a count of bytes transferred, a file opened, and so on—and whose Err variants carry an error code from the system.

We check the success of our parse using Result’s expect method. If the result is some Err(e), expect prints a message that includes a description of e, and exits the program immediately. However, if the result is Ok(v), expect simply returns v itself, which we are finally able to push onto the end of our vector of numbers.

if numbers.len() == 0 {

writeln!(std::io::stderr(), "Usage: gcd NUMBER ...").unwrap();

std::process::exit(1);

}

There’s no greatest common divisor of an empty set of numbers, so we check that our vector has at least one element, and exit the program with an error if it doesn’t. We use the writeln! macro to write our error message to the standard error output stream, provided bystd::io::stderr().

let mut d = numbers[0];

for m in &numbers[1..] {

d = gcd(d, *m);

}

Taking the first number from the vector as our running value d, we compute the greatest common divisor of d and each following vector element, taking each result as the new value of d. As before, we must mark d as mutable, so that we can assign to it in the loop.

The expression &numbers[1..] is a slice of our vector: a reference to a range of elements, starting with the second element and extending to the end of the vector. Iterating over the vector directly, with for m in numbers { ... }, would have for loop take ownership of the vector, consuming it so that we couldn’t use it again later in the program; but if we iterate over the slice instead, the for loop merely borrows the vector’s elements.

The distinction between ownership and borrowing is key to Rust’s memory management and safe concurrency; we’ll discuss it in detail in Chapter 5. For now, however, note that since we are iterating over a reference to a range of numbers’s elements, each iteration sets our loop variable m to a reference to the current element, not the element’s value. The expression *m fetches the number to which m refers, giving us a u64 value we can pass to gcd.

println!("The greatest common divisor of {:?} is {}",

numbers, d);

Finally, we can print our results to our standard output stream. The println! macro takes a template string, substitutes formatted versions of the remaining arguments for the {...} forms as they appear in the template string, and writes the result to the standard output stream. The {:?}form requests that println! show us the “debugging” form of the corresponding argument. Most types in Rust can be printed this way; we use it here to print the vector of numbers. The {} form requests the “display” form of the argument; we use it here to print our final result, d.

Unlike C and C++, which require main to return zero if the program finished successfully, or a non-zero exit status if something went wrong, Rust assumes that if main returns at all, the program finished successfully. Only by explicitly calling functions like expect orstd::process::exit can we cause the program to terminate with an error status code.

The cargo run command allows us to pass arguments to our program, so we can try out our command-line handling:

$ cargo run 42 56

Compiling hello v0.1.0 (file:///home/jimb/rust/hello)

Running `/home/jimb/rust/hello/target/debug/hello 42 56`

The greatest common divisor of [42, 56] is 14

$ cargo run 799459 28823 27347

Running `/home/jimb/rust/hello/target/debug/hello 799459 28823 27347`

The greatest common divisor of [799459, 28823, 27347] is 41

$ cargo run 83

Running `/home/jimb/rust/hello/target/debug/hello 83`

The greatest common divisor of [83] is 83

$ cargo run

Running `/home/jimb/rust/hello/target/debug/hello`

Usage: gcd NUMBER ...

Process didn't exit successfully: `/home/jimb/rust/hello/target/debug/hello` (exit code: 1)

A simple web server

One of Rust’s strengths is the collection of library packages written by the Rust user community and freely available for anyone to use, many of which are published on the web site https://crates.io. The cargo command makes it easy to use a crates.io package from our own code: it will download the right version of the package, build it, and update it as requested. A Rust package, whether a library or an executable, is called a crate; cargo and crates.io both derive their names from this term.

To show how this works, we’ll put together a simple web server using the the iron web framework, the hyper HTTP server, and various other crates on which they depend. Our web site will prompt the user for two numbers, and compute their greatest common divisor:

Web page offering to compute GCD

First, we’ll have cargo create a new package for us, named iron-gcd:

$ cargo new --bin iron-gcd

$ cd iron-gcd

Then, we’ll edit our new project’s Cargo.toml file to list the packages we want to use; its contents should be as follows:

[package]

name = "iron-gcd"

version = "0.1.0"

authors = ["Jim Blandy <jimb@red-bean.com>"]

[dependencies]

iron = "0.2.2"

mime = "0.1.0"

router = "0.0.15"

urlencoded = "0.2.0"

Each line in the [dependencies] section of Cargo.toml gives the name of a crate on crates.io, and the version of that crate we would like to use. There may well be versions of these crates on cargo.io newer than those shown here, but by naming the specific versions we tested this code against, we can ensure the code will continue to compile even as new versions of the packages are published. We’ll discuss version management in more detail in [Link to Come].

Note that we need only name those packages we’ll use directly; cargo takes care of bringing in whatever other packages those need in turn.

For our first iteration, we’ll keep the web server simple: it will serve only the page that prompts the user for numbers to compute with. In iron-gcd/src/main.rs, we’ll place the following text:

extern crate iron;

#[macro_use] extern crate mime;

use iron::prelude::*;

use iron::status;

fn main() {

println!("Serving on http://localhost:3000...");

Iron::new(get_form).http("localhost:3000").unwrap();

}

#[allow(unused_variables)]

fn get_form(request: &mut Request) -> IronResult<Response> {

let mut response = Response::new();

response.set_mut(status::Ok);

response.set_mut(mime!(Text/Html; Charset=Utf8));

response.set_mut(r#"

<title>GCD Calculator</title>

<button type="submit">Compute GCD</button>

</form>

"#);

Ok(response)

}

We start with two extern crate directives, which make the iron and mime crates that we cited in our Cargo.toml file available to our program. The #[macro_use] attribute before the extern crate mime item alerts Rust that we plan to use macros exported by this crate.

Next, we have use declarations to bring in some of those crates’ bindings. The declaration use iron::prelude::* makes all the public bindings of the iron::prelude module directly visible in our own code. Generally, one should prefer to spell out the names one wishes to use, as we did for iron::status; but when a module is named prelude, that generally means that its exports are intended to provide the sort of general facilities that any user of the crate will probably need; here, a wildcard use directive makes a bit more sense.

Our main function is simple: it prints a message reminding us how to connect to our server, calls Iron::new to create a server, and then sets it listening on TCP port 3000 on the local machine. We pass the get_form function to Iron::new, indicating that the server should use that function to handle all requests; we’ll refine this shortly.

The get_form function itself takes a mutable reference, written &mut, to a Request value representing the HTTP request we’ve been called to handle. While this particular handler function never uses its request parameter, we’ll see one later that does. For the time being, we put the attribute #[allow(unused_variables)] atop the function to prevent Rust from printing a warning message about it.

In the body of the function, we build a Response value, set its HTTP status, indicate the media type of the content returned (using the handy mime! macro), and finally supply the actual text of the response. Since the response text is several lines long, we write it using the Rust “raw string” syntax: the letter ‘r', zero or more hash marks (that is, the '#' character), a double quote, and then the contents of the string, terminated by another double quote followed by the same number of hash marks. Any character may occur within a raw string, and no escape sequences are recognized; we can always ensure the string ends where we want it to by supplying enough hash marks.

Our function’s return type, IronResult<Response>, is another variant of the Result type we encountered earlier: here it is either Ok(r) for some successful Response value r, or Err(e) for some error value e. We construct our return value Ok(response) at the bottom of the function body, using the “last expression” syntax to implicitly establish the value of the entire body.

Having written main.rs, we can use the cargo run command to do everything needed to set it running: fetching the needed crates, compiling them, building our own program, linking everything together and starting it up:

$ cargo run

Updating registry `https://github.com/rust-lang/crates.io-index`

Downloading kernel32-sys v0.1.4

Downloading winapi-build v0.1.1

Downloading libressl-pnacl-sys v2.1.6

Downloading mime v0.1.0

Downloading pnacl-build-helper v1.4.10

Downloading plugin v0.2.6

....

Compiling hyper v0.6.14

Compiling iron v0.2.2

Compiling router v0.0.15

Compiling persistent v0.0.7

Compiling bodyparser v0.0.6

Compiling urlencoded v0.2.0

Compiling iron-gcd v0.1.0 (file:///home/jimb/iron-gcd)

Running `target/debug/iron-gcd`

Serving on http://localhost:3000...

At this point, we can visit the given URL in our browser and see the page shown earlier.

Unfortunately, clicking on “Compute GCD” doesn’t do anything, other than navigate our browser to the URL http://localhost:3000/gcd, which then shows the same page; every URL on our server does. Let’s fix that next, using the Router type to associate different handlers with different paths.

First, let’s arrange to be able to use Router without qualification, by adding the following declarations to iron-gcd/src/main.rs:

extern crate router;

use router::Router;

Rust programmers typically gather all their extern crate and use declarations together towards the top of the file, but this isn’t strictly necessary: Rust allows declarations to occur in any order, as long as they appear at the appropriate level of nesting. (Macro definitions and imports are an important exception to this rule; they must appear before they are are used.)

We can then modify our main function to read as follows:

fn main() {

let mut router = Router::new();

router.get("/", get_form);

router.post("/gcd", post_gcd);

println!("Serving on http://localhost:3000...");

Iron::new(router).http("localhost:3000").unwrap();

}

We create a Router, establish handler functions for two specific paths, and then pass this Router as the request handler to Iron::new, yielding a web server that consults the URL path to decide which handler function to call.

Now we are ready to write our post_gcd function:

extern crate urlencoded;

use std::str::FromStr;

use urlencoded::UrlEncodedBody;

fn post_gcd(request: &mut Request) -> IronResult<Response> {

let mut response = Response::new();

let hashmap;

match request.get_ref::<UrlEncodedBody>() {

Err(e) => {

response.set_mut(status::BadRequest);

response.set_mut(format!("Error parsing form data: {:?}\n", e));

return Ok(response);

}

Ok(map) => { hashmap = map; }

}

let unparsed_numbers;

match hashmap.get("n") {

None => {

response.set_mut(status::BadRequest);

response.set_mut(format!("form data has no 'n' parameter\n"));

return Ok(response);

}

Some(nums) => { unparsed_numbers = nums; }

}

let mut numbers = Vec::new();

for unparsed in unparsed_numbers {

match u64::from_str(&unparsed) {

Err(_) => {

response.set_mut(status::BadRequest);

response.set_mut(format!("Value for 'n' parameter not a number: {:?}\n", unparsed));

return Ok(response);

}

Ok(n) => { numbers.push(n); }

}

let mut d = numbers[0];

for m in &numbers[1..] {

d = gcd(d, *m);

}

response.set_mut(status::Ok);

response.set_mut(mime!(Text/Html; Charset=Utf8));

response.set_mut(format!("The greatest common divisor of the numbers {:?} is <b>{}</b>\n",

numbers, d));

Ok(response)

}

The bulk of this function is a series of match statements, which will be unfamiliar to C, C++, Java, and JavaScript programmers, but a welcome sight to Haskell and OCaml developers. We’ve mentioned that a Result is either a value Ok(s) for some success value s, or Err(e) for some error value e. Given some Result res, we can check which variant it is and access whichever value it holds with a match statement of the form:

match res {

Ok(success) => { ... },

Err(error) => { ... }

}

This is a conditional structure, like an if statement or a switch statement: if res is Ok(v), then we run the first branch, with v assigned to the variable success, which is local to its branch. Similarly, if res is Err(e), we assign e to error, and run the second branch. The beauty of amatch statement is that the programmer can only access the value of a Result by first checking which variant it is; one can never misinterpret a failure value as a successful completion.

Rust allows you to define your own types with value-carrying variants, and use match statements to analyze them: Rust calls such types “enumerations”, but these are generally known as “algebraic data types”.

Now that we can read match statements, the structure of post_gcd should be clear:

§ We retrieve a table mapping query parameter names to arrays of values, by calling request.get_ref::<UrlEncodedBody>(), checking for an error and sending an appropriate response back to the client.

§ Within that table, we find the value of the parameter named "n", as is where our form places the numbers entered into the web page. This value will be not a single string but a vector of strings, as query parameter names can be repeated.

§ We walk the vector of strings, parsing each one as an unsigned 64-bit number, and returning an appropriate failure page if any of the strings fali to parse.

§ Finally, we compute the numbers’ greatest common divisor as before, and construct a response describing our results. The format! macro uses the same kind of string template as the writeln! and println! macros, but returns a string value, rather than writing the text to a stream.

The last remaining piece is the gcd function we wrote earlier. With that in place, we can interrupt any servers we might have left running, and re-build and re-start our program:

$ cargo run

Compiling iron-gcd v0.1.0 (file:///home/jimb/iron-gcd)

Running `target/debug/iron-gcd`

Serving on http://localhost:3000...

This time, by visiting http://localhost:3000, entering some numbers, and clicking the “Compute GCD” button, we’ll actually see some results:

Web page showing results of computing GCD

Concurrency

One of Rust’s great strengths is its support for concurrent programming. The same rules that ensure Rust programs are free of memory errors also ensure threads can only share memory in ways that avoid data races. For example:

§ If you use a mutex to coordinate threads making changes to a shared data structure, Rust ensures that you have always locked the mutex before you access the data. In C and C++, the relationship between a mutex and the data it protects is left to the comments.

§ If you want to share read-only data among several threads, Rust ensures that you cannot modify the data accidentally. In C and C++, the type system can help with this, but it’s easy to get it wrong.

§ If you transfer ownership of a data structure from one thread to another, Rust makes sure you have indeed relinquished all access to it. In C and C++, it’s up to you to check that nothing on the sending thread will ever touch the data again.

In this section we’ll walk you through the process of writing your second multi-threaded program.

You may not have noticed it, but you’ve already written your first: the Iron web framework you used to implement the Greatest Common Divisor server uses a pool of threads to run request handler functions. If the server receives simultaneous requests, it may run the get_form andpost_gcd functions in several threads at once. Because those particular functions are so simple, this is obviously safe; but no matter how elaborate the handler functions become, Rust guarantees that any and all data they share is managed in a thread-safe way. This allows Iron to exploit concurrency without worrying that naive handler functions may be unprepared for the consequences: all Rust functions are ready.

This section’s program plots a section of the Mandelbrot set, a fractal produced by iterating a simple function on complex numbers. Plotting the Mandelbrot set is often called an “embarrassingly parallel” algorithm, because the pattern of communication between the threads is so simple; we’ll cover more complex patterns in “Concurrency”, but this task demonstrates some of the essentials.

However, before we can talk about the actual concurrency, we need to describe the supporting code around it.

Parsing pair command-line arguments

This program needs to take several command-line arguments controlling the resolution of the bitmap we’ll write, and the portion of the complex plane that bitmap shows. Since these command-line arguments all follow a common form, here’s a function to parse them:

use std::str::FromStr;

/// Parse the string `s` as a coordinate pair, like `"400x600"` or `"1.0,0.5"`.

///

/// Specifically, `s` should have the form <left><sep><right>, where <sep> is

/// the character given by the `separator` argument, and <left> and <right> are both

/// strings that can be parsed by `T::from_str`.

///

/// If `s` has the proper form, return `Some<(x, y)>`. If it doesn't parse

/// correctly, return `None`.

fn parse_pair<T: FromStr>(s: &str, separator: char) -> Option<(T, T)> {

match s.find(separator) {

None => None,

Some(index) => {

match (T::from_str(&s[..index]), T::from_str(&s[index + 1..])) {

(Ok(l), Ok(r)) => Some((l, r)),

_ => None

}

#[test]

fn test_parse_pair() {

assert_eq!(parse_pair::<i32>("", ','), None);

assert_eq!(parse_pair::<i32>("10,", ','), None);

assert_eq!(parse_pair::<i32>(",10", ','), None);

assert_eq!(parse_pair::<i32>("10,20", ','), Some((10, 20)));

assert_eq!(parse_pair::<i32>("10,20xy", ','), None);

assert_eq!(parse_pair::<f64>("0.5x", 'x'), None);

assert_eq!(parse_pair::<f64>("0.5x1.5", 'x'), Some((0.5, 1.5)));

}

Here is the first example we’ve seen of a generic function:

fn parse_pair<T: FromStr>(s: &str, separator: char) -> Option<(T, T)> {

Immediately following the function’s name, we have the clause <T: FromStr>, which you can read aloud as, “For any type T that implements the FromStr trait...”. This effectively lets us define an entire family of functions at once: parse_pair::<u32> is a function that parses pairs ofi32 values; parse_pair::<f64> parses pairs of floating-point values; and so on. This is very much like a function template in C++. A Rust programmer would call T a “type parameter” of parse_pair. Often Rust will be able to infer type parameters for you, and you won’t need to write them out as we’ve done here.

Our return type is Option<(T, T)>: either None, or a value Some((v1, v2)), where (v1, v2) is a tuple of two values of type T. The parse_pair function doesn’t use an explicit return statement, so its return value is the value of the last (and the only) expression in its body:

match s.find(separator) {

None => None,

Some(index) => {

...

}

The String type’s find method searches the string for a character matching separator. If find returns None, meaning that the separator character doesn’t occur in the string, the entire match expression evaluates to None, indicating that the parse failed. Otherwise, we take index to be the separator’s position in the string.

match (T::from_str(&s[..index]), T::from_str(&s[index + 1..])) {

(Ok(l), Ok(r)) => Some((l, r)),

_ => None

}

This begins to show off the power of the match expression. The argument to the match is this tuple expression:

(T::from_str(&s[..index]), T::from_str(&s[index + 1..]))

The expressions &s[..index] and &s[index + 1..] are slices of the string, preceding and following the separator. The type parameter T’s associated from_str function takes each of these and tries to parse them as a value of type T, producing a tuple of results. This is what we match against.

(Ok(l), Ok(r)) => Some((l, r)),

Our pattern here only matches if both elements of the tuple are Ok variants of the Result type, indicating that both parses succeeded. If so, Some((l, r)) is the value of the match expression, and hence the return value of the function.

_ => None

The wildcard pattern _ matches anything, and ignores its value. If we reach this point, then parse_pair has failed, so we evaluate to None, again providing the return value of the function.

Mapping from pixels to complex numbers

The program needs to work in two related coordinate spaces: each pixel in the output bitmap corresponds to a number on the complex plane. The relationship between these two spaces is determined by command-line arguments. The following function converts from “bitmap space” to “complex number space”:

/// Return the point on the complex plane corresponding to a given pixel in the

/// bitmap.

///

/// `bounds` is a pair giving the width and height of the bitmap. `pixel` is a

/// pair indicating a particular pixel in that bitmap. The `upper_left` and

/// `lower_right` parameters are points on the complex plane designating the

/// area our bitmap covers.

fn pixel_to_point(bounds: (usize, usize),

pixel: (usize, usize),

upper_left: (f64, f64),

lower_right: (f64, f64))

-> (f64, f64)

{

// It might be nicer to find the position of the *middle* of the pixel,

// instead of its upper left corner, but this is easier to write tests for.

let (width, height) = (lower_right.0 - upper_left.0,

upper_left.1 - lower_right.1);

(upper_left.0 + pixel.0 as f64 * width / bounds.0 as f64,

upper_left.1 - pixel.1 as f64 * height / bounds.1 as f64)

}

#[test]

fn test_pixel_to_point() {

assert_eq!(pixel_to_point((100, 100), (25, 75),

(-1.0, 1.0), (1.0, -1.0)),

(-0.5, -0.5));

}

This is simply calculation, so we won’t explain it in detail. However, there are a few things to point out.

lower_right.0

Expressions with this form refer to tuple elements; this refers to the first element of the tuple lower_right.

pixel.0 as f64

This is Rust’s syntax for a type conversion: this expression converts the first element of the tuple pixel to an f64 value. Unlike C and C++, Rust generally refuses to convert between numeric types implicitly; you must write out the conversions you need. This can be tedious, but making explicit which conversions occur when is suprisingly helpful. Implicit integer conversions are a frequent source of security holes in real-world C and C++ code.

Mandelbrot membership calculation

Here is the heart of the program:

extern crate num;

use num::Complex;

/// Try to determine whether the complex number `c` is in the Mandelbrot set.

///

/// A number `c` is in the set if, starting with zero, repeatedly squaring and

/// adding `c` never causes the number to leave the circle of radius 2 centered

/// on the origin; the number instead orbits near the origin forever. (If the

/// number does leave the circle, it eventually flies away to infinity.)

///

/// If after `limit` iterations our number has still not left the circle, return

/// `None`; this is as close as we come to knowing that `c` is in the set.

///

/// If the number does leave the circle before we give up, return `Some(i)`, where

/// `i` is the number of iterations it took.

fn escapes(c: Complex<f64>, limit: u32) -> Option<u32> {

let mut z = Complex { re: 0.0, im: 0.0 };

for i in 0..limit {

z = z*z + c;

if z.norm_sqr() > 4.0 {

return Some(i);

}

return None;

}

The num crate holds various handy extensions to Rust’s standard numeric system, including arbitrary-precision integers, rational numbers, and complex numbers. However, all this program needs is the Complex type, which the num crate defines as follows:

struct Complex<T> {

pub re: T,

pub im: T

}

In other words, for some component type T, a Complex<T> has two fields of type T, representing the real and imaginary components of a complex number. These fields are pub, meaning that code outside the module defining Complex can access them directly. Since this code usesComplex<f64>, re and im are 64-bit floating point values here.

fn escapes(c: Complex<f64>, limit: u32) -> Option<u32> {

This function takes a number c to test for membership, and a limit on how many times it will iterate before giving up. The return type, Option<u32> is interesting. Similar to the Result type we discussed earlier, an Option<u32> value takes one of two forms: Some(v), where v is some value of type u32; or None, which carries no value. The u32 type is an unsigned 32-bit integer. So this function either returns Some(i), if the number c is not in the Mandelbrot set, or None if it is.

let mut z = Complex { re: 0.0, im: 0.0 };

The expression Complex { re: 0.0, im: 0.0 } constructs a Complex<f64> value, a complex zero, by providing a value for each of its fields. This serves as the initial value for a new local variable z.

for i in 0..limit {

The earlier examples showed for loops iterating over command-line arguments and vector elements; this for loop simply iterates over the range of integers starting with 0 and up to (but not including) limit.

z = z*z + c;

This expression applies one iteration to our current number. The num crate arranges to overload Rust’s arithmetic operations, so that you can use them directly on Complex values. We’ll explain how you can do this with your own types in [Link to Come].

if z.norm_sqr() > 4.0 {

return Some(i);

}

If the number leaves the circle of radius two centered on the origin, the function returns the iteration count, wrapped up in Option’s Some variant.

return None;

If we reach this point, the point seems to be orbiting near the origin, so we return Option’s None variant.

To plot the Mandelbrot set, we simply apply escapes to every point in the bitmap:

/// Render a rectangle of the Mandelbrot set into a buffer of pixels.

///

/// The `bounds` argument gives the width and height of the buffer `pixels`,

/// which holds one grayscale pixel per byte. The `upper_left` and `lower_right`

/// arguments specify points on the complex plane corresponding to the upper

/// left and lower right corners of the pixel buffer.

fn render(pixels: &mut [u8],

bounds: (usize, usize),

upper_left: (f64, f64),

lower_right: (f64, f64))

{

assert!(pixels.len() == bounds.0 * bounds.1);

for r in 0 .. bounds.1 {

for c in 0 .. bounds.0 {

let point = pixel_to_point(bounds, (c, r),

upper_left, lower_right);

pixels[r * bounds.0 + c] =

match escapes(Complex { re: point.0, im: point.1 }, 255) {

None => 0,

Some(count) => 255 - count as u8

};

}

This should all look pretty familiar at this point.

Complex { re: point.0, im: point.1 }

This is another use of the syntax for constructing a value of type Complex, in this case producing a complex number from a plain (f64, f64) tuple.

pixels[r * bounds.0 + c] =

match escapes(Complex { re: point.0, im: point.1 }, 255) {

None => 0,

Some(count) => 255 - count as u8

};

If escapes says that c belongs to the set, render colors the corresponding pixel black (0). Otherwise, render assigns numbers that took longer to escape the circle darker colors.

Writing bitmap files

The image crate provides functions for reading and writing a wide variety of image formats, along with some basic image manipulation functions. In particular, it includes an encoder for the PNG image file format, which this program uses to save the final results of the calculation:

extern crate image;

use image::ColorType;

use image::png::PNGEncoder;

use std::fs::File;

use std::io::{Result, Write};

/// Write the buffer `pixels`, whose dimensions are given by `bounds`, to the

/// file named `filename`.

fn write_bitmap(filename: &str, pixels: &[u8], bounds: (usize, usize))

-> Result<()>

{

let output = try!(File::create(filename));

let encoder = PNGEncoder::new(output);

try!(encoder.encode(&pixels[..],

bounds.0 as u32, bounds.1 as u32,

ColorType::Gray(8)));

Ok(())

}

As is the convention for fallible operations in Rust, this function returns a Result value. Since we have no interesting value to return on success, the exact return type is Result<()>, where () is the “unit” type, resembling void in C and C++.

Since the File::create function and PNGEncoder::encoder method also return Result values, we can use the try! macro to check for errors conveniently. An expression try!(E) requires E to evaluate to some Result value; if that value is Ok(v), then the try! expression evaluates to v: the value of try!(E) is the success value of E. However, if E evaluates to some Err(e), then try! returns Err(e) directly from the function in which it occurs.

NOTE

It’s a common beginner’s mistake to attempt to use try! in the main function. However, since main has no return value, this won’t work; you should use Result’s expect method instead. The try! macro is only useful for checking for errors reported by an expression of type Result, from within functions that themselves return Result.

A concurrent Mandelbrot program

Finally, all the pieces are in place, and we can show you the main function, where we can put concurrency to work for us. First, a non-concurrent version for simplicity:

use std::io::Write;

fn main() {

let args: Vec<String> = std::env::args().collect();

if args.len() != 5 {

writeln!(std::io::stderr(),

"Usage: mandelbrot FILE PIXELS UPPERLEFT LOWERRIGHT")

.unwrap();

writeln!(std::io::stderr(),

"Example: {} mandel.png 1000x750 -1.20,0.35 -1,0.20",

args[0])

.unwrap();

std::process::exit(1);

}

let bounds = parse_pair(&args[2], 'x')

.expect("error parsing image dimensions");

let upper_left = parse_pair(&args[3], ',')

.expect("error parsing upper left corner point");

let lower_right = parse_pair(&args[4], ',')

.expect("error parsing lower right corner point");

let mut pixels = vec![0; bounds.0 * bounds.1];

render(&mut pixels[..], bounds, upper_left, lower_right);

write_bitmap(&args[1], &pixels[..], bounds)

.expect("error writing PNG file");

}

After collecting the command-line arguments into a vector of Strings, we parse each one and then begin calculations.

let mut pixels = vec![0; bounds.0 * bounds.1];

This creates a buffer of one-byte grayscale pixel values, whose size is given by bounds, parsed from the command line. Rust doesn’t permit programs to ever read uninitialized values, so this fills the buffer with zeros.

render(&mut pixels[..], bounds, upper_left, lower_right);

Passing the buffer as a mutable slice, we make a single call to the render function, which computes the appropriate color for each pixel in the buffer, given the buffer’s bounds, and the rectangle of the complex plane we’ve selected.

write_bitmap(&args[1], &pixels[..], bounds)

.expect("error writing PNG file");

Finally, we write the pixel buffer out to disk as a PNG file. In this case we pass the buffer as a shareable (non-mutable) slice, since write_bitmap should have no need to modify the buffer’s contents.

The natural way to distribute this calculation across multiple processors is to divide the image up into sections, one per processor, and let each procesor color the pixels assigned to it. Only when all processors have finished should we write out the pixels to disk.

The crossbeam crate provides a number of valuable concurrency facilities, including a scoped thread facility which does exactly what we need here. To use it, we must add the following line at the top of the file:

extern crate crossbeam;

Then we need to take out the single line calling render, and replace it with the following:

let threads = 8;

let band_rows = bounds.1 / threads + 1;

{

let bands: Vec<_> = pixels.chunks_mut(band_rows * bounds.0).collect();

crossbeam::scope(|scope| {

for (i, band) in bands.into_iter().enumerate() {

let top = band_rows * i;

let height = band.len() / bounds.0;

let band_bounds = (bounds.0, height);

let band_upper_left = pixel_to_point(bounds, (0, top),

upper_left, lower_right);

let band_lower_right = pixel_to_point(bounds, (bounds.0, top + height),

upper_left, lower_right);

scope.spawn(move || {

render(band, band_bounds, band_upper_left, band_lower_right);

});

}

});

}

Breaking this down in the usual way:

let threads = 8;

let band_rows = bounds.1 / threads + 1;

Here we decide to use eight threads. Then we compute how many rows of pixels each band should have. Since the height of a band is band_rows and the overall width of the image is bounds.0, the area of a band, in pixels, is band_rows * bounds.0. We round the row count upwards, to make sure the bands cover the entire bitmap even if the height isn’t a multiple of threads.

let bands: Vec<_> = pixels.chunks_mut(band_rows * bounds.0).collect();

Here we divide the pixel buffer into bands. The buffer’s chunks_mut method returns an iterator producing mutable, non-overlapping slices of the buffer, each of which encloses band_rows * bounds.0 pixels—in other words, band_rows complete rows of pixels. The last slice thatchunks_mut produces may be shorter, but since all the other slices enclosed complete rows from the buffer, the last slice will too. Finally, the iterator’s collect method builds a vector holding these mutable, non-overlapping slices.

Now we can put the crossbeam library to work:

crossbeam::scope(|scope| { ... });

The expression |scope| { ... } is a Rust closure expression. A closure is a value that can be called as if it were a function; here, |scope| is the argument list, and { ... } is the body of the function. Note that, unlike functions declared with fn, we don’t need to declare the types of a closure’s arguments; Rust will infer them, along with its return type.

In this case, crossbeam::scope takes the closure and applies it to a Scope object, representing the lifetime of the group of threads we’ll create to render our horizontal bands.

for (i, band) in bands.into_iter().enumerate() {

Here we iterate over the buffer’s bands. By using the into_iter() iterator, we ensure that each iteration of the loop body takes ownership of its band; and the enumerate adapter attaches an index i to each value produced.

let top = band_rows * i;

let height = band.len() / bounds.0;

let band_bounds = (bounds.0, height);

let band_upper_left = pixel_to_point(bounds, (0, top),

upper_left, lower_right);

let band_lower_right = pixel_to_point(bounds, (bounds.0, top + height),

upper_left, lower_right);

Given the index and the actual size of the band (recall that the last one might be shorter than the others), we can produce a bounding box of the sort render requires, but one that refers only to this band of the buffer, not the entire bitmap. Similarly, we repurpose the renderer’spixel_to_point function to find where the band’s upper left and lower right corners fall on the complex plane.

scope.spawn(move || {

render(band, band_bounds, band_upper_left, band_lower_right);

});

Finally, we create a thread, running the closure move || { ... }. This syntax is a bit strange to read: it denotes a closure of no arguments whose body is the { ... } form. The move keyword at the front indicates that this closure takes ownership of the variables it uses; in particular, only the closure may use the mutable slice band.

The crossbeam::scope call ensures that all threads have completed before it returns, meaning that it is safe to save the bitmap to a file, which is our next action.

Running the Mandelbrot plotter

We’ve used several external crates in this program: num for complex number arithmetic; image for writing PNG files; and crossbeam for the scoped thread creation primitives. Here’s the Cargo.toml file that describes those dependencies:

[package]

name = "mandelbrot"

version = "0.1.0"

authors = ["Jim Blandy <jimb@red-bean.com>"]

[dependencies]

crossbeam = "0.1.5"

num = "0.1.27"

image = "0.3.14"

With that in place, we can build and run the program:

$ cargo build --release

Compiling rustc-serialize v0.3.16

...

Compiling png v0.3.1

Compiling image v0.3.14

Compiling mandelbrot v0.1.0 (file:///home/jimb/rust/mandelbrot)

$ time target/release/mandelbrot mandel.png 4000x3000 -1.20,0.35 -1,0.20

real 0m2.525s

user 0m6.254s

sys 0m0.013s

Here, we’ve used the Unix time program to see how long the program took to run; note that even though we spent more than six seconds of processor time computing the image, the elapsed real time was only two and a half seconds. You can verify that a substantial portion of that real time is spent writing the image file by commenting out the code that does so; on the laptop where this code was tested, the concurrent version reduces the Mandelbrot calculation time proper by a factor of almost four.

This command should create a file called mandel.png, which you can view with your system’s image viewing program, or visit in a web browser.

Safety is invisible

In the end, the program we’ve written here is not substantially different from what we might write in any other language: we apportion pieces of the pixel buffer out amongst the processors; let each one work on its piece separately; and when they’ve all finished, present the result. So then, what is so special about Rust’s concurrency support?

What we haven’t shown here is all the Rust programs we cannot write. The code above partitions the buffer amongst the threads correctly, but there are many small variations on that code that do not, and introduce data races; not one of those variations will pass the Rust compiler’s static checks. A C or C++ compiler will cheerfully help you explore the vast space of programs with subtle data races; Rust tells you, up front, when something could go wrong.

In Chapter 5 and “Concurrency”, we’ll describe Rust’s rules for memory safety, and explain how these rules also ensure proper concurrency hygiene.