Array helpers - D3 on AngularJS: Create Dynamic Visualizations with AngularJS (2014)


Array helpers

D3 is packed with a bunch of helper functions that allow us to massage our data into more meaningful data structures. This process, often called “cleaning” our data, is a common task for data visualizers.

We’ll discuss this process in-depth in chapter 7.

JavaScript itself includes many methods to help us modify our datasets in a meaningful way. We can reverse the order of the elements in our array, modify the contents of our arrays, append two together, filter through elements, etc.

The libraries lodash and underscore also provide convenient methods for handling many different cases for which we can modify our data.

We’ll walk through common methods for how we can manipulate our data in this chapter.

Ranges

We’ll often use a range in our data visualization tools. A range is an array containing an arithmetic progression of numbers. These numbers can be integers or floating-point numbers. This is particularly useful when we’re creating axes where we want to show a range of numbers in a graph, for instance.

We’ll use the method d3.range([start,] stop[, step]), which takes up to three arguments to generate arrays of incrementing/decrementing sequences:

start

The start argument is optional and defaults to 0. This is where the range array will start.

stop

The stop argument is required and will determine where the range will stop. Note that the resulting array will exclude the stop number.

step

The step determines how much we’ll increment or decrement our counter by. If it’s positive, then it will increase the next value by the step amount. If it’s negative, then it will decrease the next value by the step amount.

If it’s omitted, then step defaults to 1.

This enables us to generate arrays, such as the following:

d3.range(0, 5); // => [0, 1, 2, 3, 4]
d3.range(0, 6, 2); // => [0, 2, 4]
d3.range(5); // => [0, 1, 2, 3, 4]
d3.range(3, 1, -0.5); // => [3, 2.5, 2, 1.5]
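To make the argument handling concrete, here is a minimal plain-JavaScript sketch of how such a range function could behave. The range function below is a hypothetical stand-in written for illustration, not D3's actual implementation, and it makes no attempt to guard against floating-point rounding for fractional steps.

```javascript
// Hypothetical stand-in for d3.range(), for illustration only.
function range(start, stop, step) {
  // With a single argument, treat it as stop and start from 0.
  if (arguments.length < 2) { stop = start; start = 0; }
  // step defaults to 1 when omitted.
  if (arguments.length < 3) { step = 1; }
  if (step === 0) { throw new Error('step must not be zero'); }

  // Number of values generated before reaching (and excluding) stop.
  var count = Math.max(0, Math.ceil((stop - start) / step));
  var result = [];
  for (var i = 0; i < count; i++) {
    result.push(start + i * step);
  }
  return result;
}

console.log(range(0, 6, 2)); // [0, 2, 4]
```

Computing each value as start + i * step, rather than repeatedly adding step, keeps rounding errors from accumulating across the array.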

Permutations

We can generate permutations of our data based upon their position in an array. This is useful for laying out our data in a table, for instance.

The d3.permute(array, indexes) method takes two arguments:

array

The array is the original array that we’ll pull our data from.

indexes

The indexes are a list of integers that represent the positions within the array that we’re interested in. An index that falls outside of the array yields undefined.

d3.permute(['a', 'b', 'c'], [2, 1, 0]); // => ['c', 'b', 'a']
d3.permute(['a', 'b', 'c'], [0, 0, 1]); // => ['a', 'a', 'b']
d3.permute(['a', 'b', 'c'], [2, 1, 4]); // => ['c', 'b', undefined]
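The behavior shown above can be approximated in plain JavaScript. The permute function below is a hypothetical sketch for illustration, not D3's own code; it simply looks up each requested index in turn.

```javascript
// Hypothetical stand-in for d3.permute(), for illustration only.
function permute(array, indexes) {
  // For every requested index, pick the element at that position.
  // Positions outside the array produce undefined.
  return indexes.map(function(i) {
    return array[i];
  });
}

var row = permute(['a', 'b', 'c'], [2, 1, 0]);
console.log(row); // ['c', 'b', 'a']
```

Because the indexes may repeat or skip positions, the result is not necessarily a permutation in the strict mathematical sense; it is a selection of elements in the order we ask for them.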

Accessing and manipulating simple arrays

array.forEach()

array.forEach() is not actually a feature of D3 but a built-in JavaScript array method that developers commonly underuse.

The array.forEach() method is useful for iterating over an array of items without worrying about for-loops or needing to keep track of the current index.

It accepts two arguments:

callback

The callback parameter is a function that will be called for every single element in the array. This function will be called with the arguments:

element

The element is the element value of the array.

index

The index is the integer location inside of the array.

originalArray

The originalArray points to the original array being traversed.

thisArg

The thisArg is an optional argument that sets the value of this inside the callback function.

If we pass thisArg to the array.forEach() method, the callback is still called with the same three arguments, function(element, index, originalArray), but this inside the callback refers to thisArg.
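As a small illustration of the second argument to forEach(), the sketch below passes a plain object as thisArg. The tally object and its total property are made-up names for this example.

```javascript
// A plain object that will serve as `this` inside the callback.
var tally = { total: 0 };

[1, 28, 9, 4].forEach(function(value, index, originalArray) {
  // `this` refers to tally because tally was passed as thisArg.
  this.total += value;
}, tally);

console.log(tally.total); // 42
```

The callback still receives the element, the index, and the original array; thisArg only changes what this refers to.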

Looping without forEach()


var notes = ['Eb', 'Bb', 'F', 'C'],
    note, index;
for (index = 0; index < notes.length; index++) {
  note = notes[index];
  console.log('beat', index, 'note: ', note);
}


And with forEach()


['Eb', 'Bb', 'F', 'C']
  .forEach(function(note, index) {
    console.log('beat', index, 'note: ', note);
  });


tip

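The built-in array.forEach() ignores the value returned from the callback, so returning false will not stop the iteration. If we need to stop early, we can use a regular for loop with break, or array.some(), which halts as soon as the callback returns true.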
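For reference, the built-in Array.prototype.forEach() visits every element regardless of what the callback returns. The following sketch contrasts it with array.some(), which does stop early; the visited and seen arrays are only there to record which elements were reached.

```javascript
var notes = ['Eb', 'Bb', 'F', 'C'];

// forEach() ignores the return value, so every element is visited.
var visited = [];
notes.forEach(function(note) {
  visited.push(note);
  return false;
});

// some() stops as soon as the callback returns true.
var seen = [];
notes.some(function(note) {
  seen.push(note);
  return note === 'Bb';
});

console.log(visited.length); // 4
console.log(seen);           // ['Eb', 'Bb']
```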

array.map()

The array.map function is not a feature of D3, but an often overlooked feature of JavaScript.

Similar to the array.forEach() function, we can use it to iterate over items in our array but it has the added feature of allowing us to easily create new arrays by returning the item we’d like to add at our current index.

The new array returned from map will always be the same length as our original array.

A common task in data visualization is to convert data in one format to data in another.

For instance, say we had an array of arrays of values that represent x,y positions at successive time steps, like so: [[1, 2], [3, 4], ...], but some other code expects instead to receive an array of objects of the form [{x: 1, y: 2}, {x: 3, y: 4}, ...]. We can concisely achieve this using array.map():

// our original positions
var positions = [
  [1, 2],
  [3, 4],
  [5, 6],
  [7, 8],
  [9, 10]
];

// converting data without map()
var newPositions = [];
for (var i = 0; i < positions.length; i++) {
  newPositions.push({
    x: positions[i][0],
    y: positions[i][1]
  });
}

// and with map(): pos is the current [x, y] pair
var newPositions = positions.map(function(pos) {
  return { x: pos[0], y: pos[1] };
});

array.filter()

The array.filter function is also not strictly a feature of D3 but a method of the array object in modern web browsers. It’s useful for creating a new array given an existing array and some truth test that we provide. This truth test will be run for each element in the array. If the truth test returns true, that element will be added to the resulting array. Otherwise, it will be skipped and not added to the new resulting array.

Let’s walk through a simple concrete example. Say I had an array of numbers, var nums = [52, 76, 33, 32, 99]. To create a new array of just even numbers, we could write the following code without the filter() array method.

var evenNums = [];
nums.forEach(function(num) {
  if (num % 2 === 0) {
    evenNums.push(num);
  }
});

Or you could use array.filter to do the same thing in one line.

var evenNums = nums.filter(function(num) { return num % 2 === 0; });
evenNums; // [52, 76, 32]

In this case, num % 2 === 0 is the criterion that determines if the current element should be in the returned array. Think of the code as reading “if the current element % 2 is zero, include it in the resulting array.”

array.sort()

The array.sort() method is another useful built-in array method. As you might expect, it sorts the elements in an array. It does so in place: rather than creating a new array, it reorders the original array and returns a reference to it.

var collection = ['Victor', 'Ari', 'Ian', 'Lewis'];
var sorted = collection.sort();
sorted; // => [ 'Ari', 'Ian', 'Lewis', 'Victor' ]
collection; // => [ 'Ari', 'Ian', 'Lewis', 'Victor' ]

It’s important to know that the array.sort() method sorts elements lexicographically and not numerically by default. This means it sorts the elements as if they were strings. (This quirk in JavaScript, among all others, probably causes newcomers to the language the most grief and frustration.)

var nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
nums.sort(); // => [ 1, 10, 2, 3, 4, 5, 6, 7, 8, 9]

So sort() will convert each element to a string before sorting. To correct this, we can instead pass sort() a function that will be used to compare the elements. This comparator function will get called with two arguments, a and b, which are the two elements currently being compared. We’ll then return a positive number, 0, or a negative number (typically 1, 0, or -1) depending on whether we decide a > b, a === b, or a < b, respectively. To fix our number sorting example above, we can write:

var nums = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10];
nums.sort(function(a, b) { return a - b; }); // note: a - b
nums; // => [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
nums.sort(function(a, b) { return b - a; }); // note: b - a
nums; // => [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]

This ability to pass sort() a comparator function also helps when sorting objects. In this example, say we had an array of people objects that each had a name property and we wanted to sort them by name.

var people = [{name: 'Victor'}, {name: 'Ari'}, {name: 'Lewis'}];
people.sort(function(a, b) {
  if (a.name > b.name) { return 1; }
  if (a.name < b.name) { return -1; }
  return 0;
});
people; // => [{name: 'Ari'}, {name: 'Lewis'}, {name: 'Victor'}]

d3.merge(arrays)

We can merge any number of arrays into a single array. This is similar to the built-in array concat method.

d3.merge([[1, 2], [3, 4], [5, 6, 7]]);
// [1, 2, 3, 4, 5, 6, 7]
d3.merge([[1, 2], [3, [4, 5]]]);
// [1, 2, 3, [4, 5]]

d3.zip(arrays)

The d3.zip() method returns an array of arrays where the ith array contains the ith element from each of the argument arrays. The returned array is truncated to the length of the shortest argument array. If we pass only a single array, then the returned array contains one-element arrays. If we don’t pass any arrays, then the returned array is an empty array.

d3.zip([1, 2], [3, 4]);
// [[1, 3], [2, 4]]
d3.zip([1, 2], [3, 4, 5]);
// [[1, 3], [2, 4]]
d3.zip([1, 2]);
// [[1], [2]]
d3.zip([1, 2], [3, 4], [5, 6]);
// [[1, 3, 5], [2, 4, 6]]

d3.transpose()

We can transpose a matrix using d3.transpose(), which is functionally equivalent to d3.zip.apply(null, matrix). It turns out that transposition of matrices is incredibly useful for quick calculations and matrix math.

The d3.transpose(matrix) method takes a single argument: a matrix, defined as an array of arrays.

d3.transpose([[1, 2, 3], [3, 2, 1]]);
// [[1, 3], [2, 2], [3, 1]]
d3.transpose([[], [3, 2]]);
// []
d3.transpose([[1, 2, 3], [3]]);
// [[1, 3]]
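The relationship between zipping and transposing can be sketched in plain JavaScript. The zip and transpose functions below are hypothetical stand-ins written for illustration, not D3's implementation; they reproduce the truncate-to-shortest behavior seen in the examples above.

```javascript
// Hypothetical stand-in for d3.zip(), for illustration only.
function zip() {
  var arrays = Array.prototype.slice.call(arguments);
  if (arrays.length === 0) { return []; }

  // The result is as long as the shortest input array.
  var length = Math.min.apply(null, arrays.map(function(a) {
    return a.length;
  }));

  var result = [];
  for (var i = 0; i < length; i++) {
    // Collect the ith element of every input array.
    result.push(arrays.map(function(a) { return a[i]; }));
  }
  return result;
}

// Transposing a matrix is zipping its rows together.
function transpose(matrix) {
  return zip.apply(null, matrix);
}

console.log(transpose([[1, 2, 3], [3, 2, 1]])); // [[1, 3], [2, 2], [3, 1]]
```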

d3.pairs()

We can also associate adjacent pairs of elements in an array to return tuples of elements using the d3.pairs() method. This method takes a single argument of an array.

d3.pairs([1, 2, 3, 4]);
// [[1, 2], [2, 3], [3, 4]]
d3.pairs([1]);
// []
d3.pairs([1, 2]);
// [[1, 2]]
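A rough plain-JavaScript equivalent shows why an array of n elements produces n - 1 pairs. The pairs function here is a hypothetical sketch for illustration, not D3's own code.

```javascript
// Hypothetical stand-in for d3.pairs(), for illustration only.
function pairs(array) {
  var result = [];
  // Start at the second element and pair it with its predecessor.
  for (var i = 1; i < array.length; i++) {
    result.push([array[i - 1], array[i]]);
  }
  return result;
}

console.log(pairs([1, 2, 3, 4])); // [[1, 2], [2, 3], [3, 4]]
```

Adjacent pairs like these are handy when we want to draw something between consecutive data points, such as the segments of a line.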

d3.ascending()/d3.descending()

Oftentimes, data visualizations will need to sort arrays. D3 includes two helper functions to help us with this type of sorting.

d3.ascending() and d3.descending() are comparator functions that each accept two arguments.

d3.ascending(a, b) returns -1 if a is less than b, 1 if a is greater than b, and 0 if they are equal.

d3.descending(a, b) returns the opposite: -1 if a is larger than b, 1 if a is less than b, and 0 if they are equal.

These functions can be used by passing them into the array sort() function, like so:

var arr = [1, 28, 9, 4];
arr.sort(d3.ascending);
// => [1, 4, 9, 28]
arr.sort(d3.descending);
// => [28, 9, 4, 1]
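To see what such comparators do for us, the sketch below defines simplified versions in plain JavaScript and compares them with the default sort. The ascending and descending functions here are hypothetical stand-ins; they skip any special handling of values that cannot be compared, which a real library may treat differently.

```javascript
// Simplified comparators, for illustration only.
function ascending(a, b) {
  return a < b ? -1 : a > b ? 1 : 0;
}

function descending(a, b) {
  return b < a ? -1 : b > a ? 1 : 0;
}

// The default sort compares elements as strings.
var byDefault = [1, 28, 9, 4].sort();

// With a comparator, the numbers are compared as numbers.
var up = [1, 28, 9, 4].sort(ascending);
var down = [1, 28, 9, 4].sort(descending);

console.log(byDefault); // [1, 28, 4, 9]
console.log(up);        // [1, 4, 9, 28]
console.log(down);      // [28, 9, 4, 1]
```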

extent()

Extent will return an array of two elements, the minimum and the maximum of the array:

var arr = [1, 28, 9, 4];
d3.extent(arr);
// => [1, 28]

min(), max(), sum(), mean(), median()

Often times we want to pull out metrics of our array. D3 gives us the ability to pull out key metrics of our arrays with some helper functions.

Each one of these functions takes up to two arguments:

array

The array argument is the array on which we will run the function.

accessor

The accessor function is an optional function that is called on each element before the actual computation runs, much like calling array.map() first. We can use this function to compute the values we’re interested in, such as picking a single property out of each object or returning undefined for outliers we want ignored.

sum()

We can get the sum of the array using the d3.sum(array) function, which returns 0 for an empty array. The sum() function ignores invalid values such as NaN and undefined.

var arr = [1, 28, 9, 4];
d3.sum(arr);
// => 42
var obj = [
  {x: 1, y: 20},
  {x: 28, y: 5},
  {x: 9, y: 5},
  {x: 4, y: 2}
];
d3.sum(obj, function(d) { return d.x; });
// => 42

min()/max()

We can pick out the minimum and the maximum of the array by using the D3 min() and max() functions. These functions are both different from their built-in Math counterparts in that they ignore undefined values.

They also compare using natural ordering, rather than using numeric ordering. This means that the maximum of ["20", "3"] is "3", while the maximum of [20, 3] is 20. The same is true for the minimum, but in reverse.

var arr = [1, 28, 9, 4];
d3.min(arr);
// => 1
d3.max(arr);
// => 28
var obj = [
  {x: 1, y: 20},
  {x: 28, y: 5},
  {x: 9, y: 5},
  {x: 4, y: 2}
];
d3.min(obj, function(d) { return d.x; });
// => 1
d3.max(obj, function(d) { return d.x; });
// => 28
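The difference between natural and numeric ordering is easy to reproduce in plain JavaScript. The max function below is a hypothetical sketch for illustration (not D3's code) that skips undefined, null, and NaN values and compares the rest with the ordinary > operator.

```javascript
// Hypothetical stand-in for d3.max(), for illustration only.
function max(array, accessor) {
  var values = accessor ? array.map(accessor) : array;
  var result;
  values.forEach(function(value) {
    // value >= value is false for NaN, so NaN is skipped too.
    if (value != null && value >= value) {
      if (result === undefined || value > result) {
        result = value;
      }
    }
  });
  return result;
}

console.log(max(['20', '3']));       // '3' (strings compare character by character)
console.log(max([20, 3]));           // 20
console.log(max([undefined, 5, NaN])); // 5
```

If our data arrives as strings, for example from a CSV file, an accessor such as function(d) { return +d; } converts each value to a number before the comparison.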

mean()

The mean() function returns the mean of a given array. The mean is the arithmetic average of the values in the array: their sum divided by their count.

var arr = [1, 28, 9, 4];
d3.mean(arr);
// => 10.5
var obj = [
  {x: 1, y: 20},
  {x: 28, y: 5},
  {x: 9, y: 5},
  {x: 4, y: 2}
];
d3.mean(obj, function(d) { return d.x; });
// => 10.5

median()

The median() function returns the median of the given array using the R-7 algorithm. The median is the “middle” of the array.

var arr = [1, 28, 9, 4];
d3.median(arr);
// => 6.5
var obj = [
  {x: 1, y: 20},
  {x: 28, y: 5},
  {x: 9, y: 5},
  {x: 4, y: 2}
];
d3.median(obj, function(d) { return d.x; });
// => 6.5
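For the curious, the R-7 method boils down to linear interpolation between the two values closest to the middle of the sorted data. The median function below is a hypothetical plain-JavaScript sketch of that idea, written for illustration rather than copied from D3.

```javascript
// Hypothetical sketch of an R-7 style median, for illustration only.
function median(array, accessor) {
  var values = (accessor ? array.map(accessor) : array.slice())
    // Drop missing and non-numeric values.
    .filter(function(v) { return v != null && !isNaN(v); })
    .map(Number)
    .sort(function(a, b) { return a - b; });

  if (values.length === 0) { return undefined; }

  // Fractional position of the 0.5 quantile in the sorted values.
  var h = (values.length - 1) * 0.5;
  var lo = Math.floor(h);
  var hi = Math.ceil(h);

  // Interpolate between the two neighbouring values.
  return values[lo] + (values[hi] - values[lo]) * (h - lo);
}

console.log(median([1, 28, 9, 4])); // 6.5
```

With the sorted values [1, 4, 9, 28], the position h is 1.5, so the result falls halfway between 4 and 9.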

There are a lot of other simple array helper functions available through D3. For more functions and deeper descriptions of their functionality, check out the documentation available at https://github.com/mbostock/d3/wiki/Arrays.

Associative array helpers

Associative arrays are like simple objects. They have a set of named properties. D3 offers a few helpers for converting these maps into standard arrays, which are generally more useful in visualizations.

keys()

The keys() method returns an array of the key names of the specific object.

d3.keys({x: 1, y: 2, z: 3});
// => ["x", "y", "z"]

values()

The values() method returns an array containing all of the values of a specific object.

d3.values({x: 1, y: 2, z: 3});
// => [1, 2, 3]

entries()

Sometimes we want the resulting array to contain both the key name and the value. This is particularly useful when we’re working with dynamic objects and want to look up a single entry by its key name.

d3.entries({x: 1, y: 2, z: 3});
// => [
//   {key: "x", value: 1},
//   {key: "y", value: 2},
//   {key: "z", value: 3}
// ]
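All three conversions can be expressed with the built-in Object.keys(). The keys, values, and entries functions below are hypothetical plain-JavaScript stand-ins for illustration; unlike a library version, they only look at an object's own enumerable properties.

```javascript
// Hypothetical stand-ins for illustration only.
function keys(obj) {
  return Object.keys(obj);
}

function values(obj) {
  return Object.keys(obj).map(function(key) { return obj[key]; });
}

function entries(obj) {
  return Object.keys(obj).map(function(key) {
    return { key: key, value: obj[key] };
  });
}

var point = { x: 1, y: 2, z: 3 };
console.log(entries(point));
// [{key: 'x', value: 1}, {key: 'y', value: 2}, {key: 'z', value: 3}]
```

An array of {key, value} objects is convenient to hand to a data join, since each entry carries its own label.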

Maps

Maps are similar to plain objects in JavaScript, but plain objects used as maps tend to act unexpectedly when keys collide with built-in object behavior or built-in keys (such as __proto__ or hasOwnProperty). D3’s map avoids these collisions. For more, see An Object is not a Hash by Guillermo Rauch.
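The kind of surprise a plain object can cause is easy to demonstrate without D3. The sketch below counts words using an ordinary object and then using an object created without a prototype; the words, counts, and safeCounts names are made up for this example.

```javascript
var words = ['constructor', 'chart', 'chart'];

// A plain object inherits properties such as constructor.
var counts = {};
words.forEach(function(word) {
  counts[word] = (counts[word] || 0) + 1;
});

// An object with no prototype has no inherited keys to collide with.
var safeCounts = Object.create(null);
words.forEach(function(word) {
  safeCounts[word] = (safeCounts[word] || 0) + 1;
});

console.log(typeof counts.constructor); // 'string' (the inherited function got a 1 appended)
console.log(safeCounts.constructor);    // 1
console.log(safeCounts.chart);          // 2
```

In the first loop, counts['constructor'] is already truthy because it refers to the inherited Object function, so the count ends up as a string rather than a number.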

In D3, we can create a new map object simply by calling the d3.map() function. It can also take an optional first argument that contains an object of keys and values to initialize the map with.

var m1 = d3.map(); // map object
var m2 = d3.map({x: 1}); // map object with x as a key and 1 as the value

map.has()

We can test if a map includes a key or not using the map.has() method. It takes a single argument of a key string.

var m = d3.map({x: 1});
m.has("x"); // true
m.has("y"); // false
var key = "x";
m.has(key); // true

map.get()

We can fetch the value of a key inside of our map by using the map.get() function. It takes a single argument of a key string to fetch.

var m = d3.map({x: 1});
m.get("x"); // 1
m.get("y"); // undefined

map.set()

We can set a value inside of our map as well using the map.set() method. This takes two arguments: the key (string) to set and the value to set it to. If the map already knows about the key, it will replace the old value with the new value.

var m = d3.map({x: 1});
m.set("y", 2);
m.set("x", 2);
// m => {x: 2, y: 2}

map.remove()

We can also delete a key that is in our map by using the map.remove() function. This takes a single argument, the key (string) to remove out of the map.

var m = d3.map({x: 1});
m.set("y", 2);
m.remove("x");
// m => {y: 2}

map.keys()

We can fetch an array of all the keys in the map by using the map.keys() function.

var m = d3.map({x: 1});
m.set("y", 2);
m.keys(); // => ["x", "y"]

map.values()

Similarly, we can fetch an array of all the values in the map using the map.values() function.

var m = d3.map({x: 1});
m.set("y", 2);
m.values(); // => [1, 2]

map.entries()

We can also return an array of key-value based objects with two keys of key and value. This is useful for picking out certain objects inside of a map.

var m = d3.map({x: 1});
m.set("y", 2);
m.entries();
// [{key: "x", value: 1}, {key: "y", value: 2}]

map.forEach()

Finally, we can iterate over every entry in the map by using the map.forEach() function. This takes a single argument of a function that will be called for every single entry in the map. This function will be called with two arguments, the key and the value like so:

var m = d3.map({x: 1});
m.set("y", 2);
m.forEach(function(key, value) {
  console.log(value);
});
// 1
// 2

information

The this context inside of the function points to the map itself.

Sets

Sets are very similar to maps, with the exception that they hold only values, with no keys associated with them, and each value can appear in the set only once.

We can create a set by calling the d3.set() method. This method takes a single, optional argument of an array.

If an array is passed in as an argument, every value in the array will be added as a member of the set.

var s = d3.set([1, 2, "foo", "bar"]);
// s = [1,2,"foo","bar"]

set.has()

We can test if the set has the value set as an entry by using the set.has() method. This method takes a single argument of a value. The method will return true if and only if the set has an entry for the value string.

var s = d3.set([1, 2, "foo", "bar"]);
s.has(1); // true
s.has("donald_duck"); // false

set.add()

We can add a value to the set by using the set.add() function. It takes a single argument of value, the value to append to the set.

var s = d3.set([1, 2, "foo", "bar"]);
s.add(3);
s.has(3); // true

set.remove()

We can remove a value using the set.remove() method. This remove method takes a single argument of the value to remove from the set. If the value is in the set, it will remove the value (and return true). If the value is not in the set, then it will do nothing and return false.

var s = d3.set([1, 2, "foo", "bar"]);
s.remove(1); // true
s.remove("space"); // false
s.has(1); // false

set.values()

Just like in arrays, we can get all of the values in the set. This will return an array that includes all of the “unique” values in the set.

d3.set([1, 2, 1, 3]).values(); // ["1", "2", "3"]

set.forEach()

We can run a function for every single value in our set as well by using the set.forEach() function. The set.forEach(fn) method takes a single argument of the function to run for every element.

The function will be run with a single argument of the value in the set.

var s = d3.set([1, 2, "foo", "bar"]);
s.forEach(function(val) {
  console.log("Value: " + val);
});
// Value: 1
// Value: 2
// Value: foo
// Value: bar

information

The this context inside the function points to the set itself.

Nests

A nest allows elements in an array to be grouped into a hierarchical tree structure. Nests allow us to neatly format our data into meaningful groupings.

For instance, if we have data on the countries of the world, including the continent each one is on and its population, it would be useful for us to be able to view the population based upon continent.

We can set up our data in a nest and then populate the nest data using a key of the data:

var raw_country_data = [{
  "countryName": "Andorra",
  "continent": "EU",
  "languages": "ca",
  "areaInSqKm": "468.0",
  "population": "84000"
},
{"countryName": "United Arab Emirates",
  "continent": "AS",
  "languages": "ar-AE,fa,en,hi,ur",
  "areaInSqKm": "82880.0",
  "population": "4975593"
},
{"countryName": "Antigua and Barbuda",
  "continent": "NA",
  "languages": "en-AG",
  "areaInSqKm": "443.0",
  "population": "86754"
},
{"countryName": "Anguilla",
  "continent": "NA",
  "languages": "en-AI",
  "areaInSqKm": "102.0",
  "population": "13254"
},
{"countryName": "Albania",
  "continent": "EU",
  "languages": "sq,el",
  "areaInSqKm": "28748.0",
  "population": "2986952"
},
{"countryName": "Armenia",
  "continent": "AS",
  "languages": "hy",
  "areaInSqKm": "29800.0",
  "population": "2968000"
}
];
var data = d3.nest()
  .key(function(d) { return d.continent; })
  .sortKeys(d3.ascending)
  .entries(raw_country_data);
// [
//   { key: "AS", values: [
//     { continent: "AS", ... },
//     ...
//   ] },
//   ...
// ]

The d3.nest() function creates a new nest operator. It takes no arguments, but allows us to call all of the nest functions on it.
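Conceptually, a single-level nest is a group-by operation. The sketch below reproduces that idea in plain JavaScript; the nestEntries function and the shortened sample data are hypothetical and only meant to show the shape of the result, not how D3 implements it.

```javascript
// Hypothetical single-level grouping, for illustration only.
function nestEntries(array, keyFn) {
  // A prototype-less object avoids collisions with built-in keys.
  var groups = Object.create(null);

  array.forEach(function(d) {
    var key = keyFn(d);
    if (!groups[key]) { groups[key] = []; }
    groups[key].push(d);
  });

  // Emit one {key, values} entry per group, with keys sorted ascending.
  return Object.keys(groups).sort().map(function(key) {
    return { key: key, values: groups[key] };
  });
}

var sample = [
  { countryName: 'Andorra', continent: 'EU' },
  { countryName: 'Armenia', continent: 'AS' },
  { countryName: 'Albania', continent: 'EU' }
];

var grouped = nestEntries(sample, function(d) { return d.continent; });
console.log(grouped[0].key);           // 'AS'
console.log(grouped[1].values.length); // 2
```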

nest.key()

To set up the key functions, we’ll use the nest.key() method. This method takes a single function that will be invoked for each element in the input array and it is expected to return a string identifier that’s used to assign the element to its group.

In the above example, we set up the key() function to return the string identified by the d.continent accessor. Key functions are usually simple accessors like this one.

We can register multiple key functions, each of which adds another level to the hierarchy.

var nest = d3.nest()
  .key(function(d) { return d.continent; })
  .key(function(d) { return d.languages; });

nest.sortKeys()

Oftentimes we’ll want to sort the keys so that we’ll look at the groups in ascending or descending order. In the case of our example from above, we are sorting on the continent key, so that we get an alphabetical listing of the continents in our data.

var nest = d3.nest()
  .key(function(d) { return d.continent; })
  .sortKeys(d3.ascending);

nest.sortValues()

We can also sort the values of our elements using the nest.sortValues() function. This is similar to sorting the input array before applying the nest operator, except that it tends to be more efficient as the groupings tend to be much smaller than the entire dataset. Because the values here are objects, we pass a comparator that compares the property we care about, such as countryName.

var nest = d3.nest()
  .key(function(d) { return d.continent; })
  .sortValues(function(a, b) {
    return d3.ascending(a.countryName, b.countryName);
  });

nest.rollup()

We can rollup data that we are interested in, rather than simply returning the raw sorted data.

For instance, if we want to return the area in square kilometers for each continent that we know about, we can create a rollup function using nest.rollup(function). The method takes a single function as its argument, which will be called with the array of elements in each group.

var data = d3.nest()
  .key(function(d) { return d.continent; })
  .sortKeys(d3.ascending)
  .rollup(function(d) {
    return {
      area: d3.sum(d, function(g) {
        return +g.areaInSqKm;
      })
    };
  })
  .entries(raw_country_data);
// [{
//   "key": "AS",
//   "values": {
//     "area": 112680
//   }
// }, {
//   "key": "EU",
//   "values": {
//     "area": 29216
//   }
// }, {
//   "key": "NA",
//   "values": {
//     "area": 545
//   }
// }]

nest.map()

We can apply the nest operator to a specific array using the nest.map() function. This will allow us to create a map of the data, rather than simply returning an array (as we’ve seen above).

The map() method takes up to two arguments. It accepts the array, which is the input array whose data will be applied in the nest. It can also take an optional mapType function. If this is specified, then it can return a different type, such as a d3.map, rather than a simple object.

var data = d3.nest()
  .key(function(d) { return d.continent; })
  .sortKeys(d3.ascending)
  .rollup(function(d) {
    return {
      area: d3.sum(d, function(g) {
        return +g.areaInSqKm;
      })
    };
  })
  .map(raw_country_data);
// {
//   AS: { area: 112680 },
//   EU: { area: 29216 },
//   NA: { area: 545 }
// }

nest.entries()

As we’ve done in the above examples, we can apply data to our nest using the nest.entries(arr) function. This simply takes a single array as its input to pull data for the nest.

The entries() function runs on all levels of the hierarchy.

var data = d3.nest()
  .key(function(d) { return d.continent; })
  .sortKeys(d3.ascending)
  .entries(raw_country_data);

Applying our knowledge

We’ll combine all we’ve learned so far to recreate the iconic album cover of Unknown Pleasures by Joy Division using just d3.range, array.map and the line generator d3.svg.line().

<!DOCTYPE HTML>
<html>
<head>
  <script src="http://d3js.org/d3.v3.min.js" charset="utf-8"></script>
  <style>
    /* make paths 2px wide and white, and give them a black fill. */
    path { stroke: white; stroke-width: 2; fill: black; }
    /* make the svg and body black */
    body, svg { background-color: black; }
  </style>
</head>
<body>
  <script>
    var svg = d3.select('body').append('svg');
    // generate random data that squiggles a lot in the middle
    var data = d3.range(50, 650, 10).map(function(y_offset){
      // generate a line of the form [[6,7],[22,34],...]
      return d3.range(100, 700, 10).map(function(d){
        var y = d;
        if(y < 300 || y > 500) y = 50; else y = 500;
        return [d, y_offset - Math.random() * Math.random() * y / 10];
      });
    });
    // create our line generator
    var line = d3.svg.line()
      .x(function(d) { return d[0]; })
      .y(function(d) { return d[1]; })
      .interpolate("basis");
    // create several path elements using our line generator and squiggle data
    svg.selectAll('path').data(data).enter()
      .append('path').attr('d', line);
  </script>
</body>
</html>

Scales

As we build our datasets and start to try to apply them to our visual canvas, we’ll need to set up scales so that our data can sit nicely on our page.

It’s very unlikely that any of the data that we’ll work with will translate directly to pixel coordinates in our visualizations. We’ll want to scale our data so that the positions and sizes we draw reflect the relationships in our data.

For this purpose, D3 includes a convenient helper object that we’ll use to create a scaled relationship between our data and the relative pixel coordinates to represent that data.

When we talk about scales, we’re talking about functions that map each data point to an output value relative to the other points in the data.

Why scales

For instance, what if we wanted to plot the number of viewers for each episode of the third season of the AMC television show, The Walking Dead, with a bar chart?

Each bar would not stop until it was way outside of the visible screen space. We just don’t have 10 million pixels, one to represent each viewer!

One solution would be to step through and apply a scaling factor to each data point so that the maximum point wouldn’t go past the top of our chart.

<!DOCTYPE HTML>
<html>
<head>
  <script src="http://d3js.org/d3.v3.min.js"
    charset="utf-8"></script>
  <style> rect { stroke: red; stroke-width: 1; fill: black; } </style>
</head>
<body>
  <script>
    // viewers for each episode
    var data = [ 10870000, 9550000, 10510000,
      9270000, 10370000, 9210000,
      10430000, 10480000, 12260000,
      11050000, 11010000, 11300000,
      11460000, 10840000, 10990000,
      12420000 ];

    // scaling data
    var height = 200;
    // the episode with the most views
    var max = d3.max(data);
    // Our scale factor
    var scale = height / max;
    var scaled_data =
      data.map(function(d){
        return d * scale;
      });

    var svg = d3.select('body')
      .append('svg')
      .selectAll('rect')
      .data(scaled_data)
      // add all the <rect> tags
      .enter().append('rect')
      .attr({
        width: 50,
        height: function(d){ return d },
        x: function(d, i){ return i * 50 },
        // y pixels count down
        y: function(d){ return height - d }
      });
  </script>
</body>
</html>

Not bad, but we have to take into account a lot of extra manual work. Now, we have to worry about this extra array scaled_data.

We also joined our data with this scaled_data array instead of the original, which makes it difficult to add features to the visualization, like adding tool-tips to each bar chart.

Our data is also bunched up at the top so it’s not easy to see the differences. These differences would be more apparent if the data was plotted from the minimum episode to the highest, to emphasize these differences.

// scaling data
var height = 200;
var min = d3.min(data);
var range = d3.max(data) - min;
var scale = height / range;
var scaled_data = data.map(function(d){
  return (d - min) * scale;
});

That looks a lot better, but it took more code than it needed and still has the side effect of replacing our original data with a scaled copy.

D3’s linear scale generator eliminates both of these concerns by giving us a function we can call anytime we want to convert an input domain (viewers in our case) to an output range (most often, pixels). Scales essentially allow us to describe the relationship between our data and pixels.

This allows us to be expressive about the domain of possible input data values and to limit the possible output range that the data is translated into.

Creating a scale

We can easily create a scale of different types by using functions given to us by the d3.scale object. To create a linear scale, we’ll use the d3.scale.linear() function:

var scale = d3.scale.linear();

This scale, although relatively useless, will map input values to output values at a 1:1 ratio. That is, calling it with 2, we’ll get an output of 2.

We have not actually applied any domains or ranges on this scale object, so it only assumes 1:1.

Setting input domain

We can set an input domain by using the .domain() function.

var scale = d3.scale.linear()
  .domain([0, 100]);

This will set the input domain to start from 0 and go to 100. That is, the minimum of our domain is 0 and the maximum is 100.

information

If we don’t pass numbers into the array, then the values will be coerced into numbers. For instance, this happens with Date objects (although D3 does have a d3.time.scale object we can use for dates which is often more convenient).

Often times, we’ll want the input domain to match our dataset. Rather than set the domain values manually, more often than not, we’ll set these using the min and max of our dataset.

We can do this using the d3.min() and d3.max() functions or by using the d3.extent() function.

// Using straightforward min and max with two functions
var scale = d3.scale.linear()
  .domain([d3.min(data), d3.max(data)]);
// OR we can use the d3.extent() function:
var scale = d3.scale.linear()
  .domain(d3.extent(data));

Setting output range

We can set the output range by using the .range([values]) function:

var scale = d3.scale.linear()
  .domain([0, 1])
  .range([10, 100]);

The range() function sets the output range to span between 10 and 100 for output values. It accepts a single array of values that should match the cardinality of the input domain. That is, it should contain the same number of values as the domain.

One feature of the range() function is that it does not need to be numeric inputs, so long as the underlying interpolation (depending upon which type of scale we’re working with, i.e. linear, time, etc) supports it.

This enables us to scale between colors, for instance.

var color = d3.scale.linear()
  .domain([0, 100])
  .range(["red", "blue"]);
color(0); // -> "#ff0000" aka, red
color(50); // -> "#800080" aka, purple
color(100); // -> "#0000ff" aka, blue
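The purple midpoint comes from interpolating each color channel separately. The following plain-JavaScript sketch shows the arithmetic; interpolateRgb is a hypothetical helper written for this example that takes colors as [red, green, blue] arrays, which is an assumption of the sketch rather than how D3 represents colors.

```javascript
// Hypothetical channel-by-channel interpolation, for illustration only.
// a and b are [red, green, blue] arrays; t runs from 0 to 1.
function interpolateRgb(a, b, t) {
  return '#' + a.map(function(channel, i) {
    // Move each channel a fraction t of the way from a to b.
    var value = Math.round(channel + (b[i] - channel) * t);
    // Pad single-digit hex values with a leading zero.
    return (value < 16 ? '0' : '') + value.toString(16);
  }).join('');
}

var red = [255, 0, 0];
var blue = [0, 0, 255];

console.log(interpolateRgb(red, blue, 0));   // '#ff0000'
console.log(interpolateRgb(red, blue, 0.5)); // '#800080'
console.log(interpolateRgb(red, blue, 1));   // '#0000ff'
```

Halfway between 255 and 0 is 127.5, which rounds to 128, or 80 in hexadecimal.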

Polylinear scales

With linear scales, we’re not limited to just two values. We can map a domain made up of several connected segments onto a range made up of the same number of connected segments, as if we had several individual linear scales joined end to end. This is most useful for dealing with multi-color gradients.

var scale = d3.scale.linear()
  .domain([0, 0.5, 1])
  .range(['blue', 'white', 'red']);

warning

A common mistake when creating color gradients like this is to forget to include the same number of domain elements as range colors. In the example above, if we gave domain() an array of length two (i.e., [0, 1]) instead of three (i.e., [0, 0.5, 1]), our output would only go from blue to white.
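One way to picture a polylinear scale is as a lookup that first finds the segment an input falls into and then interpolates within that segment. The sketch below uses numeric outputs to keep the arithmetic visible; the polylinear function and its sample domain and range are assumptions for illustration, not D3's implementation.

```javascript
// Hypothetical piecewise-linear scale, not part of D3.
// domain and range must have the same length (at least 2).
function polylinear(domain, range) {
  return function (x) {
    // Find the segment whose upper bound is at or above x;
    // inputs past the end fall into the last segment.
    var i = 1;
    while (i < domain.length - 1 && x > domain[i]) i++;
    // Normalize x within that segment, then map onto the range segment.
    var t = (x - domain[i - 1]) / (domain[i] - domain[i - 1]);
    return range[i - 1] + (range[i] - range[i - 1]) * t;
  };
}

var segmented = polylinear([0, 0.5, 1], [0, 100, 1000]);
var a = segmented(0.25); // 50, halfway through the first segment
var b = segmented(0.5);  // 100, the shared boundary
var c = segmented(0.75); // 550, halfway through the second segment
```

Each domain value pairs with the range value at the same index, which is why the two arrays need the same length.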

Using our scale

When it comes time to translate our data to pixels, we can just call scale(d).

For instance, let’s say that our domain ranges from 0 to 2 and that we want to map it onto a chart that spans 0 to 500 pixels.

var scale = d3.scale.linear()
  .domain([0, 2])
  .range([0, 500]);

If we pass in 0 to our resulting scale() function, we’ll get an output value of 0. If we pass in 2, then we will get our maximum output of 500.

scale(0); // 0
scale(2); // 500

We can use this scale to provide a gradient between our values of 0 and 2. For instance:

scale(0.5); // 125
scale(1.1); // 275
scale(1.9); // 475

As we can see, our scale outputs a value for each datum relative to the output range. This process is called normalization: mapping a numeric value to its relative position between the minimum and maximum values of a given scale.
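The arithmetic behind these outputs can be sketched in a few lines of plain JavaScript. The linearScale function below is a hypothetical stand-in written for illustration, not D3's implementation: it first normalizes the input to a fraction between 0 and 1, then stretches that fraction across the output range.

```javascript
// Hypothetical stand-in for a two-point linear scale, not part of D3.
function linearScale(domain, range) {
  return function (x) {
    // Normalize: how far along the domain is x, as a fraction?
    var t = (x - domain[0]) / (domain[1] - domain[0]);
    // Stretch that fraction across the output range.
    return range[0] + t * (range[1] - range[0]);
  };
}

var toPixels = linearScale([0, 2], [0, 500]);
var atStart = toPixels(0);     // 0
var atQuarter = toPixels(0.5); // 125
var atEnd = toPixels(2);       // 500
```

An input of 0.5 is a quarter of the way through the domain [0, 2], so it lands a quarter of the way through the range, at 125.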

information

If it takes 100 licks to get to the center of a tootsie roll, then 90 licks in we’re 90% of the way done, while at 10 licks we’re only at 10%. This is normalization.

Back to our mapping of viewership, we can create a scale that maps on to our entire dataset:

var scale = d3.scale.linear()
  .domain(d3.extent(data))
  .range([0, 200]);

This maps the full span of our input values, from the minimum to the maximum viewership value, onto the output range of 0 to 200.

We can now use this scale to set the height of our bar chart:

// count up from height,
// since y counts down
svg.selectAll('rect')
  .data(data).enter()
  .append('rect')
  .attr('y', function(d){
    return height - scale(d);
  });
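The y offset and the height of each bar come from the same scaled value. As a plain-JavaScript sketch, the geometry for each bar can be computed as follows. The stand-in barScale function, the chart height of 200 and the sample values are all assumptions for illustration; they are not the viewership dataset.

```javascript
// Assumed chart height in pixels
var chartHeight = 200;
// Made-up sample values with a minimum of 10 and a maximum of 100
var sampleData = [10, 55, 100];

// Hypothetical stand-in for a linear scale with
// domain [10, 100] and range [0, 200].
function barScale(d) {
  return (d - 10) / (100 - 10) * 200;
}

// SVG's y axis grows downward, so a taller bar starts higher up
// (a smaller y) and extends down to the bottom of the chart.
var bars = sampleData.map(function (d) {
  var barHeight = barScale(d);
  return { y: chartHeight - barHeight, height: barHeight };
});
// bars[0] -> { y: 200, height: 0 }
// bars[1] -> { y: 100, height: 100 }
// bars[2] -> { y: 0,   height: 200 }
```

Note that with a domain taken from d3.extent(), the smallest datum maps to a bar of height 0.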

Scales can be used to map our data to pixels

We’ve covered how to use the d3.scale.linear() scale in this chapter, but there are quite a few more options for us in terms of using scales when mapping input domain data to output ranges.

Other non-linear scales

D3 includes other scales than simply linear mappings. For instance, we can create a logarithmic scale using the d3.scale.log() function.

We can create a new power scale by using the d3.scale.pow() method.

The logarithmic scale is useful for data that increases exponentially, since it applies a logarithm to each input value before mapping it to the output range. The power scale instead raises each input value to an exponent before mapping it, which suits data that grows as a power of the input. The two scales differ only in the transform they apply when computing the output.
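The difference between the two can be sketched in plain JavaScript. The functions logScale and powScale below are hypothetical stand-ins, not D3's implementations: each transforms the input first and then interpolates linearly, exactly like a linear scale would.

```javascript
// Hypothetical stand-ins, not part of D3.
function logScale(domain, range) {
  var lo = Math.log(domain[0]);
  var hi = Math.log(domain[1]);
  return function (x) {
    // Take the logarithm first, then interpolate linearly.
    var t = (Math.log(x) - lo) / (hi - lo);
    return range[0] + t * (range[1] - range[0]);
  };
}

function powScale(domain, range, exponent) {
  var lo = Math.pow(domain[0], exponent);
  var hi = Math.pow(domain[1], exponent);
  return function (x) {
    // Raise to the exponent first, then interpolate linearly.
    var t = (Math.pow(x, exponent) - lo) / (hi - lo);
    return range[0] + t * (range[1] - range[0]);
  };
}

var log = logScale([1, 1000], [0, 300]);
var logAtTen = log(10);  // approximately 100: one of three decades

var pow = powScale([0, 10], [0, 100], 2);
var powAtFive = pow(5);  // 25: 5 squared over 10 squared, times 100
```

On the logarithmic scale, each tenfold increase in the input covers the same distance in the output.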

All of these scales work very similarly to each other, with slight differences. Now that we know how they work, we can consult the D3 documentation for a more detailed explanation of each function.

Axes

In any visualization where we’re trying to convey meaning, the viewer needs to understand the scale of the data in comparison to other data. One easy way to convey this is by using axes in our visualizations.

D3 provides fantastic helper functions that help us construct the necessary annotations that combine to form an axis, such as tick marks, labels and reference lines.

As long as we properly create our scale and use it to build our axis, D3 takes care of constructing the axis for us.

Creating an axis

To create an axis, we can use the d3.svg.axis() method, like so:

var axis = d3.svg.axis();

That’s it. We’ll discuss ways to customize this shortly.

We only need to add the axis to our SVG element. It’s possible to call our new axis() function directly, i.e., axis(svg).

Although this does technically work, it doesn’t allow us to chain our methods together, like the rest of D3.

To support chaining, we’ll instead use the form svg.call(axis) to create the axis on our svg:

<!DOCTYPE html>
<html>
<body>
  <script
    src="http://d3js.org/d3.v3.min.js"
    charset="utf-8"></script>
  <script>
    var width = 800;
    // the heights of the world's tallest buildings, in ft.
    var data = [2717, 2073, 1971,
                1776, 1670];
    // create our svg and add it
    // to the `<body>`
    var svg = d3.select('body')
      .append('svg');
    // create a scale that goes from
    // [0, max(data)] -> [10, width]
    var scale = d3.scale.linear()
      .domain([0, d3.max(data)])
      .range([10, width]);
    // create a new axis that
    // uses this new scale
    var axis = d3.svg.axis()
      .scale(scale);
    // add the new axis to the svg.
    // same as `axis(svg);` except
    // that it returns `svg`
    svg.call(axis);
  </script>
</body>
</html>

hello axis

Here is the resulting DOM after D3 creates the axis on our svg:

<!-- the resulting SVG -->
<svg>
  <!-- first tick mark and label -->
  <g class="tick"
     transform="translate(10,0)"
     style="opacity: 1;">
    <line y2="6" x2="0"/>
    <text y="9" x="0" dy=".71em"
          style="text-anchor: middle;">0</text>
  </g>
  <!-- second tick mark and label -->
  <g class="tick"
     transform='translate(68.15237394184763,0)'
     style="opacity: 1;">
    <line y2="6" x2="0"/>
    <text y="9" x="0" dy=".71em"
          style="text-anchor: middle;">
      200
    </text>
  </g>
  <!-- ...etc... -->
  <path class="domain" d="M10,6V0H800V6"/>
</svg>

Live version: http://jsbin.com/puwecuhi/1/edit
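The translate values in the generated markup come straight from the scale. As a rough check, assuming the same domain of [0, 2717] and range of [10, 800] used in the example, a plain-JavaScript stand-in for the scale (the name xScale is hypothetical) places the tick labelled 200 at about 68.15 pixels:

```javascript
// Values taken from the example above
var axisWidth = 800;
var tallest = 2717;

// Hypothetical stand-in for
// d3.scale.linear().domain([0, 2717]).range([10, 800])
function xScale(d) {
  return 10 + (d / tallest) * (axisWidth - 10);
}

var firstTickX = xScale(0);    // 10
var secondTickX = xScale(200); // approximately 68.15

// The axis positions each tick group with a transform like this one
var transform = 'translate(' + secondTickX + ',0)';
```

This is why the first tick sits at translate(10,0): the range starts at 10, so a datum of 0 maps to the left edge of the range rather than to pixel 0.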

It’s fairly common to add styles to the axis, as the default axis isn’t very attractive. To do this, we’ll append a containing <g> element, call the axis component on it, and give it a class that our CSS can target.

svg.append('g').call(axis).attr('class', 'x axis');

.x.axis path, .x.axis line{
  fill: none;
  stroke: black;
}

axis with style

Live version: http://jsbin.com/zopalifo/1/edit

Now let’s add some data points to represent the heights of the world’s five tallest buildings. They’ll be circles of radius r=4, positioned at their respective locations along the axis. We’ll also translate the axis down 100 pixels so that nothing gets clipped off the edge of the screen.

var height = 100;
// move the axis down 100 pixels
svg.append('g').call(axis).attr('class', 'x axis')
  .attr('transform', 'translate(0,' + height + ')');

// add some circles along the axis that represent the building heights
svg.selectAll('circle').data(data)
  .enter().append('circle').attr('r', 4).attr('transform', function(d){
    return 'translate(' + scale(d) + ', ' + height + ')';
  });

add some data points

Live version: http://jsbin.com/diburebi/1/edit

Now how about we add a few labels? We’ll need more data for that. Specifically, we’ll need to relate building heights to building names.

var data = [
  [ "Burj Khalifa"                  , 2717],
  [ "Shanghai Tower"                , 2073],
  [ "Makkah Royal Clock Tower Hotel", 1971],
  [ "One World Trade Center"        , 1776],
  [ "Taipei 101"                    , 1670]
];

This is the data we want, but adding it directly into our existing project without making any other changes will break our code. Both the position of the circles and our call to d3.max() depend on the data being a flat array of numbers. We’ll need to change scale(d) to scale(d[1]), since each data item is now itself an array whose second element is the height of the building. Similarly, when we specify our scale’s domain, we’ll need to give d3.max() an accessor function so it knows which part of each data item (datum) to use when finding the maximum. In our case, this is the second element of each array.

var scale = d3.scale.linear()
  .domain([0, d3.max(data, function(d){ return d[1]; })])
  .range([10, width]);
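The accessor pattern is easy to reproduce by hand, which may help clarify what d3.max() does with the second argument. The maxBy function below is a hypothetical stand-in, not D3's implementation, and the two-row dataset is shortened from the table above for brevity.

```javascript
// Hypothetical stand-in for d3.max(array, accessor), not part of D3.
// Calls accessor on every item and returns the largest result.
function maxBy(items, accessor) {
  var best = accessor(items[0]);
  for (var i = 1; i < items.length; i++) {
    var value = accessor(items[i]);
    if (value > best) best = value;
  }
  return best;
}

// Shortened copy of the buildings data: [name, height in ft]
var buildings = [
  ["Taipei 101", 1670],
  ["Burj Khalifa", 2717]
];

// The accessor picks the height out of each [name, height] pair
var tallestHeight = maxBy(buildings, function (d) { return d[1]; }); // 2717
```

Without the accessor, the comparison would be made on the [name, height] arrays themselves rather than on the heights.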

Now that our code is working again, let’s add those labels. Since each element of the original data array is now an array itself, we’ll use d[0] to reference the building name and scale(d[1]) to map the building height to the x position in pixels along the axis. We’ll also give it a slight counter-clockwise rotation using rotate(-20) so the labels don’t overlap. Finally, we’ll apply one last translation, translate(5,-5), so the label isn’t bumped up against the data point.

// add the labels
svg.selectAll('text').data(data)
  .enter().append('text').text(function(d){ return d[0]; })
  .attr('transform', function(d){
    return 'translate(' + scale(d[1]) + ', ' + height + ') '
      + 'rotate(-20) translate(5,-5)';
  });

Let’s also format the axis labels using the axis tickFormat() method so people can tell what units we’re using. D3 offers a variety of formatters that we can pass into tickFormat() for our axis. If you’re interested in reading about these, check out d3.format() and d3.time.format(). Our formatter, d3.format(',.0f'), rounds off decimals and adds commas, i.e., d3.format(',.0f')(1000.04) === "1,000".

// create our axis and format the ticks
var axis = d3.svg.axis().scale(scale)
  .tickFormat(function(d){ return d3.format(',.0f')(d) + 'ft'; }).ticks(5);
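To see what this tick formatter produces without loading D3, here is a plain-JavaScript approximation. The formatFeet function is a hypothetical stand-in that mimics the ',.0f' specifier for ordinary finite numbers by rounding to zero decimals and inserting commas with a regular expression; it is not how D3 implements formatting.

```javascript
// Hypothetical stand-in for
// function(d){ return d3.format(',.0f')(d) + 'ft'; }
function formatFeet(d) {
  // Round to zero decimal places, e.g. 1000.04 -> "1000"
  var rounded = d.toFixed(0);
  // Insert a comma before every group of three trailing digits
  var withCommas = rounded.replace(/\B(?=(\d{3})+(?!\d))/g, ',');
  return withCommas + 'ft';
}

var small = formatFeet(500);      // "500ft"
var rounded = formatFeet(1000.04); // "1,000ft"
var tall = formatFeet(2717);      // "2,717ft"
```

Because tickFormat() accepts any function that takes a tick value and returns a string, a hand-written formatter like this could be passed to it in the same way.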

As a last touch, say we decide there’s too much empty space before the first data point. Let’s update our scale to start at the smallest data value instead of zero. That change is as simple as changing our scale’s domain from [0, d3.max(...)] to d3.extent(data, ...):

var scale = d3.scale.linear()
  // .domain([0, d3.max(data, function(d){ return d[1]; })])
  .domain(d3.extent(data, function(d){ return d[1]; }))
  .range([10, width]);

Here’s what the final version looks like:

worlds tallest buildings

<!DOCTYPE html>
<html>
<head>
  <script src="http://d3js.org/d3.v3.min.js" charset="utf-8"></script>
  <style>
    .x.axis path, .x.axis line{
      fill: none; stroke: black;
    }
  </style>
</head>
<body>
  <script>
    var width = 700, height = 100;
    var svg = d3.select('body').append('svg');
    // world's tallest buildings
    var data = [
      ["Burj Khalifa"                  , 2717],
      ["Shanghai Tower"                , 2073],
      ["Makkah Royal Clock Tower Hotel", 1971],
      ["One World Trade Center"        , 1776],
      ["Taipei 101"                    , 1670]
    ];
    var scale = d3.scale.linear()
      .domain(d3.extent(data, function(d){ return d[1]; }))
      .range([10, width]);
    // add the data points
    svg.selectAll('circle').data(data)
      .enter().append('circle').attr('r', 4).attr('transform', function(d){
        return 'translate(' + scale(d[1]) + ', ' + height + ')';
      });
    // add the labels
    svg.selectAll('text').data(data)
      .enter().append('text').text(function(d){ return d[0]; })
      .attr('transform', function(d){
        return 'translate(' + scale(d[1]) + ', ' + height + ') '
          + 'rotate(-20) translate(5,-5)';
      });
    // create the axis
    var axis = d3.svg.axis().scale(scale)
      .tickFormat(function(d){ return d3.format(',.0f')(d) + 'ft'; }).ticks(5);
    // add the axis inside a new `g`
    svg.append('g').call(axis).attr('class', 'x axis')
      .attr('transform', 'translate(0,' + height + ')');
  </script>
</body>
</html>

Live version: http://jsbin.com/wuji