Data Parsing and Formatting - D3 on AngularJS: Create Dynamic Visualizations with AngularJS (2014)

Data Parsing and Formatting

This chapter covers some of the common ways to get and clean data for use with D3.

Getting data into the browser

All the examples up to this point used data values hard-coded directly in our code. This is fine when working with very small, unchanging data sets, but it quickly becomes unmanageable when the amount of data grows or the data changes frequently. We don’t want to keep copying and pasting data into our code. D3 solves this problem with a few helper functions that load data in different formats from separate files instead of from our code. Specifically, we can use d3.json() to load JSON-formatted data, or d3.csv() to load CSV data.

All we need to give each is the name of the file to load and a callback, which is called with an error (if any) and the loaded data once the file has been loaded.

Loading CSV data


Title,Air date,U.S. viewers (in millions)
"Winter Is Coming","April 17, - 2011",2.22
"The Kingsroad","April 24, - 2011",2.20
"Lord Snow","May 1, - 2011",2.44


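d3.csv('episodes.csv', function(err, data){
  // stop early if the file could not be loaded
  if(err) throw err;
  // each row is an object keyed by the header row; all values are strings
  console.log(data[0].Title); // => "Winter Is Coming"
  console.log(data[1].Title); // => "The Kingsroad"
  // etc...
});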
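
To make the shape of the loaded data concrete, here is a minimal sketch of the kind of conversion a CSV loader performs: the header row supplies the keys, and each later row becomes one object. The parseCsvLine and parseCsv helpers are hypothetical stand-ins written for illustration, not D3's implementation, and they assume no escaped quotes or line breaks inside fields.

```javascript
// Hypothetical helper (not part of D3): split one CSV line into fields.
// Handles double-quoted fields that contain commas, but not escaped
// quotes ("") or newlines inside a field.
function parseCsvLine(line) {
  var fields = [];
  var current = '';
  var inQuotes = false;
  for (var i = 0; i < line.length; i++) {
    var ch = line[i];
    if (ch === '"') {
      inQuotes = !inQuotes;      // toggle quoted state and drop the quote
    } else if (ch === ',' && !inQuotes) {
      fields.push(current);      // an unquoted comma ends the field
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

// Hypothetical helper: turn CSV text into an array of row objects.
function parseCsv(text) {
  var lines = text.split('\n').filter(function (line) {
    return line.length > 0;
  });
  var header = parseCsvLine(lines[0]);
  return lines.slice(1).map(function (line) {
    var values = parseCsvLine(line);
    var row = {};
    header.forEach(function (key, i) { row[key] = values[i]; });
    return row;
  });
}

var csvText = [
  'Title,Air date,U.S. viewers (in millions)',
  '"Winter Is Coming","April 17, - 2011",2.22',
  '"The Kingsroad","April 24, - 2011",2.20'
].join('\n');

var rows = parseCsv(csvText);
console.log(rows[0].Title);                                 // Winter Is Coming
console.log(typeof rows[0]['U.S. viewers (in millions)']);  // string
```

Note that every value comes back as a string, including the viewer counts, so numeric columns need to be converted (for example with a unary +) before they are used in calculations.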

Loading JSON data


[
  {
    "Title": "Winter Is Coming",
    "Air date": "April 17, - 2011",
    "U.S. viewers (in millions)": 2.22
  },{
    "Title": "The Kingsroad",
    "Air date": "April 24, - 2011",
    "U.S. viewers (in millions)": 2.20
  },{
    "Title": "Lord Snow",
    "Air date": "May 1, - 2011",
    "U.S. viewers (in millions)": 2.44
  }
]

d3.json('episodes.json', function(err, data){
  // stop early if the file could not be loaded or parsed
  if(err) throw err;
  // data is the parsed array of episode objects
  console.log(data[0].Title); // => "Winter Is Coming"
  console.log(data[1].Title); // => "The Kingsroad"
  // etc...
});
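
One practical difference from CSV is that JSON keeps its value types. The sketch below uses the built-in JSON.parse on an inline string to stand in for the file contents; the assumption is that the data handed to the callback has the same shape as this parsed result.

```javascript
// Inline text standing in for the contents of episodes.json.
var jsonText = '[' +
  '{"Title": "Winter Is Coming", "U.S. viewers (in millions)": 2.22},' +
  '{"Title": "The Kingsroad", "U.S. viewers (in millions)": 2.20}' +
  ']';

var episodes = JSON.parse(jsonText);

console.log(episodes.length);                                   // 2
console.log(episodes[0].Title);                                 // Winter Is Coming
// Unlike CSV values, JSON numbers arrive as real numbers.
console.log(typeof episodes[0]['U.S. viewers (in millions)']);  // number
```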


Common Gotcha: Loading data from local file

Most modern browsers impose security restrictions that prevent HTML files opened from your computer from requesting other local files. This means that if you’ve just been double-clicking HTML files to open them in your browser, those pages won’t be able to request our locally stored data files. The restriction prevents users from mistakenly opening a downloaded HTML file and having it snoop on other files on their hard drive. The simplest workaround is to run a local web server: from the command line, run python -m SimpleHTTPServer 8080 (or python3 -m http.server 8080 with Python 3) in the directory you want to serve. We can then open our-file.html from that directory at localhost:8080/our-file.html in the browser.


Common Gotcha: JSON is a subset of JavaScript

JSON is a subset of JavaScript. This means we can copy and paste JSON into a JavaScript program without problems, but it does not mean we can paste arbitrary JavaScript into a JSON file. One key difference is that JSON object keys must be wrapped in double quotes, and so must string values. For example, { foo : 'bar' } is valid JavaScript but not valid JSON, while { "foo" : "bar" } is valid in both. JSON also cannot contain functions or Date objects.
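
These rules are easy to check with the built-in JSON object. The short sketch below uses made-up values for illustration and shows a strict parse succeeding, a JavaScript-style literal being rejected, and what happens to functions and dates when an object is serialized.

```javascript
// Valid JSON: double-quoted keys and string values.
var parsed = JSON.parse('{ "foo": "bar" }');
console.log(parsed.foo); // bar

// Valid JavaScript object literal syntax, but not valid JSON.
var invalidJson = "{ foo: 'bar' }";
var parseFailed = false;
try {
  JSON.parse(invalidJson);
} catch (e) {
  parseFailed = e instanceof SyntaxError;
}
console.log(parseFailed); // true

// On serialization, function properties are dropped and
// Date objects are written out as ISO 8601 strings.
var serialized = JSON.stringify({
  title: 'Winter Is Coming',
  aired: new Date(Date.UTC(2011, 3, 17)),
  log: function () {}
});
console.log(serialized);
// {"title":"Winter Is Coming","aired":"2011-04-17T00:00:00.000Z"}
```

Because dates survive only as strings, they have to be converted back into Date objects after loading.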

Working with data

Often the hardest part of data visualization is just getting the data we want to visualize into a meaningful format. Notice how in the examples above the dates are formatted as strings and contain a stray "-"? We’ll have a hard time using those dates in scales or plotting them along a timeline unless we convert them to Date objects. To do that, we can use D3’s time formatting helper, d3.time.format(). With it, we can create a date format object whose parse() method handles our funky dates.

// %B = full month name, %e = day of the month, %Y = four-digit year
var format = d3.time.format('%B %e, - %Y');
format.parse("April 17, - 2011"); // => a new Date object

There are many date formatting tokens you can use. Check out the full list on the D3 GitHub wiki page.

To reformat our data set so it contains actual Date objects, we can call our formatter on the “Air date” property of every element in our data array. In this step, we’ll also create a viewers property to replace U.S. viewers (in millions), since it’s best practice to keep our data in the most common or obvious units by default.

var format = d3.time.format('%B %e, - %Y');
data.forEach(function(datum){
  // replace the date string with a real Date object
  datum['Air date'] = format.parse(datum['Air date']);
  // add another property, `viewers`, in plain units
  datum.viewers = datum['U.S. viewers (in millions)'] * 1000000;
  // remove the `U.S. viewers (in millions)` property
  delete datum['U.S. viewers (in millions)'];
});
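
The same cleanup can be sketched without D3 to show what each step produces. Here parseAirDate is a hypothetical regex-based stand-in for the format.parse call above, and cleanEpisodes is an illustrative helper that returns new objects instead of mutating the input; neither is part of D3.

```javascript
var MONTHS = ['January', 'February', 'March', 'April', 'May', 'June', 'July',
              'August', 'September', 'October', 'November', 'December'];

// Hypothetical stand-in for d3.time.format('%B %e, - %Y').parse:
// turns a string such as "April 17, - 2011" into a local-time Date,
// or returns null when the string does not match.
function parseAirDate(text) {
  var match = /^([A-Za-z]+)\s+(\d{1,2}),\s*-\s*(\d{4})$/.exec(text.trim());
  if (!match) return null;
  var month = MONTHS.indexOf(match[1]);
  if (month === -1) return null;
  return new Date(+match[3], month, +match[2]);
}

// Hypothetical helper: build cleaned copies of the raw episode rows.
function cleanEpisodes(data) {
  return data.map(function (datum) {
    return {
      Title: datum.Title,
      'Air date': parseAirDate(datum['Air date']),
      // unary + converts CSV strings to numbers; Math.round guards
      // against floating-point noise from the multiplication
      viewers: Math.round(+datum['U.S. viewers (in millions)'] * 1000000)
    };
  });
}

var cleaned = cleanEpisodes([
  { Title: 'Winter Is Coming', 'Air date': 'April 17, - 2011',
    'U.S. viewers (in millions)': '2.22' },
  { Title: 'Lord Snow', 'Air date': 'May 1, - 2011',
    'U.S. viewers (in millions)': '2.44' }
]);

console.log(cleaned[0].viewers);                 // 2220000
console.log(cleaned[1]['Air date'].getMonth());  // 4 (May; months are zero-based)
```

Returning new objects keeps the raw rows intact, which can make it easier to re-run the cleanup while experimenting; mutating in place, as in the forEach version, avoids the extra copies.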