
Developing a D3.js Edge

7. Data Manager API

  • Introduce the data manager
  • Build a reusable component not bound to any visual element
  • Discuss custom events

The Data Manager


The source code is available in code/Chapter07/.


For our example, let's introduce the concept of a data manager. The data manager will be responsible for loading the data into the browser, applying any post-loading cleaning of the data that may be required, filtering the data based upon user interaction, and letting us know when the data is ready. The data manager also serves to demonstrate that the reusable concept can be applied to more than just graphs. It can also be used for any number of tasks that will be repeated throughout the visualization.

For our application, all of our reusable modules are namespaced under:

var d3Edge = {};

This will ensure that none of the modules that are introduced will conflict with other linked libraries.

To start, our first task is to set up the data manager module:

d3Edge.dataManager = function module() {
    var exports = {},
        dispatch = d3.dispatch('geoReady', 'dataReady', 'dataLoading'),
        data;
    d3.rebind(exports, dispatch, 'on');
    return exports;
};

Here, we have created a function under the d3Edge namespace called dataManager. This function returns an object called exports that will contain various methods defined later. To instantiate this module in our application, call the function and assign the returned object to a new variable:

var zurichDataManager = d3Edge.dataManager();

In addition, we added three custom events to this module that indicate when the various stages of the data loading process occur. To accomplish this, we use d3.dispatch to create three events: geoReady, dataReady, and dataLoading. Using d3.rebind, we expose the dispatcher's on method on the exports object, so listeners can be registered on the instantiated module. We can fire these events anywhere within our module at a time of our choosing by calling the method on dispatch that is named after the event. For instance, for the dataReady event:

dispatch.dataReady();

We can then listen for the event on the instantiated module like so:

zurichDataManager.on('dataReady', function(data) {
    // Do something with the data here.
});

This sets up the basic framework for the data manager module. Now we need to create methods on our exports object that can be used by our instantiated zurichDataManager. Our first method will load the CSV file from the server, apply a cleaning function, and fire an event to indicate that the data is ready. In addition, we will create a simple getter method that allows us to access the cleaned data at any time.

exports.loadCsvData = function(_file, _cleaningFunc) {
    // Create the csv request using d3.csv.
    var loadCsv = d3.csv(_file);
    // On the progress event, dispatch the custom dataLoading event.
    loadCsv.on('progress', function() {
        dispatch.dataLoading(d3.event.loaded);
    });
    // Issue an HTTP GET for the csv file, and work with the data.
    loadCsv.get(function (_err, _response) {
        // Apply the cleaning function supplied in the _cleaningFunc parameter.
        _response.forEach(function (d) {
            _cleaningFunc(d);
        });
        // Assign the cleaned response to our data variable.
        data = _response;
        // Dispatch our custom dataReady event passing in the cleaned data.
        dispatch.dataReady(_response);
    });
};
// Create a method to access the cleaned data.
exports.getCleanedData = function () {
    return data;
};

We first create our method loadCsvData by defining it on the exports object. This method accepts two parameters: _file and _cleaningFunc. This allows us to easily pass a different file path and cleaning function for each instantiation of the module. This is important because our data sets do not always contain the same column headers, so the cleaning function must be tailored to each data set. Once the response arrives asynchronously and the cleaning function has been applied to every row, we assign the result to our variable data. Our method getCleanedData simply returns the variable data when invoked, thus providing easy access to the cleaned data for use or inspection.
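The loadCsvData listing assumes that the request succeeds; if it fails, the response argument is not an array and calling forEach on it would throw. As a hedged sketch, the body of the get callback could be pulled into a small function that checks the error argument first. cleanRows is a hypothetical helper that is not part of the original module, and the sample rows are made up for illustration.

```javascript
// Hypothetical helper: mirrors the body of the loadCsv.get callback,
// but refuses to touch the response when the request failed.
function cleanRows(err, rows, cleaningFunc) {
    if (err || !Array.isArray(rows)) {
        return null; // Nothing to clean; the caller decides how to report it.
    }
    rows.forEach(function (d) {
        cleaningFunc(d); // The cleaning function mutates each row in place.
    });
    return rows;
}

// Usage with made-up rows; real rows would come from d3.csv.
var sample = [{ DELAY_MIN: '1.5' }, { DELAY_MIN: '0' }];
var cleaned = cleanRows(null, sample, function (d) {
    d.DELAY = +d.DELAY_MIN;
    delete d.DELAY_MIN;
});

// A failed request: an error object and no response.
var failed = cleanRows(new Error('request failed'), undefined, function () {});
```

Inside the module, the callback could then assign data and dispatch dataReady only when the helper returns a non-null result.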

For the three data sets we instantiate and load the data like so:

// Instantiate our data manager module for each city.
var sanFranciscoDataManager = d3Edge.dataManager(),
    zurichDataManager = d3Edge.dataManager(),
    genevaDataManager = d3Edge.dataManager();

// Load our Zurich data, and supply the cleaning function.
zurichDataManager.loadCsvData('../__data/zurich/zurich_delay.csv', function(d){
  var timeFormat = d3.time.format('%Y-%m-%d %H:%M:%S %p');
  d.DELAY = +d.DELAY_MIN;
  delete d.DELAY_MIN;
  d.SCHEDULED = timeFormat.parse(d.SCHEDULED);
  d.LATITUDE = +d.LATITUDE;
  d.LONGITUDE = +d.LONGITUDE;
  d.LOCATION = [d.LONGITUDE, d.LATITUDE];
});

// Load our Geneva data, and supply the cleaning function.
genevaDataManager.loadCsvData('../__data/geneva/geneva_delay_coord.csv', function(d){
  var timeFormat = d3.time.format('%Y-%m-%d %H:%M:%S %p');
  d.DELAY = +d.DELAY;
  d.SCHEDULED = timeFormat.parse(d.SCHEDULED);
  d.LATITUDE = +d.LATITUDE;
  d.LONGITUDE = +d.LONGITUDE;
  d.LOCATION = [d.LONGITUDE, d.LATITUDE];
});

// Load our San Francisco data, and supply the cleaning function.
sanFranciscoDataManager.loadCsvData('../__data/san_francisco/san_francisco_delay.csv', function(d){
  var timeFormat = d3.time.format('%Y-%m-%d %H:%M:%S %p');
  d.DELAY = +d.DELAY_MIN;
  delete d.DELAY_MIN;
  d.SCHEDULED = timeFormat.parse(d.SCHEDULED);
  d.LATITUDE = +d.LATITUDE;
  d.LONGITUDE = +d.LONGITUDE;
  d.LOCATION = [d.LONGITUDE, d.LATITUDE];
});

In the code above we first instantiate our reusable data manager for each city. We then load and clean the data for each city by passing in the URL for the CSV, along with the cleaning function for the respective cities. For each data file the headers are slightly different, thus complicating the cleaning operation. Luckily, our API is flexible enough to accommodate this by allowing us to pass in separate cleaning functions for each city. If we call our getter method, getCleanedData, we should see correctly formatted data for each city.

// Call the getCleanedData method to inspect the first element of the data.
zurichDataManager.getCleanedData()[0];
// Example output showing parsed numbers and new LOCATION data field.
{
    "ROUTE_ID": "305",
    "ROUTE_NAME": "2",
    "STOP_ID": "2251",
    "STOP_NAME_SHORT": "DEP4",
    "STOP_NAME": "Zürich, Depot 4 Elisabethenstr",
    "SCHEDULED": "2012-10-01T11:00:00.000Z",
    "LONGITUDE": 8.521632222222221,
    "LATITUDE": 47.37333861111111,
    "DELAY": 0.18,
    "LOCATION": [ 8.521632222222221, 47.37333861111111 ]
}
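The effect of a cleaning function on a single row can also be checked in isolation, without loading any file. The sketch below covers only the numeric part of the Zurich cleaning function; the date parsing is left out because it depends on d3.time.format. cleanZurichRow is a hypothetical name, and the raw row is an assumption: it reuses values from the example output, written as strings because a CSV parser delivers every field as a string.

```javascript
// Numeric part of the Zurich cleaning function (date parsing omitted,
// since it needs d3.time.format). The row is mutated in place.
function cleanZurichRow(d) {
    d.DELAY = +d.DELAY_MIN;       // Unary plus coerces the string to a number.
    delete d.DELAY_MIN;           // Drop the original column name.
    d.LATITUDE = +d.LATITUDE;
    d.LONGITUDE = +d.LONGITUDE;
    d.LOCATION = [d.LONGITUDE, d.LATITUDE]; // Longitude first, then latitude.
    return d;
}

// Hypothetical raw row: every field is still a string at this point.
var raw = {
    STOP_ID: '2251',
    LONGITUDE: '8.521632222222221',
    LATITUDE: '47.37333861111111',
    DELAY_MIN: '0.18'
};

var row = cleanZurichRow(raw);
console.log(typeof row.DELAY, row.LOCATION);
```

Fields that the cleaning function does not mention, such as STOP_ID here, stay as strings.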

We now have our stop data loaded into the browser, cleaned, and easily accessible. The next step is to do the same for the geometric data. In our case, the geometric data comes in the GeoJSON format for both the routes and the stops. Therefore, we are able to use the same method to load these data files and simply pass in a callback that can be tailored to fit the needs of each file once we receive the response. To do this, we are going to define another method on the exports object called loadgeoJSON. This method will accept two arguments: _file and _callback. The _file argument is simply the path to the data on our server. The _callback argument is the custom callback that will be executed asynchronously for each data manager once the data file has finished loading. We construct this method like so:

// Create a method to load the geoJSON file, and execute a custom callback on response.
exports.loadgeoJSON = function(_file, _callback) {
    // Load json file using d3.json.
    d3.json(_file, function (_err, _data) {
        // Execute the callback, passing in the data.
        _callback(_data);
    });
};

We can invoke this method by simply calling the loadgeoJSON method on our instantiated data managers for each city.

// Load the routes data and pass in the callback to be executed once the data loads.
zurichDataManager.loadgeoJSON('./data/zurich/routes_topo.json', function (data) {
    // Do something with the data via a callback.
});
// Load the stops data and pass in the callback to be executed once the data loads.
zurichDataManager.loadgeoJSON('./data/zurich/stops_geo.json', function (data) {
    // Do something with the data via a callback.
});
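To follow the callback flow of loadgeoJSON without a server, the loader can be swapped for a stub that has the same (url, callback) shape as d3.json. This is only a sketch under stated assumptions: stubJson and makeLoadGeoJSON are hypothetical names, the file names are made up, and the stub answers synchronously from an in-memory table, whereas a real request completes asynchronously.

```javascript
// Hypothetical stub with the same (url, callback) shape as d3.json.
// It answers from an in-memory table instead of the network.
function stubJson(files) {
    return function (url, callback) {
        if (Object.prototype.hasOwnProperty.call(files, url)) {
            callback(null, files[url]);
        } else {
            callback(new Error('not found: ' + url));
        }
    };
}

// Same body as exports.loadgeoJSON, with the loader passed in for testing.
function makeLoadGeoJSON(jsonLoader) {
    return function (_file, _callback) {
        jsonLoader(_file, function (_err, _data) {
            // As in the original, the error argument is not inspected.
            _callback(_data);
        });
    };
}

var loadgeoJSON = makeLoadGeoJSON(stubJson({
    'stops.json': { type: 'FeatureCollection', features: [] }
}));

var seen = [];
// A file that exists: the callback receives the parsed object.
loadgeoJSON('stops.json', function (data) {
    seen.push(data && data.type);
});
// A file that does not exist: the callback receives undefined.
loadgeoJSON('missing.json', function (data) {
    seen.push(data);
});
```

Because the method forwards only the data argument, a callback that may run after a failed request should check that data is defined before using it.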

Summary

We now have our data in the browser, we have cleaned it, and we have created custom events to indicate when this whole process is complete! We can now begin the process of bringing this data to life, which we will do in the next chapter.
