Scaling time and space

Almost all data visualizations are about expressing change or difference over one or more dimensions. What distinguishes each visualization is what those dimensions are, and the units used. Are they a literal conversion of world space to display space? That's a map. Add another dimension, via color space, and now you have a choropleth. Change the units to electoral votes instead of geographic area, and now you have a tile grid map.

A universal feature of data visualizations, therefore, is figuring out the scaling function for your data: how do we transform data into a given dimension? Many frameworks will do this for you, but we're going to do it from scratch here. And we're going to do so in a very particular way: we're going to "normalize" our data range, creating a function that takes a value and returns a number in the range from 0 to 1, with 0 being "the low end of our data set" and 1 being "the highest value in our data set."

The formula for a normalized linear scale is (value - minimum) / (maximum - minimum). For example, let's imagine that we're charting the murder rate for a given city over time. At its highest, the murder rate was 12.5 homicides per 100,000 people, and the lowest point was 3.6. If we set those as the bounds for our scale, the function to give us our scaled value is:

var scaleMurders = value => (value - 3.6) / (12.5 - 3.6);

Essentially, we're writing a function to answer the question "given a number line stretching from the minimum to the maximum, where do we find this value?" Of course, writing individual scaling functions by hand can be frustrating, especially if our visualization has more than one axis. We might be better off writing a factory, which takes in a high and low value and returns a custom scaling function:

var scaleFactory = function(low, high) {
  return function(value) {
    return (value - low) / (high - low);
  };
};

var scaleMurders = scaleFactory(3.6, 12.5);
scaleMurders(9.2); //0.6292...

Why do we use 0 and 1 as the limits of our range? Basically, the math is convenient. Once your value is normalized, you can convert it to another range via the formula value * range + minimum. You might want to graph a value on a 600px wide canvas, but reserve the left 20px for drawing the axis. No problem: multiply your normalized value by 580 and add 20 to get the x coordinate in pixels. Normalization, therefore, creates useful intermediate values for going between "data space" and "screen space."
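As a rough sketch of that second conversion, here is the 600px example worked through in code. The rangeFactory name and its arguments are made up for illustration and are not part of the chapter's code; it simply applies value * range + minimum.

```javascript
// Hypothetical helper: map a normalized value (0 to 1) onto an output range.
var rangeFactory = function(minimum, maximum) {
  var range = maximum - minimum;
  return function(normalized) {
    return normalized * range + minimum;
  };
};

// The normalized scale from earlier in the chapter
var scaleMurders = value => (value - 3.6) / (12.5 - 3.6);

// A 600px canvas with the left 20px reserved for the axis
var toPixels = rangeFactory(20, 600);

toPixels(0); // 20
toPixels(1); // 600
toPixels(scaleMurders(9.2)); // roughly 385
```

Keeping the two steps separate means the same scaleMurders function can feed a pixel position, a bar height, or a color ramp without being rewritten.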

A brief digression into the Math object

But how do we figure out the min and the max of the data in the first place? One complication of data visualizations in the browser is that they're often fed with an array of objects—say, a series of rows from a spreadsheet, or a JSON dump from a database. We may also want to change the property that we use as the basis for our chart based on user input, like changing the display from murder rate to robberies. So here's a trick for getting the minimum and maximum quickly.

First, let's use the map() function to convert an array of objects into an array of values. A map is just a transformation from one list to another, based on a function you pass in. The following function takes an array of objects with many properties and converts it to just the values for the murder property:

//cityData is an array of row objects
var values = cityData.map(d => d.murder);

If we need to change the field that we're using for our chart values, we can change the code a little bit so that it's a reusable function, looking up the property with bracket notation ([]) instead of the dot operator:

var getValues = function(mode) {
  return cityData.map(d => d[mode]);
};

//get murder data
var values = getValues("murder");

//get robbery data instead
values = getValues("robbery");

Now the clever part: instead of looping through those values and figuring out the minimum and maximum by hand, we'll use JavaScript's built-in math functions. Math.min() and Math.max() accept an arbitrary number of arguments, and will then tell you the lowest and highest value from those arguments. The spread operator, which we used in the last chapter to convert a list-like object into an array, also works here to distribute our values across the inputs of a function.

var min = Math.min(...values);
var max = Math.max(...values);
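Putting those pieces together, one possible arrangement is a single function that builds a scale for whichever property the user picks. The sample rows and the scaleFor name below are invented for illustration; real data would come from your spreadsheet or database.

```javascript
// Hypothetical sample rows, standing in for real spreadsheet data
var cityData = [
  { year: 2010, murder: 12.5, robbery: 310 },
  { year: 2011, murder: 9.2, robbery: 275 },
  { year: 2012, murder: 3.6, robbery: 290 }
];

// Hypothetical helper: build a normalized scale for one property
var scaleFor = function(rows, mode) {
  var values = rows.map(d => d[mode]);
  var min = Math.min(...values);
  var max = Math.max(...values);
  return value => (value - min) / (max - min);
};

var scaleMurders = scaleFor(cityData, "murder");
scaleMurders(3.6); // 0
scaleMurders(12.5); // 1

var scaleRobberies = scaleFor(cityData, "robbery");
scaleRobberies(275); // 0
scaleRobberies(310); // 1
```

Two caveats worth keeping in mind: with an empty array, Math.min() returns Infinity and Math.max() returns -Infinity, and spreading a very large array can exceed an engine's argument limit, in which case a plain loop is the safer choice.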

Sparklines in a flash

A sparkline is a tiny graphic that's intended to be displayed inline, and illustrate a trend. You might use it in a business story, for example, to visualize the stock price of a company next to its ticker symbol. Let's use our scaling function to dynamically create sparklines for a story. First, we'll start with some quarterly data for each company:

var stocks = {
  ACME: [20.4, 30.2, 25.5, 14.9, 7.2],
  BRDX: [14.1, 15.1, 16.3, 18.0, 17.7]
};

And in our markup, we'll create placeholders for our script to find, using spans with a "ticker" class:

<p>
  Meanwhile, Acme Products (<span class="ticker">ACME</span>) continues its
  plunge, due mostly to the loss of the valuable "coyote" market.
</p>

We'll also need to style our sparkline blocks, so that they're visible:

.sparkline {
  display: inline-block;
}

.sparkline-block {
  display: inline-block;
  background: orange;
  border-top: 1px solid darkorange;
  width: 4px;
}

Finally, we'll look through the page (using our deconstructed jQuery) for our ticker spans, create a scaling function, and then add a series of sparkline blocks to the span. The height of each block will be the normalized value multiplied by the maximum height. In this case, we'll use 2ex as that maximum: an "ex" is the height of a lowercase "x" character, so two of them is a decent stand-in for the total height of a line of text.

$(".ticker").forEach(function(span) {
  // get the ticker symbol
  var symbol = span.innerHTML.trim();
  var data = stocks[symbol];
  // if we don't have data, exit early
  if (!data) return;
  // create our scaling function
  var min = Math.min(...data);
  var max = Math.max(...data);
  var scale = value => (value - min) / (max - min);
  //now create our sparkline blocks
  var sparkHTML = ' <div class="sparkline">';
  data.forEach(function(d) {
    sparkHTML += `<div class="sparkline-block" style="height: ${scale(d) * 2}ex"></div>`;
  });
  sparkHTML += "</div>";
  span.innerHTML += sparkHTML;
});

Here it is in action:

Meanwhile, Acme Products (ACME) continues its plunge, due mostly to the loss of the valuable "coyote" market.
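One edge case the loop above doesn't handle is a flat series: if every value is the same, max - min is 0 and the scale returns NaN, which produces an invalid height. The sketch below pulls the markup generation into a standalone function with a guard for that case. The sparklineHTML name, and the choice to draw flat data at half height, are assumptions made for illustration.

```javascript
// Hypothetical helper: build sparkline markup from an array of numbers.
var sparklineHTML = function(data, maxHeight) {
  var min = Math.min(...data);
  var max = Math.max(...data);
  var spread = max - min;
  // Assumption: a flat series is drawn at half height instead of dividing by zero
  var scale = value => spread === 0 ? 0.5 : (value - min) / spread;
  var blocks = data.map(function(d) {
    var height = (scale(d) * maxHeight).toFixed(2);
    return `<div class="sparkline-block" style="height: ${height}ex"></div>`;
  });
  return `<div class="sparkline">${blocks.join("")}</div>`;
};

// A flat series still gets visible blocks
sparklineHTML([5, 5, 5], 2);
```

Because the function only returns a string, it can be checked without a browser, and the ticker loop would then just append its result to each span.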

You'll see scaling functions like this used again and again throughout the book, so you'll have plenty of practice using them. And in space, they're fairly intuitive, since each result is basically just a position on a number line between two points. But what about using them to express change over time?

Tweens and easing

Let's say that we want to animate between two values over time. Ideally, for visual changes, we should do that with a CSS transition, which will let the browser take care of it for maximum performance. But some values can't be transitioned in CSS, or we might be animating something in a canvas tag (which means there's no CSS to transition). In those cases, we'll write a "tween" (as in "between") to handle the animation. And at its core, a tween is just a scaling function that maps time to space.

Let's demonstrate this with a simple canvas animation. When you click one of the buttons below, it'll animate a circle moving from left to right, either in a linear fashion or using a "swing" that accelerates and decelerates smoothly.

To create this animation, we'll first define an outer animate function that gets called when you click a button, and records the current timestamp. We're going to get the time from performance.now(), which gives us the number of milliseconds since the page started loading. This is the same kind of timestamp that's passed to requestAnimationFrame callbacks, so we don't have to worry about any conversion mismatch. We'll also store a "mode" to figure out which button you clicked.

var animate = function() {
  var start = performance.now();
  var mode = this.dataset.mode;
};

Now, inside of animate, we'll add a "draw" function, which will be called during every frame of the animation. We're going to schedule it using the requestAnimationFrame function, so it'll be passed the current time as its argument. We can compare that time to the start time we recorded when animate was first called, to get the number of elapsed milliseconds:

var animate = function() {
  var start = performance.now();
  var mode = this.dataset.mode;
  
  var draw = function(t) {
    var elapsed = t - start;
    // clamp to 1 as the max value
    var dt = Math.min(elapsed / 1000, 1);
    /* ... */
  };
};

This animation will run for 1000 milliseconds, and we want to know how far through that we are now (remember, this function will be called over and over again throughout the animation). So our scaling equation above is number of milliseconds / duration, or elapsed / 1000. This is basically the same as our normalized scale, but we don't have to subtract a minimum from either side for this version of the equation, because unless you have a time machine available, the minimum is always 0ms.

If the scaled value goes over 1, we'll "clamp" the value to that maximum—because we're not in control of the browser's framerate, it's almost certain that our function won't be called exactly at 1000 milliseconds, and if it gets called late we don't want the animation to overshoot its end point.
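To see the clamp on its own, here is a small sketch that separates the time scaling from the drawing code. The progress function and its parameters are hypothetical names; the lower clamp at 0 is an added assumption, a cheap safeguard in case a frame timestamp ever comes in earlier than the recorded start.

```javascript
// Hypothetical helper: normalized progress through an animation
var progress = function(start, now, duration) {
  var elapsed = now - start;
  // clamp to 1 as the max value
  var dt = Math.min(elapsed / duration, 1);
  // Assumption: also clamp at 0, in case a timestamp precedes the start
  return Math.max(dt, 0);
};

progress(5000, 5250, 1000); // 0.25
progress(5000, 6016, 1000); // 1, even though the frame arrived 16ms late
progress(5000, 4998, 1000); // 0
```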

Now we'll continue the function by converting the scaled time, dt, into an x coordinate that we can use to draw a circle. Just as with a chart, we can scale our position by multiplying the normalized dt by the maximum value in the animation (canvas.width).

var draw = function(t) {
  var elapsed = t - start;
  // clamp to 1 as the max value
  var dt = Math.min(elapsed / 1000, 1);
  // add a little optional easing
  if (mode == "swing") dt = swing(dt);
  // convert dt to position
  var cx = dt * canvas.width;
  var cy = canvas.height / 2;
  //draw our circle
  context.clearRect(0, 0, canvas.width, canvas.height);
  context.beginPath();
  context.arc(cx, cy, cy, 0, Math.PI * 2);
  context.fill();
  // schedule the next frame if we're not done
  if (dt < 1) {
    requestAnimationFrame(draw);
  }
};

At the end of the function, we check our scaled time value. Remember, we clamped the time value, so even if the browser calls draw() a little late, it'll still be limited to 1 as a max value. If dt is less than 1, we're still in the middle of the animation, and we'll schedule our drawing function for another tick using requestAnimationFrame.

Did you notice the "swing" check in the middle there? That's a reference to an easing function, which tweaks our normalized value into an acceleration curve. Without it, dt moves linearly from 0 to 1, which tends to feel "robotic" in motion. The swing function is taken from jQuery, and uses cosine to make our circle move slower at the beginning and end.

var swing = p => 0.5 - Math.cos(p * Math.PI) / 2;

There are a lot of easing functions, and each of them has a different feel. Many of them are also defined in CSS, as options for the "transition" property. When doing animations, you should almost always apply an easing function to give movements a feeling of physical weight. The MDN page on transition-timing-function has more details, and demonstrations so you can see what each easing value looks like in CSS.
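To get a feel for how these curves differ, it can help to print a few sample points. The sketch below compares swing with two common quadratic curves; the easeIn and easeOut names are chosen here for illustration and don't come from jQuery. Every easing function takes a normalized value and returns one, mapping 0 to 0 and 1 to 1, and only the path in between changes.

```javascript
var swing = p => 0.5 - Math.cos(p * Math.PI) / 2;
// Hypothetical names for two common quadratic curves
var easeIn = p => p * p;        // slow start, fast finish
var easeOut = p => p * (2 - p); // fast start, slow finish

// Log each curve at five evenly spaced points in time
[0, 0.25, 0.5, 0.75, 1].forEach(function(p) {
  console.log(p, swing(p).toFixed(3), easeIn(p).toFixed(3), easeOut(p).toFixed(3));
});
```

In the output, swing stays behind the linear value in the first half and ahead of it in the second, which is what makes the motion feel like it speeds up and then slows down.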

All together, our animate function looks like this:

var animate = function() {
  var start = performance.now();
  var mode = this.dataset.mode;

  var draw = function(t) {
    var elapsed = t - start;
    // clamp to 1 as the max value
    var dt = Math.min(elapsed / 1000, 1);
    // add a little optional easing
    if (mode == "swing") dt = swing(dt);
    // convert dt to position
    var cx = dt * canvas.width;
    var cy = canvas.height / 2;
    //draw our circle
    context.clearRect(0, 0, canvas.width, canvas.height);
    context.beginPath();
    context.arc(cx, cy, cy, 0, Math.PI * 2);
    context.fill();
    // schedule the next frame if we're not done
    if (dt < 1) {
      requestAnimationFrame(draw);
    }
  };

  requestAnimationFrame(draw);
};

Clicking a button calls this function, and triggers the animation. In this case, it just moves a circle around. But it should be easy to imagine using a similar function to animate points between two positions on a scatterplot, or to animate the heights of a bar chart in response to a filter. I've used it to tween the position of a camera for a 3D visualization. You can even use a similar technique to transition the RGB components of two colors, creating a color fade on a canvas. The possibilities are endless—until we normalize them, at least.
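As one sketch of the color fade mentioned above, the same value * range + minimum formula can be applied to each RGB channel in turn. The lerp and blendColors names are hypothetical, and colors are assumed to be stored as [r, g, b] arrays.

```javascript
// Hypothetical helper: linear interpolation between two numbers
var lerp = (from, to, t) => from + (to - from) * t;

// Hypothetical helper: blend two [r, g, b] arrays at normalized time t
var blendColors = function(a, b, t) {
  var channels = a.map((channel, i) => Math.round(lerp(channel, b[i], t)));
  return `rgb(${channels.join(", ")})`;
};

var orange = [255, 165, 0];
var blue = [0, 0, 255];

blendColors(orange, blue, 0); // "rgb(255, 165, 0)"
blendColors(orange, blue, 0.5); // "rgb(128, 83, 128)"
blendColors(orange, blue, 1); // "rgb(0, 0, 255)"
```

Inside a draw function, the resulting string could be assigned to context.fillStyle on each frame, with dt (eased or not) passed in as t.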