Exploratory or Presentation? Visualization serves two masters
Delaney Turner | Delaney.Turner@ca.ibm.com | Tags: information-insights ibmsoftware
The following is the first of a new six-part series on Advanced Data Visualization. Over the next three months, IBM visualization experts will explore new and emerging visual techniques and the underlying technologies you can deploy to better understand your data and turn insights into better business outcomes.
Graham Wills is the lead architect for IBM’s visualization engine. He has two decades of experience in research and implementation of visualization systems in areas including statistical models, geo- and temporal visualization, large-scale networks and coordinated views. He has published widely in the field and his recent book, Visualizing Time, is currently available on Amazon.
Visualization is an enabling technology – when we create a set of charts to show some data, our goal is not to create a pretty chart for its own sake, but rather to reveal something in the data.
When we look at beautiful hand-drawn pictures of data, carefully composed by talented individuals, we are drawn to the artistic side. In some ways, those charts are discouraging; their artistic elegance implies that the creation of good visualizations is not an option for most of us.
There are books that provide rules and advice on how to draw graphs. Some give general advice, suggesting that such and such is good, but this other is bad. Others give specific advice such as requiring all charts to have a title, or all axes to go to zero, but these are often tied to specific visualizations, and so are not general enough to qualify as scientific principles. So this leads to a question – what makes a good visualization? Is it the quality of the presentation, or is it the degree to which it allows people to explore and understand the data?
Over the years, this split has led people to label charts and put them into categories: tables and pie charts are presentation charts, and anyone wanting to explore their data should not use them; scatterplots are exploratory charts, to be hidden from anyone who isn’t a data geek. Different tools evolved that concentrated on one of these aspects: presentation graphics packages emphasized a vast amount of customization on a small subset of simple charts, while exploratory graphics packages allowed very little customization but often had a wide and eclectic set of charts.
Exploratory/presentation split no longer useful
Now, in contrast, there is a much stronger emphasis on chart building tools, whether based on programming libraries or language descriptions (such as IBM’s Grammar of Graphics-based approach). My strong feeling is that the exploratory/presentation split is no longer a useful one; visualization tools can serve both presentation and exploratory goals. In fact, I would argue that they must do so.
Consider the figure below. This shows box office take for movies in 2008. It has presentation aspects: it highlights major effects, looks attractive and is effective as a static image. But it also facilitates exploratory tasks. Not only can we see the big movies and when they were released (summer and Thanksgiving effects are very obvious), we can also browse through the shapes and see more subtle details, such as how “The Dark Knight” hit its peak rapidly, whereas “Juno” had much longer legs. We could switch color from being simply a way to differentiate movies to instead encoding movie type: action, drama, comedy, etc.
It would also be interesting to use a small multiples or time animation approach here and show the same chart for several years (I suspect 2012 will look very similar: swap “The Avengers” in for “Iron Man” and “The Dark Knight Rises” for “The Dark Knight”). We could explore different aspects of movie releases with simple enhancements to this chart.
This chart is based on a relatively “tech-y” statistical technique, kernel density estimation, composed with a stacking operation similar to the one that makes stacked bar and area charts. It is then wrapped around into a circle using a polar transformation that was originally developed for the canonical presentation graphic, the much-maligned pie chart.
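The composition described above (density estimate, then stacking, then a polar transformation) can be sketched in a few lines of Python. This is a minimal illustration only, not IBM's implementation: the function names, the Gaussian kernel, the bandwidth, and the two invented films with made-up weekly takes are all assumptions. It also ignores wrap-around at the year boundary, which a real circular chart would probably handle with a periodic kernel.

```python
import math

def gaussian_kde(samples, weights, grid, bandwidth):
    """Weighted Gaussian kernel density estimate evaluated at each grid point."""
    norm = 1.0 / (bandwidth * math.sqrt(2 * math.pi))
    return [
        sum(w * norm * math.exp(-0.5 * ((x - s) / bandwidth) ** 2)
            for s, w in zip(samples, weights))
        for x in grid
    ]

def stack(layers):
    """Turn per-series heights into (lower, upper) bands, as in a stacked area chart."""
    baseline = [0.0] * len(layers[0])
    bands = []
    for layer in layers:
        top = [b + h for b, h in zip(baseline, layer)]
        bands.append(list(zip(baseline, top)))
        baseline = top
    return bands

def to_polar(grid, band, period, inner_radius):
    """Wrap a band around a circle: position in the year -> angle, height -> radius."""
    points = []
    for x, (lo, hi) in zip(grid, band):
        theta = 2 * math.pi * x / period
        points.append((theta, inner_radius + lo, inner_radius + hi))
    return points

# Hypothetical data: (weeks of the year, weekly take in made-up units) per film.
movies = {
    "Film A": ([20, 21, 22, 23], [10.0, 6.0, 3.0, 1.0]),          # rapid peak
    "Film B": ([40, 42, 44, 46, 48], [2.0, 3.0, 3.0, 3.0, 2.0]),  # long legs
}
grid = list(range(1, 53))  # one grid point per week

# Step 1: smooth each film's takings; step 2: stack; step 3: wrap into a circle.
layers = [gaussian_kde(weeks, take, grid, bandwidth=1.5)
          for weeks, take in movies.values()]
bands = stack(layers)
polar_bands = [to_polar(grid, band, period=52, inner_radius=1.0) for band in bands]
```

Each entry of `polar_bands` is a list of (angle, inner radius, outer radius) triples that a drawing library could render as a filled ring segment. Because the three steps are separate functions, any one of them can be swapped out, for example dropping `to_polar` to get an ordinary stacked area chart.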
This concept of composability is central to bridging the divide between presentation and exploratory graphics. A feature that might be thought of as presentation-only can be used as an exploratory tool. And limiting your graphics to a small set of presentation charts is a recipe for failure in a world where data is growing not only in volume but also in variety.
Even the most staid and low-tech visualizations can benefit from composing in exploratory aspects. The table shown below gives the percentage of canceled flights by day of year over a period of twenty years (so the top-left cell lets us know that 2.1% of flights on January 1 were canceled).
Add shading to tables to aid exploration
A table is a good, simple way to represent the data. It is helpful to see the numbers so we can see that July 4th is a great day to fly, but the numbers themselves do not help us explore and find patterns. So this table has been enhanced with exploratory features: we shade the cells by the cell data (which makes it easier to see the strong difference between November and December, for example) and, because upper-end outliers are of particular interest, we highlight those cells that are statistically significant with a border.
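As a rough sketch of the two enhancements just described, the Python below assigns each cell a gray level from its value and flags unusually high cells for a border. The article does not say how statistical significance was determined, so the z-score threshold here is a stand-in assumption, and the function name `shade_cells` and the small table of percentages are invented for illustration.

```python
from statistics import mean, pstdev

def shade_cells(table, z_threshold=2.0):
    """Attach a gray level and an outlier flag to every cell of a numeric table."""
    values = [v for row in table for v in row]
    lo, hi = min(values), max(values)
    mu, sigma = mean(values), pstdev(values)
    styled = []
    for row in table:
        styled_row = []
        for v in row:
            # 1.0 is white (lowest value), 0.0 is black (highest value).
            gray = 1.0 - (v - lo) / (hi - lo) if hi > lo else 1.0
            # Assumed rule: flag only upper-end outliers, by z-score.
            flagged = sigma > 0 and (v - mu) / sigma > z_threshold
            styled_row.append({"value": v, "gray": round(gray, 2), "border": flagged})
        styled.append(styled_row)
    return styled

# Made-up cancellation percentages (rows and columns are arbitrary here).
table = [
    [2.1, 1.8, 1.9, 2.0],
    [1.2, 1.1, 0.9, 1.0],
    [1.5, 1.4, 9.5, 1.6],
]
styled = shade_cells(table)
```

The output keeps the original number in every cell, so the result is still a readable table; the `gray` and `border` fields are extra layers that a renderer for print or screen can apply on top.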
I have deliberately kept this a black-and-white chart rather than use color (which is more effective) to show that even in a very constrained situation such as a printed report, visualization can successfully merge exploratory and presentation techniques and improve the ability of people to do what they do best – see important features of their data and take action on it.
That is the heart of what visualization is for. A beautiful chart can be appreciated, but when we have beautiful charts that allow people to see their data and take action based on it, we have merged art and science to provide truly useful visualizations. Allowing exploratory and presentation features to be composed is a key feature that makes this possible; it is the future of visualization.
Continue exploring visual analytics on IBM Many Eyes
Visit IBM’s hub of visual analytics, IBM Many Eyes, and join over 100,000 like-minded visualization enthusiasts, academics and professionals. The Many Eyes web community democratizes data visualization by providing a simple three-step process to create and interact with a visualization using your data set. You can then share or embed your visualization across the web or your social network.