Data Vis 101 with Lynn Cherny

The DST4L class of November 4 featured the return of instructor Lynn Cherny, whom we last saw at the end of August when she taught a class on data manipulation and graphing in Excel.

Lynn started her lecture by quoting a reporter she was talking to earlier this year, who said, “Data visuals are the sports page of ‘big data.’”

In a way, it’s true.  Everybody likes data visualizations.  The combination of eye-catching graphics and enough numbers to tell a story quickly and clearly is hard to resist.  And if they’re interactive, even better.

This class featured much more lecture and class discussion than previous classes, and no programming at all.  Lynn talked about the principles of data visualization: what makes a good visualization and what doesn’t.  We were introduced to Anscombe’s Quartet and Hans Rosling’s bubble race.  Lynn spent a lot of time talking about the importance of choosing the right kind of graph.  Every chart type has strengths and weaknesses, and while there is usually more than one right answer, sometimes there is a clearly wrong one.
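We did no programming in class, but for anyone curious why Anscombe’s Quartet makes the case for visualization, here is a minimal sketch in Python.  It assumes the standard published values of the four datasets; the names ANSCOMBE, pearson, and summarize are my own, chosen for illustration.  It computes the usual summary statistics for each dataset, which come out nearly identical (mean of x is 9, mean of y is about 7.5, correlation is about 0.82) even though the four scatterplots look nothing alike.

```python
from statistics import mean, variance

# The three x-columns shared by datasets I, II, and III.
X_SHARED = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]

# Anscombe's four datasets as (x values, y values) pairs.
ANSCOMBE = {
    "I": (X_SHARED,
          [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X_SHARED,
           [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X_SHARED,
            [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, written out by hand."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    spread_x = sum((x - mx) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def summarize(xs, ys):
    """Summary statistics that a table would report for one dataset."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),   # sample variance
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarize(xs, ys) for name, (xs, ys) in ANSCOMBE.items()}

if __name__ == "__main__":
    for name, stats in SUMMARIES.items():
        line = ", ".join(f"{key}={value:.2f}" for key, value in stats.items())
        print(f"Dataset {name}: {line}")
```

The numbers alone would suggest four copies of the same dataset; only plotting them reveals a straight-line trend, a curve, a line with one outlier, and a vertical stack with one extreme point.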

Good visualizations take advantage of the human perceptual system—our amazing ability to quickly pick out differences in pattern, size, color, shading, and position.  Our perception has limitations, though, and good designers know not to rely too much on pure color, color saturation, shading, area, and volume.  Bar charts are the easiest for people to read and interpret, but a world with only bar charts has been called a world without joy.

One of the examples Lynn used to show data presented in a less-than-clear manner is the infamous Nightingale Rose.

[Figure: Nightingale Rose diagram]

We didn’t spend too much time on this graphic in class except to compare the same data presented as a more traditional line graph and muse about how much clearer it was than the circular diagrams above.  After class, though, there was a veritable email after-party on this topic.  Lynn emailed the class to let us know that the line chart remake had a data error.  The chart maker had not gone back to the original data but had instead relied on someone else’s remake for the numbers.  Tom Morris chimed in with more links to articles about this type of graph (also known as a coxcomb) as well as a link to a fascinating article about Nightingale’s possible reason for choosing this type of graph in the first place.  It was an excellent lesson in how libraries and librarians can add value to data vis by investigating provenance and data quality.

Overall, an engaging class on a subject brand new to most of us.

Lynn was generally dismissive of pie charts, but I can say that for accuracy in this one case, at least, a pie chart is unequalled.

[Figure: pie]

Books Lynn recommended

Kirk, Andy.  Data visualization : a successful design process.  Birmingham, UK : Packt Pub., 2012. http://hollis.harvard.edu/?itemid=|library/m/aleph|013621977

Cairo, Alberto.  The functional art : an introduction to information graphics and visualization.  Berkeley, CA : New Riders, c2013.   http://hollis.harvard.edu/?itemid=|library/m/aleph|013621100

Iliinsky, Noah and Julie Steele.  Designing data visualizations.  Sebastopol, Calif. : O’Reilly, c2011.   http://hollis.harvard.edu/?itemid=|library/m/aleph|013081377

Yau, Nathan.  Data points : visualization that means something.  Indianapolis, IN : John Wiley & Sons, Inc., [2013]    http://hollis.harvard.edu/?itemid=|library/m/aleph|013804115

Robbins, Naomi.  Creating more effective graphs.  Wayne, N.J. : Chart House, c2013.

Class notes in PDF
