Pictures Make Sense of Big Data
Most people have trouble recalling strings of numbers that are longer than their phone numbers. So how do we begin to comprehend a hundred rows of data, let alone a thousand or a million or a billion rows?
That’s the dilemma so many companies face, thanks to technology advances that make it easier to routinely collect enormous amounts of data.
The answer is pictures.
Humans are fundamentally different from computers—we’re wired to comprehend shapes, patterns and colors. So technology companies are using data visualization to help companies turn large sets of data into pictures that lead people intuitively to the information that is most important to them.
That can mean something as familiar as a color-coded map, only with lots of interactive features. Or it can mean something unfamiliar to most people, like seemingly amorphous shapes that on closer inspection quickly yield insights into the data they portray.
One picture that’s still considered effective was published in 1869 by the Frenchman Charles Minard (http://en.wikipedia.org/wiki/File:Minard.png). It shows the casualties suffered by the French army during Napoleon’s disastrous invasion of Russia in 1812 and 1813.
By using two colored lines—one for the army’s advance and one for its retreat—and varying their thickness as the army’s position and its numbers changed, Mr. Minard showed how many men advanced into Russia and how many returned at various locations. A third graph shows the freezing winter temperatures the men encountered as they retreated to France.
Today, Mr. Minard’s picture wouldn’t be static. We would be able to dig into it, clicking to find out how many men died on a given day and perhaps correlate those numbers with other data, such as the amount of food available or the types of weapons used. We would have more ways to visualize the data—we could lay it against a map—and we could monitor the reaction to it on Twitter.
Below are some examples of what technology companies are doing today to show us big data in pictures.
This picture combines social-media data with information from a retailer’s billing systems to evaluate a marketing campaign. It compares public response to the campaign with the revenue the campaign generated. Horizontal bar charts are an easy way to compare two metrics, such as positive and negative sentiment, says GoodData Corp. Vice President Hubert Palan. (Since this is a U.S. retailer, negative sentiment is in red. If it were an Asian retailer, red would be considered positive.) The bubble chart combines three metrics—reach, engagement and return on investment. The verdict: Social-media reaction was good, but return on investment wasn’t, so this campaign didn’t perform as well as others.
The shapes in Ayasdi Inc.’s data displays are generated automatically by the company’s software, which relies on a branch of mathematics called topology. This picture shows clusters of credit-card transactions; the red dots indicate fraud. By clicking on the red areas, users can get more information on how those frauds were perpetrated. That can help them develop ways to prevent further incidents, say by adding new rules to their transaction systems. Users can choose the colors, but the default color scale runs from blue to red, or cool to hot, says Ayasdi Vice President Jeff Yoshimura.
This series of pictures tells the story of a company’s profitability through commonly understood shapes and graphs. Numbers about employees, for instance, are represented by human figures. Numbers about products break down into shopping bags: the bigger the bag, the more profitable the product. Tidemark runs on iPads, and because its data is in the cloud, the story can be shared and continually refreshed, “much like Facebook and LinkedIn on the consumer side,” says Tidemark Systems Inc. founder and Chief Executive Christian Gheorghe.
By DEBORAH GAGE