Data Visualization

The Grand Rapids Day of .NET on Saturday, October 23rd drew a sold-out crowd of seasoned developers with a healthy mix of college students. I was pleased to accept the opportunity to share a few thoughts on data visualization and to learn from others’ experiences in this technical specialty. This is a review of the session.

Agenda for Data Visualization discussion
1. Range of data sources appropriate for a given analysis
2. Statistical graphics and cartographies are well-known visualizations, but not the only avenues
3. Scorecards provide high-level indications
4. Dashboards are useful for displaying status
5. Applications are providing features previously only available through programming
6. Ideas and references for further exploration
7. Distribution methods are many

The periodic table of visualization methods is a wonderful work from Ralph Lengler and Martin J. Eppler. They note dozens of representations in six categories of visualization: Data, Information, Concept, Strategy, Metaphor, and Compound. Understanding their approach is valuable in crafting useful ways to understand complex interactions within a pool of knowledge.

Focusing solely on the Data Visualization segment, the range of data sources is growing at rates that have never been seen. Phenomenal growth in data collection comes from debit and credit card transactions, phone activity, GPS use, marketing campaigns, medical procedures, and just-in-time processes for manufacturing, distribution, and retail sectors.

I used the five-hour drive from my home to Grand Rapids as an example of data sources and the interest held by many organizations for that data.

Stopping at a Speedway gas station and convenience store, I purchased gasoline at the pump using a debit card provided by JPM Chase, my banker. A validation step at the pump requested the zip code associated with the card statement’s mailing address. This data passed a Visa Debit Processing Services test. So, already, four companies are storing data:
1. Visa, to note that they processed my specific card for a gasoline purchase at a specific pump at this Speedway station
2. JPM Chase, in verifying the funds availability to at least $50
3. Speedway, in noting the zip code associated with the card and the volume of fuel purchased from a specific pump
4. Marathon Oil Company, Speedway’s parent company, for assessing fuel delivery and refinery schedules
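
As a rough illustration of how one purchase fans out into four separate data stores, here is a minimal Python sketch. The record layout, the field names, the sample values, and the choice of which fields each company keeps are all assumptions made for illustration, loosely following the list above; they are not a description of any company's actual systems.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuelPurchase:
    """One pay-at-the-pump transaction (hypothetical record layout)."""
    card_id: str        # hypothetical masked card identifier
    billing_zip: str    # zip code entered at the pump
    station_id: str     # hypothetical station identifier
    pump_number: int
    gallons: float
    hold_amount: float  # funds-availability check, e.g. $50


def views_by_party(p: FuelPurchase) -> dict[str, dict]:
    """Project the single transaction into the slice each party might keep.

    The split of fields per party is an assumption for illustration.
    """
    return {
        "Visa": {"card_id": p.card_id, "station_id": p.station_id,
                 "pump_number": p.pump_number},
        "JPM Chase": {"card_id": p.card_id, "hold_amount": p.hold_amount},
        "Speedway": {"billing_zip": p.billing_zip,
                     "pump_number": p.pump_number, "gallons": p.gallons},
        "Marathon Oil": {"station_id": p.station_id, "gallons": p.gallons},
    }


# Placeholder values only; none of these come from a real transaction.
purchase = FuelPurchase("card-0001", "00000", "station-42", 3, 12.5, 50.0)
views = views_by_party(purchase)

for party, record in views.items():
    print(party, record)
```

The point of the sketch is only that a single event yields several partial, overlapping records, each shaped by what its owner cares about.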

After fueling, I entered the store and purchased a vitaminwater Zero drink and a pack of gum. Again, the debit process with Visa and Chase produced data for them, and sales transactions to Speedway and Glaceau. If jobbers or distributors were involved with the gum, they will have data provided from the transaction.

Driving along, I had my Verizon Wireless Droid X in a cradle on the dash, and listened to several stations on TuneIn, an aggregator of streaming radio with a fine API for developers. I have a free account which tracks my preset channels. There were ads displayed as music played. I assume a content provider is given data on the start time and end time, plus GPS data, on their feed. Advertisers are at least given data on the number of times their ad is seen. I also placed a few calls and answered a couple of them. Verizon stored the location, DNIS and ANI (called number, from number), and start/end times for each call. Lots of data. More data flowed to TED.com and Akamai and Facebook and LinkedIn and Google and Live and others as I reviewed sites along the way. Safety first, of course, so the idea that the car was moving at the time is only verifiable by Verizon data services analysts who would match transmitted bits to a succession of towers, eh?

Finally, I used a Discover card as I checked in to a Holiday Inn Express to claim a reservation placed through Priceline. Readers of the article to this point, and attendees at the session, can imagine the range of data provided to another handful of companies. Note that the data is not provided in real-time to most of the parties in this example, but most will want the data within hours, and certainly no more than 24 hours after the transaction.

So, what happens with the data? A new generation of analysts is providing visual insight, rather than studying ‘spreadmarts’ … or printing reports with dozens of pages of rows and columns … or providing simple statistical graphics like bar or line charts.

In discussion, the question of which key performance indicators are to be measured was raised. Only in meaningful discussion with domain experts can those be discovered. KPI Library was offered as a deep source of contributed KPIs from dozens of industries. ChartPorn.org was noted as a resource to inspire fresh statistical graphic designs.

Scorecards and dashboards are supported in SQL Server 2008 R2 Reporting Services. Rather than walk through demos, I suggested that interested parties do so at their leisure by downloading the Express version of the database with the advanced services. Excellent walkthroughs are on Microsoft’s Channel 9.

Exploring other products can be time well spent. Reference products that use SQL Server as a data source include Tableau and SAP Dashboard Designer, formerly Xcelsius.

The distribution of visualizations was questioned. Today, many internally-developed applications include embedded visualization. SharePoint 2010 opens the door to providing reports, scorecards, dashboards, and collaboration tools in one environment.

Finally, the need to interact with visualizations on mobile devices is nascent and surely open to amazing growth and adoption in the next few years.

Technology discussions with professional developers are an excellent way to learn and share. Several dozen of us did just that in a quick hour at Grand Rapids Day of .NET.
