Guest post from Eser Kandogan, IBM Research
Big Data is getting much of the attention these days. However, I think it is "Big Questions" that should get all that much-deserved attention.
We should stop being data-centric and start becoming question-centric. Questions are, after all, what people are trying to answer with data.
The question is: Do current data analysis and visualization tools support this?
I argue that they don't. What current tools support, for the most part, is querying, aggregation, and visualization. The starting point is an existing dataset, and whatever answers it might provide.
And it could very well be that the answer it provides doesn't really match the question you need answered. It might be that you need additional data. Users need support in finding the right data, massaging that data into the right form, integrating it with other data, finding the people who know about the data and its underlying assumptions, and navigating several datasets to answer their questions.
Current business intelligence approaches worked well for common questions that were repeatedly asked on a regular basis, such as monitoring sales by geography by product line by year (yes, cubes!).
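To make the cube idea concrete, here is a minimal sketch of that kind of roll-up in Python. The sales records, dimension names, and `rollup` helper are all hypothetical illustrations, not any particular BI product's API:

```python
from collections import defaultdict

# Hypothetical sales records: (geography, product_line, year, amount).
sales = [
    ("US", "Widgets", 2013, 120.0),
    ("US", "Widgets", 2014, 150.0),
    ("EU", "Gadgets", 2013, 90.0),
    ("US", "Gadgets", 2014, 60.0),
]

def rollup(records, *dims):
    """Sum amounts grouped by the named dimensions (geo, product, year)."""
    index = {"geo": 0, "product": 1, "year": 2}
    totals = defaultdict(float)
    for rec in records:
        key = tuple(rec[index[d]] for d in dims)
        totals[key] += rec[3]
    return dict(totals)

# Slice the same data along different dimensions, cube-style.
by_geo_year = rollup(sales, "geo", "year")
by_geo = rollup(sales, "geo")
```

The point of the cube is exactly this reusability: the same pre-defined aggregation answers the same recurring question for any combination of its fixed dimensions, but nothing outside those dimensions.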
But now, with so much data on so many different aspects of the business, it is simply not possible to have analysis readily available for every possible business question. For example, how is a new marketing campaign going to affect brand sentiment? These types of questions are fairly open-ended data explorations.
As such, there is a big gap from a "question" to an "answer."
Between a question and an answer, there is a stack of technologies and people who know how to use each piece of technology and can translate data and questions from one domain or technology to another.
The questions of a business executive, a marketing manager, or a salesperson are far removed from the data. We need to enable anyone in the corporation to do their own analysis. In other words, we need "self-service data intelligence."
They need to be in the driver’s seat and drive the conversation with the data. You heard me. I think we should be able to have a conversation with data.
We really need to start thinking about Big Question architectures that support such conversations with data, where the user begins with a question.
Over the course of the conversation the user might pose more questions, smaller questions, alternative questions, and so on. They might want to combine answers from all of these questions to get to some sort of a bigger understanding.
In the course of that data conversation, the user might discover new data that can be integrated seamlessly into the system, along with all the semantics needed to drive the conversation. That conversation has to be collaborative so that others might see all the questions, the thought process behind the questions, and all of the steps it took to get the final answer. Others might also join in the conversation, offer help and suggestions.
The system should support the user by capturing and externalizing their thought processes as they converse with the data. As a by-product of these conversations among people and datasets, tremendous meta-data is collected that the system can put to use.
With that additional meta-data, the system can interpret users' questions better within the right context, recommend further datasets, and suggest collaboration opportunities by finding people who asked similar questions or looked at similar datasets in the past. The opportunities for leveraging that meta-data are endless.
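As one illustration of how that meta-data might be leveraged, here is a minimal sketch, not any actual system, that suggests collaborators by the overlap of users' dataset histories. The users, dataset names, similarity measure (Jaccard), and threshold are all assumptions made for the example:

```python
# Hypothetical usage meta-data: user -> set of datasets they have explored.
history = {
    "alice": {"sales_2013", "brand_sentiment", "campaigns"},
    "bob":   {"sales_2013", "campaigns", "inventory"},
    "carol": {"hr_surveys"},
}

def jaccard(a, b):
    """Overlap between two sets, from 0.0 (disjoint) to 1.0 (identical)."""
    return len(a & b) / len(a | b) if (a | b) else 0.0

def suggest_collaborators(user, history, threshold=0.3):
    """Rank other users whose exploration history overlaps this user's."""
    mine = history[user]
    scored = [
        (other, jaccard(mine, theirs))
        for other, theirs in history.items()
        if other != user
    ]
    scored.sort(key=lambda pair: -pair[1])
    return [other for other, score in scored if score >= threshold]
```

Any real "Big Question" system would of course use far richer signals (the questions themselves, the thought process behind them), but even this crude overlap measure shows how conversation by-products can drive recommendations.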
How should such a Big Question architecture, and the conversational user experience on top of it, work? What do we need to realize such a system? What should the user experience be like? Can we design a system that motivates users to ask more questions? Can we design systems that guide the conversation with data? Can we at least capture the thought process and socialize it?
I think these are the issues we need to begin researching. What do you think?
Continue exploring advanced visualization on IBM Many Eyes
Why stop the insight with this article? Visit IBM's visualization hub, IBM Many Eyes, and join more than 100,000 like-minded visualization enthusiasts, academics, and professionals. The next version of Many Eyes will launch at the end of March with several new enhancements that continue to deliver on the site's heritage of advancing visualization, including:
· Comprehensive site redesign that includes an updated site layout and presentation. Plus, new affinity areas to find and navigate visualizations by industry or topic, such as finance, healthcare and risk.
· Addition of the Expert Eyes blog dedicated to helping you learn how to create effective and engaging visualizations that provide maximum insight and tell a story. IBM visualization luminaries and IBM Researchers from the Center for Advanced Visualization will contribute regular thought leadership and perspectives.
· New visualization options, including a heatmap and a view-in-context visualization built on IBM's Rapidly Adaptive Visualization Engine (RAVE). RAVE, a declarative language based on IBM's patented 'Grammar-of-Graphics' approach, provides an intuitive way to create a visualization by describing what it should look like, not how to draw it. With Many Eyes, RAVE does the work behind the scenes, and you create your visualization in three easy steps.
Discover the newest version of Many Eyes beginning March 25 by visiting ibm.com/manyeyes.
Read a retrospective of how Many Eyes was developed and where it’s going.
Eser Kandogan is a research staff member at IBM Research – Almaden. His interests are intelligent information interaction and data visualization. He holds a Ph.D. in Computer Science from the University of Maryland, College Park. He is the author of Taming Information Technology, published by Oxford University Press.