Guest post from George Makovic, Senior Development Manager, IBM Business Analytics
If you’re like most people who’ve taken a college-level course in statistics, you viewed it as something to be endured rather than enjoyed. There’s no question that statistical analysis and data mining (which I’ll refer to henceforth as predictive analytics) are daunting disciplines, and that’s the primary reason the demand for talented practitioners outstrips supply in today’s marketplace.
Make no mistake: businesses acknowledge the value that analytics can bring to their organizations. A recent study by McAfee and Brynjolfsson of MIT revealed that the productivity and profitability of companies that have adopted predictive analytics are 5 to 6 percent higher than those of their peers and competitors.
Unfortunately, the number of people in a given organization who are capable of doing these complex analyses is usually small, which creates bottlenecks and limits progress.
How can business users – talented and confident in their respective domains, but lacking the requisite knowledge and skills – join this “analytic discussion” with colleagues who are better versed in these techniques?
Where to start?
Sometimes just getting started is half the battle. “Analyst’s block” is every bit as real as “writer’s block.” There are recognized standards that can provide some guidance, such as the Cross-Industry Standard Process for Data Mining (CRISP-DM). However, for the novice, even those proven steps can lead to frustration.
For example, gaining a basic understanding of your data’s quality is a good place to start. Are there outliers in the data and, if so, how should they be dealt with? Which combinations of variables have strong relationships? Which variables are “model worthy,” and which method is appropriate for them? Is a decision tree more appropriate than a linear regression for a given set of variables?
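As a rough illustration of those first checks, the sketch below flags outliers and measures how strongly two variables move together. It is a minimal example under stated assumptions, not a description of how any IBM product works: the data is invented, the function names `iqr_outliers` and `pearson_r` are hypothetical, and the 1.5 x IQR fence is just one common convention for spotting outliers.

```python
from math import sqrt
from statistics import mean, quantiles


def iqr_outliers(values, k=1.5):
    """Return the values lying outside the fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)  # quartiles of the sample
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented example data: the last sales figure looks suspicious.
ad_spend = [10, 12, 14, 16, 18, 20, 22, 24]
sales = [25, 29, 33, 37, 41, 45, 49, 200]

suspects = iqr_outliers(sales)               # flags the value 200
r_all = pearson_r(ad_spend, sales)           # relationship with the outlier included
r_clean = pearson_r(ad_spend[:-1], sales[:-1])  # relationship without it

print("Possible outliers:", suspects)
print("Correlation with outlier:", round(r_all, 3))
print("Correlation without outlier:", round(r_clean, 3))
```

Comparing the two correlations shows why the outlier question comes first: a single unusual record can noticeably change how strong a relationship appears. Whether to drop, correct, or keep such a record remains a business judgment that the code cannot make.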
Expert practitioners are well-served by products such as IBM SPSS Statistics and IBM SPSS Modeler, but the uninitiated can get into trouble very quickly if they don’t know what to do next. Adding big data to the equation only exacerbates this problem.
Open to interpretation
Suppose you were able to produce a predictive model, or perhaps one was produced for you by an expert. Would you know how to interpret the results? Which is more important: significance or effect strength? R² or r?
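To see why significance and effect strength are different questions, consider the small sketch below. It is an illustration only: the two scenarios are invented, the function name `approx_p_value` is hypothetical, and the p-value uses Fisher's z-transformation, which is an approximation rather than the exact test a statistics package would typically report.

```python
from math import atanh, sqrt
from statistics import NormalDist


def approx_p_value(r, n):
    """Approximate two-sided p-value for 'no correlation', given r and sample size n.

    Uses Fisher's z-transformation: atanh(r) * sqrt(n - 3) is treated as
    approximately standard normal when the true correlation is zero.
    """
    z = atanh(r) * sqrt(n - 3)
    return 2 * (1 - NormalDist().cdf(abs(z)))


# Invented scenarios: (description, correlation r, sample size n)
scenarios = [
    ("weak effect, large sample", 0.05, 10_000),
    ("strong effect, small sample", 0.60, 10),
]

for label, r, n in scenarios:
    p = approx_p_value(r, n)
    r_squared = r ** 2  # with a single predictor, R-squared equals r squared
    print(f"{label}: r={r}, R^2={r_squared:.4f}, p={p:.2g}")
```

Under these assumptions, the weak relationship comes out as highly significant simply because the sample is large, even though it explains well under one percent of the variation, while the strong relationship in the small sample does not reach the conventional 0.05 threshold. Significance speaks to how confident you can be that an effect exists; r and R² speak to how large it is.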
Knowing what to look for in a traditional statistical output table is a real skill that requires training. Left to their own devices, business users may miss a key finding or focus on the wrong result. Many predictive analytics users will tell you that one of the key challenges in their jobs is making statistical results understandable to business users.
Making things consumable
Clearly, what is needed is some assistance to bridge the gap between business knowledge and deep statistical analysis. In short, business users need an advocate.
Imagine if you had your own statistical Sherpa to help you navigate the mountains of big data, or maybe just the phone number of the teaching assistant in that Statistics 101 class you took ten years ago?
IBM is exploring ways to intelligently automate not just the generation of advanced analytics results, but also their interpretation. Further, using state-of-the-art visualization technology makes reviewing these results a much more interactive experience.
Business users have intimate knowledge of the problems their organizations face: they just need some assistance viewing the problem quantitatively.
Therefore, tune in to our upcoming IBM Business Analytics Virtual Launch (Tuesday, June 11) and see how IBM’s new analytics solution will make big data more accessible and serve as a catalyst to propagate the use of advanced analytics throughout the enterprise.