Why Big Data can lead to Big Mistakes – and how to avoid them
Martin Keen (MKEEN@US.IBM.COM) | Tags: martin_keen data ibmredbooks nate silver analytics redbooks
The more information you have available to you, the better you'll be able to make informed decisions, spot trends, and make predictions. Right? Or does adding more information just mean more complexity – false positives and illusions?
What's certain is that there has never been more data available to us. 90% of the data in the world today, accumulated over all of human existence, was created in the last two years. And it's growing. Today alone, 2.5 quintillion bytes (2.5 billion gigabytes) of data will be generated. Followed by another 2.5 quintillion bytes tomorrow. And the next day.
This is great news if more information means better informed decisions – terrifying news if it means the opposite.
Last November at IBM's Big Data conference Information On Demand, an unassuming statistician took the stage. His name was Nate Silver, and he was beginning to make a name for himself as an author. Later that month, he was making headlines the world over, having correctly called the outcome of the US presidential election in all 50 states.
Nate Silver's book The Signal and the Noise: Why So Many Predictions Fail – But Some Don't is a book about Big Data. It's filled with examples of people analyzing data and mistaking noise (random fluctuations) for signal (a meaningful pattern in data). The examples include the ridiculous (the stock market gains an average of 14% when an NFC team wins the Super Bowl, and drops an average of 10% when an AFC team wins) and the worrisome (economists correctly predicted only 2 of the 60 recessions around the world in the 1990s).
Nate Silver makes the point that having more information does nothing, in itself, to improve our understanding of it. In fact, it can make matters worse. As the amount of available information increases, so does the number of hypotheses that could explain it, and with more hypotheses to test, more of them will appear to fit the data purely by chance.
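To make that concrete, here is a minimal Python sketch of the effect. It is not taken from the book: the function name `simulate_spurious_hits`, the 30-year window, the 8% average return with a 15% spread, and the 10-point gap that counts as a "pattern" are all assumptions chosen for illustration. The sketch invents one series of purely random yearly "market returns", then checks a growing pile of equally random yes/no indicators (think "did an NFC team win?") to see how many appear to split the good years from the bad.

```python
import random
import statistics


def simulate_spurious_hits(n_candidates, n_years=30, threshold=0.10, seed=0):
    """Count random yes/no indicators that seem to 'explain' random returns.

    Everything here is noise by construction, so every hit is a false signal.
    All parameter values are illustrative assumptions.
    """
    rng = random.Random(seed)
    # Pure-noise yearly returns: mean 8%, standard deviation 15%.
    returns = [rng.gauss(0.08, 0.15) for _ in range(n_years)]

    hits = 0
    for _ in range(n_candidates):
        # A coin-flip indicator with no connection to the returns.
        flags = [rng.random() < 0.5 for _ in range(n_years)]
        yes_years = [r for r, f in zip(returns, flags) if f]
        no_years = [r for r, f in zip(returns, flags) if not f]
        if not yes_years or not no_years:
            continue  # cannot compare groups if one of them is empty
        gap = abs(statistics.fmean(yes_years) - statistics.fmean(no_years))
        if gap >= threshold:
            hits += 1  # looks like a pattern, but is only chance
    return hits


if __name__ == "__main__":
    for n in (10, 100, 1000, 10000):
        print(n, "indicators tested ->", simulate_spurious_hits(n), "false signals")
```

Because each run with the same seed reuses the same returns and the same sequence of indicators, the count of false signals can only stay level or rise as more indicators are tested. None of those indicators carries any real information, which is the trap the Super Bowl example illustrates.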
Thankfully, help is at hand – analytics.
In the IBM Redbooks Point-of-View publication Exploring the Potential of IBM Smarter Analytics Solutions, Jean Francois Puget and Baruch Schieber define analytics as the process of using data to derive insights that lead to better decisions. Analytics is behind Nate Silver's impressive presidential election predictions, and behind the computer models that alerted meteorologists to the seriousness of Superstorm Sandy.
By applying analytics, you can assimilate, digest, and act on data. The Point-of-View publication cites plenty of examples of analytics at work – from helping Best Buy improve its advertising effectiveness to helping Netherlands Railways optimize its train schedule. In its simplest form, analytics presents data in concise reports that aid decision making; in its most complex form, it collects, reports, and ingests data to predict future trends and events.
Analytics is an area IBM has invested in for years, with over 10,000 technical analytics professionals working for IBM and over 9,000 consultants delivering IBM analytics solutions. IBM Smarter Analytics solutions connect people with trusted information so they can make real-time decisions and act with confidence in delivering better business outcomes. But Jean Francois Puget and Baruch Schieber explain it so much better: Exploring the Potential of IBM Smarter Analytics Solutions.
Martin Keen is an IBM Redbooks Project Leader. He works with technical experts to create books, guides, blogs, and videos. Follow Martin on Twitter at @MartinRTP.