Guest post from Jason Tavoularis, Product Manager, IBM Business Analytics
With so many vendors promoting their derivatives of Apache Hadoop open-source software, you've probably heard the word "Hadoop" before. Essentially, Hadoop enables the distributed processing of large data sets across clusters of servers.
It also has great applicability to business analytics. While not every business problem requires a distributed environment like Hadoop, it is driving some intriguing innovation and discussion. Here are a few reasons why I'm excited:
It scales linearly. In the ideal case, doubling the servers in the cluster roughly halves data processing time. The same holds for storage capacity: the highly fault-tolerant Hadoop Distributed File System (HDFS) lets you spread the source data across as many servers as you need.
This gives organizations the opportunity to make informed decisions on matters where the associated data was traditionally unusable due to its volume. As your data volumes grow, simply add more servers.
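To make the scaling arithmetic concrete, here is a minimal back-of-the-envelope sketch in Python. It assumes perfectly linear scaling, which real clusters only approximate, and the function names, the sample numbers, and the replication factor of 3 are illustrative assumptions rather than anything specific to a particular Hadoop distribution.

```python
import math


def ideal_processing_hours(baseline_hours: float,
                           baseline_servers: int,
                           servers: int) -> float:
    """Estimate run time under idealized linear scaling.

    Assumption: time is inversely proportional to server count.
    Real jobs carry coordination overhead, so treat this as a best case.
    """
    if baseline_servers <= 0 or servers <= 0:
        raise ValueError("server counts must be positive")
    return baseline_hours * baseline_servers / servers


def servers_needed(data_tb: float,
                   usable_tb_per_server: float,
                   replication: int = 3) -> int:
    """Estimate how many servers are needed to hold the source data.

    Assumption: every block is stored `replication` times (3 here,
    purely as an illustrative setting) and each server contributes
    `usable_tb_per_server` of capacity.
    """
    if data_tb < 0 or usable_tb_per_server <= 0 or replication < 1:
        raise ValueError("invalid sizing inputs")
    return math.ceil(data_tb * replication / usable_tb_per_server)


if __name__ == "__main__":
    # A job that takes 8 hours on 10 servers, rerun on 20 servers.
    print(ideal_processing_hours(8.0, 10, 20))
    # 100 TB of source data on servers with 10 TB usable each.
    print(servers_needed(100, 10))
```

Under these assumptions, doubling the cluster from 10 to 20 servers cuts the estimate from 8 hours to 4, and growing data volumes translate directly into a larger server count.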
Cloud deployments make predicting how many servers you will need for a new project less stressful. No need for the long-term commitment associated with capital expenditures. You can host your own servers once you're realizing the value in a production environment.
Unstructured data. Don't get me wrong, order and organization are good things. So is structure. But it's nice that Hadoop source data doesn't need to be constrained to columns and rows. Social media (including this blog) is fair game.
At Information On Demand last October, our partners at the CSC booth had a really cool demo. You sent them an email with a particular frequent flyer number in the subject line, and they would refresh an IBM Cognos business intelligence report with the purchasing history for that frequent flyer number from an IBM Netezza database. The refreshed report would also display the contents of the email you had just sent! The email was being parsed by IBM InfoSphere BigInsights (an enterprise Hadoop system) and then passed on to IBM Cognos.
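The parsing step in that demo can be pictured with a small sketch. This is not how BigInsights does it; it is a plain Python stand-in using the standard library, and the frequent flyer number format (6 to 12 digits), the function name, and the sample message are all hypothetical.

```python
import re
from email import message_from_string

# Hypothetical format: a run of 6 to 12 digits somewhere in the subject.
FF_PATTERN = re.compile(r"\b(\d{6,12})\b")


def extract_record(raw_email: str) -> dict:
    """Turn a raw email into a row-like record a report could consume.

    The frequent flyer number comes from the subject line; the free-text
    body is carried along unchanged so it can be shown next to the
    structured purchasing history.
    """
    msg = message_from_string(raw_email)
    subject = msg.get("Subject", "")
    match = FF_PATTERN.search(subject)
    # Only handle simple single-part messages in this sketch.
    body = "" if msg.is_multipart() else msg.get_payload()
    return {
        "frequent_flyer_number": match.group(1) if match else None,
        "subject": subject,
        "body": body.strip(),
    }


if __name__ == "__main__":
    sample = (
        "From: traveler@example.com\n"
        "Subject: FF 123456789\n"
        "\n"
        "The upgrade on my last flight was great.\n"
    )
    print(extract_record(sample))
```

The point of the sketch is the shape of the result: unstructured text goes in, and a record keyed by frequent flyer number comes out, ready to be joined with data from a relational source.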
New collaborations. It pleases me to see extremely smart people from all around the world working together to solve problems. This past year I've witnessed amazing teamwork between some of the original pioneers of business intelligence software in the IBM Lab in Ottawa, Canada, and Hadoop experts amongst our BigInsights colleagues in Silicon Valley.
I look forward to sharing these innovations with you on the upcoming IBM TechTalk (Tuesday, Jan. 22, from 11:00 a.m. to noon ET). We will be discussing how to combine the big data processing capabilities of IBM InfoSphere BigInsights with the self-service business intelligence reporting of IBM Cognos. I hope you can join us.
For more information:
· Read more about the IBM Cognos family of business intelligence software
· Learn how to consume IBM InfoSphere BigInsights data through IBM Cognos v10.2