Guest post from Marcus Hearne, Business Unit Executive, IBM Business Analytics
Follow Marcus on Twitter @marcushearne
As a person with an analytical bent, I sometimes find myself frustrated with the ongoing industry debate on the definition and worth of big data.
Whether you think it’s simply repackaging or excessive hype, it’s a name for something quite real that’s full of enormous potential, and to ignore it means you will both miss out on that potential and fail to understand its implications for you personally. Call it what you will, but big data is as personal as it gets when coupled with analytics.
Right now, approximately a billion tweets are created every few days. When I send a tweet, I have this sense of emptying a glass of water into Lake Michigan. Maybe one of my pithy tweets is like a glass of dyed water, remaining distinct for a while.
My thoughts are quickly dissolved into the greater mass, only to be unearthed when my daughters become teens and are looking for ammunition to move curfew past 10 p.m.
But these cast-off quips are just the tip of the big data iceberg. Much to my mild disapproval, a lot of my demographic information is already in the public domain. It comes from any number of benign sources, like the digital white pages or my mother posting and tagging pictures of me on her Facebook page, and from some not so benign, such as a background check available for a nominal fee.
Even that’s small scale compared to the data I constantly create simply moving through my day. Don’t be fooled – you do it too as surely as you create carbon dioxide with every breath you take.
For example, when I wake up I check my calendar on my phone for any changes. This activity is immediately captured and recorded as the time my phone was awakened, the fact I opened the calendar, any particular events I then opened, and even my location.
More data spills out when I swipe my card at the train station, stream music and browse web pages on my phone, buy coffee with an app, and swipe my ID to get into the building. Just getting to my desk creates enough data to build a bland, yet informative, narrative of what I’ve done and where I’ve been.
The result is a never-ending stream of data coming at organizations faster and faster, and in an ever-expanding variety of forms. Not many organizations can connect the sentiment of my tweet about the efficient barista to an understanding of my coffee purchasing patterns when they’re struggling simply to keep up with the rewards program and offers.
So what’s a 20th century organization to do in a 21st century big data world? Analytics.
Big data without analytics makes no sense. Simply capturing more and more data about me and trying to make sense of it, without exploring new ways to analyze it, is merely an added cost without benefit.
Sure, you can get a more detailed picture of my activity, and you can align me to some complex and mildly interesting market segment based on an aggregation of additional factors, but you’re never going to achieve real value that way. It’s not who I am, it’s what I’m going to do and why. That’s the diamond within this pile of digital carbon.
Predictive analytics boils down to identifying an outcome or behavior that’s important, and then mathematically determining all of the variables that can be an influence on that outcome or behavior, and putting these together in the form of a predictive model.
From there, as new values are created for those variables and entered into the model, a prediction can be made for the targeted outcome. A very simple example is a model that uses one variable, the time I wake up, to predict whether I will buy a large coffee. The predictive model would say, “Marcus is 95 percent likely to buy a large size if he’s awake at or before 5:30 a.m.; 70 percent likely between 5:30 a.m. and 6:30 a.m.; and 12 percent likely if after 6:30 a.m.” In other words, acquire the time I wake up and you can predict my coffee buying behavior.
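For the curious, here is a minimal sketch of that one-variable coffee model in Python. The three probabilities are the ones quoted above; the function name is made up for illustration, and treating exactly 6:30 a.m. as part of the middle band is my assumption. A real model would learn these cut-offs from data rather than hard-code them.

```python
from datetime import time


def large_coffee_probability(wake_time: time) -> float:
    """Return the chance (0 to 1) of a large coffee, given the wake-up time.

    Hypothetical helper: the bands and percentages come from the example
    in the post; placing exactly 6:30 a.m. in the middle band is an assumption.
    """
    if wake_time <= time(5, 30):
        return 0.95  # awake at or before 5:30 a.m.
    if wake_time <= time(6, 30):
        return 0.70  # between 5:30 a.m. and 6:30 a.m.
    return 0.12      # after 6:30 a.m.


# Feed in a few wake-up times and read off the prediction for each.
for wake in (time(5, 15), time(6, 0), time(7, 0)):
    chance = large_coffee_probability(wake)
    print(f"Awake at {wake.strftime('%H:%M')}: {chance:.0%} likely to buy a large")
```

The whole "model" here is a lookup on a single variable, which is exactly why it is easy to explain and easy to outgrow.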
Now nothing’s quite that simple.
There are probably a bunch of other variables that can add or subtract a few points from the probability: the weather, how long I wait for a train, the news I choose to read, whether I’ll be going to the airport, a tweet stating just how tired I am even though I slept until 7 a.m., and so on.
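As a rough sketch of what "adding or subtracting a few points" could look like, the Python below nudges a base probability up or down for each extra signal that shows up. Every signal name and every adjustment value is a made-up placeholder, not a measured figure, and a real predictive model would typically estimate such weights from historical data instead of adding fixed points.

```python
from typing import Iterable, Mapping

# Hypothetical adjustments, in probability points (0.10 = 10 points).
# These values are placeholders for illustration only.
ADJUSTMENTS = {
    "rainy_weather": 0.03,
    "long_train_wait": 0.04,
    "airport_trip": 0.05,
    "tweeted_about_being_tired": 0.10,
}


def adjusted_probability(
    base: float,
    observed: Iterable[str],
    adjustments: Mapping[str, float] = ADJUSTMENTS,
) -> float:
    """Shift a base probability by the adjustment for each observed signal.

    Unknown signals are ignored, and the result is clamped to the 0-1 range.
    """
    total = base + sum(adjustments.get(signal, 0.0) for signal in observed)
    return min(1.0, max(0.0, total))


# Base of 0.12 is the post's figure for waking after 6:30 a.m.
signals = ["tweeted_about_being_tired", "long_train_wait"]
chance = adjusted_probability(0.12, signals)
print(f"Adjusted chance of a large coffee: {chance:.0%}")
```

The point is only that each new variable moves the estimate a little; the more relevant variables you can capture, the sharper the prediction gets.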
The point is that big data represents an opportunity for predictions to become much more accurate. It’s not a stretch to say that a predictive model could determine when I was going to send a tweet and what the sentiment was going to be.
So, as I sit here at the center of my own ever-expanding plume of data and its unknown stewards, it occurs to me that I have to be judicious about the information that unwinds behind me as I move throughout the day. I’m not talking about moving to Montana and building a collection of stylish tinfoil hats; I mean simply being aware of it and its permanence.
I want a system intelligent enough to recommend the clothes I will like when I shop, or to fast track me through an insurance claim rather than asking every… single… question… twice.
But as a father of two daughters I’m keenly aware of a new lesson I have to bestow upon them: the awareness of the data they create. My daughters should be able to quantify and qualify everything from their opinions to their vacation pictures, and therein correctly judge how freely they should be shared (if at all).
Basically, I want them to grow up and demand to be recompensed appropriately for their data. After all, is the service Facebook provides worth so much intimate detail?
This is the natural evolution of an understanding of big data, and how analytics can swim through the storm of it and find you, dragging every single byte that you’ve left in your wake, and deliver incredible (and sometimes unwelcome) revelations.
There’s a huge upside to all of this regarding your health, happiness, and productivity, but we need to understand that data is an asset, and big data has made it a very valuable asset indeed, and both sides should benefit in kind.
For more information:
Register for IBM’s upcoming Business Analytics Virtual Launch (Tuesday, June 11) and see how our new solutions empower organizations in the era of Big Data.