Big Data: It’s Not How Big It Is, It’s How You Use It

If you haven’t heard about Big Data this year, please tell me your secret. Tell me how you managed to avoid hearing about it. I want to know. Really. There are days when I want to be in a place like that. Desperately.

For some time now we’ve been hearing about device proliferation. A classic 10x market. People tell me that mainframes are numbered in the tens of thousands, minicomputers in the hundreds of thousands, PCs in the millions, smart mobile phones in the billions. Smart devices in the tens of billions.

These tens of billions of thingummybobs are getting busier and busier as they sense and observe and record everything and everyone around them, saluting the things that move and painting the ones that don’t.

In a perfect world, this would mean that there’s a lot of good data being produced, data that should prove useful to improve our lives. So that means there’s a market for software that helps us crunch the data into something useful, find the data we need, see the data in ways that can help us extract meaning and act on the meaning.

Even though this world is far from perfect, I’m glad to see that all that is happening. VCs have been plunging into Big Data for a while now. Data visualisation tools continue to get better, better and better. And search is beginning to do something about its verb-based future.

And yet…

And yet I have this sense of unease.

You can have lots and lots and lots of data, Big Data. You can have wonderful Big Data Crunchers, tools to help you do something with the data. You can have the world’s best visualisation and search tools.

But they all mean nothing unless you can act on what you see.

Innovation takes place through adoption into practice, and not just through invention and disruption. Until something is being used, nothing actually changes.

Big Data becomes useful when it leads to action.

That action takes place across a wide spectrum. At one end the actors are all machines. And we run the risks that Kevin Slavin alluded to in his TED talk, How Algorithms Shape the World.

At the other end all the actors are human. Shouting from the rooftops about overload, seeking to convert firehoses into capillaries. With limited success.

As Clay Shirky so memorably pointed out, there is no such thing as information overload, it’s all about filter failure. We need better filters, something I’ve been writing about for a while now.

Between the two extremes there’s a universe of space where we have a lot to learn. The more I read about the Air France tragedy, the more I feel the need to look deeply into how human beings cope with decision-making amidst such complexity. Just reading the IEEE blog coverage and the Popular Mechanics coverage gives us pause for thought.

My concerns get heightened when I realise major segments of the industry I’m part of can’t stop rubbing their hands with glee as they intone “big data, big big data, big on-premise servers, big on-premise storage, big processor licences, big profits…. Christmas is every day”.

That causes new problems related to the quality and reliability of the big data, as multiple sources take snapshots at different times. When I worked in capital markets, working on out-of-date data could cost millions in seconds flat; it became very, very important to know how “live” the data was, and to keep checking that the data was real and up to the second (or even millisecond).

So we have the risk that there’s a lot of Big Wrong Data.
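For what it’s worth, the “how live is it” check can be sketched in a few lines of Python. This is only an illustration and not what any trading desk actually runs: the one-second threshold, the field names and the feed names are all made up for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical threshold: how old a reading may be before we refuse to act on it.
MAX_AGE = timedelta(seconds=1)


def is_live(observed_at, now, max_age=MAX_AGE):
    """Return True if the reading was taken within max_age of now."""
    age = now - observed_at
    # A negative age means the source's clock is ahead of ours; treat it as suspect.
    return timedelta(0) <= age <= max_age


def freshest(readings, now, max_age=MAX_AGE):
    """Return the most recent live reading across sources, or None if all are stale."""
    live = [r for r in readings if is_live(r["observed_at"], now, max_age)]
    return max(live, key=lambda r: r["observed_at"], default=None)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Two made-up sources that took their snapshots at different times.
    readings = [
        {"source": "feed_a", "price": 101.2, "observed_at": now - timedelta(milliseconds=300)},
        {"source": "feed_b", "price": 100.9, "observed_at": now - timedelta(seconds=5)},
    ]
    best = freshest(readings, now)
    print(best["source"] if best else "no live data, do not act")
```

The point of returning None rather than the least-stale reading is that a decision made on data nobody can vouch for is just Big Wrong Data with extra steps.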

But you know something? These things can be solved. We can learn how to process the data more effectively, present it for visualisation more elegantly, check for its validity more ruthlessly, use the best software and services to do all this, use machines to filter and humans to curate. We can do it all.

And find ourselves in a situation where all we do is “wait faster”.

Because there’s one more thing we have to change. And that is this: the way we make decisions. And then do something.

Some years ago, I remember a story that was used to illustrate the importance of a particular book. [Sadly, I can’t remember the book any more, other than that it may have had something to do with Price Waterhouse and may have had a yellow cover.] The story asked a simple question. Five frogs on a log. [Aha, that was the title. Five frogs on a log!] Where was I? Five frogs on a log. Four decide to jump off. How many are left?

And the answer was… five. Because, as the authors point out, deciding isn’t doing.

Given the way we’ve allowed email and calendars to take over our work lives, decision-making seems to be perennially poor in large, often still hierarchical, organisations. Email chains fragment and break and become divergent infinite loops. “Snooze, you lose” gets stated sometimes but rarely acted on. Getting the right people together has become harder and harder; process delays begin with scheduling time. Everything is at priority one in the scheduling process. If compromises are sought on the quorum front, they are temporary, evanescent. Infinite loop problems recur.

Big Data can have Big Effects.

That excites me.

Big Data can create Big Problems, sometimes with tragic consequences; we need to take extreme care.

Big Data can lead to Big Waste in on-premise activity, and we need to take care here as well.

Big Data can have a Big Positive Impact on industry, on education, on healthcare, on jobs (the subject of a separate post I will try and write this week) and even on government. People like Tim O’Reilly, Tim Berners-Lee, Nigel Shadbolt, Wendy Hall et al. have worked hard to get everyone to understand what is possible here, and I am a convert.

But the cultural changes that have to take place in institutions should not be underestimated. Institutional immune systems can render all the big data useless by rising up to slow the processes down, make decisions harder to take, make action harder to initiate, make outcomes harder to achieve.

Big Data means Big Changes.

It’s Not How Big It Is, It’s How You Use It.

20 thoughts on “Big Data: It’s Not How Big It Is, It’s How You Use It”

  1. I am digging into some of this on my Forbes blog — recently with Splunk
    http://www.forbes.com/sites/tomgroenfeldt/2011/12/14/security-data-is-big-data-and-a-business-advantage/ which was also written up in the NYT on Christmas Day, not exactly the best time to get noticed
    http://www.nytimes.com/2011/12/26/technology/for-start-ups-sorting-the-data-cloud-is-the-next-big-thing.html?_r=1&scp=1&sq=splunk&st=cse
    I will be coming back to the topic in days ahead, including some commentary from Bain about when Big Data is really just Large Data (ah, a new cliche) and how to deal with it.
    http://blogs.forbes.com/tomgroenfeldt — comments much appreciated.

  2. The biggest issue that industry faces is making sure it’s asking the right questions in order to take great decisions. That means a laser like focus on knowing what is truly important, and the confidence to put to one side the environmental background noise.

  3. As technology continues to pervade our daily activities the amount of data available to corporations for analysis is exploding (“Big Data”). Yet as recent events have shown quite clearly, many of us fear organizations that seem to be monitoring every aspect of our lives.

    All the benefits from “Big Data” will be a pipe dream unless the Healthcare sector does a better job of allaying public fears about misuse of Healthcare data. We will become more receptive as we see the benefits directly impacting us in terms of lower premiums or improved service. Lawmakers and policy makers need to stay ahead of the curve, ensuring that “Big Brother” stories about “Big Data” do not distract us from the benefits that can accrue if it is properly leveraged. The need for establishing a transparent system of checks and balances that protects individual rights while allowing the societal benefits of “Big Data” to accrue will be paramount.

    Read more on Healthcare and “Big Data”: http://deepaksethspeak.blogspot.com/2011/07/healthcare-and-big-data.html

  4. @deepak Big Data is about more than Healthcare. The checks and balances that are needed are as much to do with culture and ethos as they are to do with best practice or even legislation.

  5. Aren’t we “waiting faster” because we’re only looking for incremental improvements? 10 pct cost cuts, 20 pct y-o-y growth. When Big Data requires opposite thinking. 90 pct cost reductions, 200 pct y-o-y growth. Doing so successfully requires Mahatma’s tenets of affordability and sustainability: Gandhian Innovation emerges from Big Data?

  6. Hi JP,
    I am aware Big Data is more than Healthcare but I was confining my comments to that specific area where there has been some traction on organizations wanting to leverage it and pushback from some because of privacy/security concerns.

    You are right: culture and ethos will have as much to do with it as legislation and best practice in allaying the fears of some that organizations will use Big Data to become Big Brother(s).

  7. a very enjoyable read JP – thanks. I’ve been looking at Big Data from the sensors and machine learning side of things and this gave me some new perspective. as you often do :)

  8. JP great read in both substance and tone – well done! Recently posted on role of Open Innovation through competitive communities wrt to Big Data – http://info.topcoder.com/blog/bid/50281/Why-Crowdsourcing-and-Open-Innovation-Can-Rule-Big-Data – Plz let me know your thoughts on the prospects of those two worlds working well together to create better filters, to harmonize seemingly disperse data sets to create value – would love your thoughts! Thx JP

  9. @deepak I was speaking at a Big Data conference some months ago when one of my heroes, John Perry Barlow, was on stage in the session before me. He made a point of talking about the social/conventional changes that will be needed, a redefinition of privacy, where the focus moves from regulating access to one of regulating usage and purpose of use.
