Big Data is THE biggest buzzword around at the moment, and I guess it makes sense to start my new ‘The Big Data Guru’ column with a post that goes back to basics and establishes what big data really is, what it isn’t and why it matters to everyone.

One thing is certain: big data will impact everyone’s life. Having said that, I also think that the term ‘big data’ is not very well defined and is, in fact, not well chosen. I also think the term is completely over-hyped but this just comes with the territory (software vendors and consulting companies need these buzzwords to generate interest and sell new products and services). Let me use this article to explain what’s behind the massive ‘big data’ buzz and hopefully demystify some of the hype.

Introduction to Big Data

Basically, big data refers to our ability to collect and analyze the vast amounts of data we are now generating in the world. The ability to harness these ever-expanding amounts of data is completely transforming our ability to understand the world and everything within it. The advances in capturing and analyzing big data allow us, for example, to decode human DNA in minutes, find cures for cancer, accurately predict human behavior, foil terrorist attacks, pinpoint marketing efforts, prevent diseases and so much more.

You might ask: So what is new here? Haven’t companies and organizations captured and analyzed data for a long time? Yes, but there are three things that are changing at the moment and are making the phenomenon of ‘big data’ real:

The rate at which we are generating new data is frightening – I call this the ‘datafication’ of our world.

We generate more complex forms of data.

Our ability to analyze data has been transformed in recent years.

The Complete Datafication of Our World

Day after day our world is filled with more and more data and the pace of the data growth is accelerating week by week. Data on every aspect of our life is now being generated. Here are just some examples that illustrate what I mean by the datafication of our world:

We increasingly leave digital records of our conversations: Emails are stored in corporate systems, our social media updates are filed and phone conversations are digitized and stored.

More and more of our activities are digitally recorded: Most things we do in a digital world leave a data trail. For example, our browser logs what we are searching for and what websites we visit, and websites log how we click through them, as well as what and when we buy, share or like something. When we read digital books or listen to digital music, the devices collect (and share) data on what we are reading and listening to and how often we do so. And when we make payments using, for example, credit cards, the transactions are logged.

A lot of photos and videos are now digitally captured and stored. Just think of the millions of hours of CCTV footage captured every day. In addition, we take more videos on our smart phones and digital cameras, leading to around 100 hours of video being uploaded to YouTube every minute and something like 200,000 photos added to Facebook every 60 seconds.

Companies and organisations are creating vast repositories of data, keeping a digital record of everything that is going on: Just think of all the data generated daily in our financial systems, stock control systems, ordering systems, sales transaction systems and HR systems. These data repositories are growing by the minute.

We generate data using an ever-growing number of smart devices and sensors: Our smart phones track where we are and how fast we are moving, there are sensors in our oceans to track temperatures and currents, there are sensors in our cars that monitor our driving, and there are sensors on packaging and pallets that track goods as they are shipped along supply chains. Smart watches, Google Glass and pedometers collect data. For example, I wear an Up band that tells me how many steps I have taken and how many calories I have burnt each day, as well as how well I have slept each night. Many devices are now internet-enabled so that they generate and share data by themselves. Smart TVs and set-top boxes, for example, are able to track what you are watching and for how long, and can even detect how many people are sitting in front of the TV.

By Bernard Marr