December 11, 2012 9:43 pm

Lack of data analysis worrisome

The “digital universe” continues to expand rapidly and will double every two years through 2020, but only a small fraction of this ‘big data’ is being analysed, which leaves a huge untapped opportunity for companies and other organisations.

According to a report published on Tuesday by IDC, the market research firm, a large ‘big data gap’ is opening up between the rapidly expanding digital universe – a measure of all the digital data created, replicated and consumed in a single year – and the small share of that data that is actually analysed. That data holds potential analytic value that could have major business, health and societal impact.

Only 0.5 per cent of that data is actually being analysed.

The reason, IDC suggests, is that the majority of new data being generated is unstructured – for example video surveillance footage, audio files or social media – meaning we know little about the data until it is characterised and tagged. The industry currently faces a shortage of the talent and technology needed to tag and analyse unstructured data.

IDC’s sixth annual report on the “Digital Universe,” sponsored by EMC, the storage market leader, notes that today the digital universe comprises “the images and videos on mobile phones uploaded to YouTube, digital movies populating the pixels of our high-definition TVs, banking data swiped in an ATM, security footage at airports and major events such as the Olympic Games, subatomic collisions recorded by the Large Hadron Collider at CERN, transponders recording highway tolls, voice calls zipping through digital phone lines, and texting as a widespread means of communications.”

“With the rise of big data awareness and analytics technology, the digital universe in 2012 has taken on the feel of a tangible geography,” notes the report, “a vast, barely charted place full of promise and danger.”

This untapped value could be found in patterns in social media usage, correlations in scientific data from discrete studies, medical information intersected with sociological data, faces in security footage, and so on.

However, even on a generous estimate, the information that is “tagged” accounts for only about 3 per cent of the digital universe in 2012, and the information that is analysed for just half a per cent. “Herein is the promise of ‘big data’ technology – the extraction of value from the large untapped pools of data in the digital universe,” says the report.

The report’s authors conclude that “our digital universe in 2020 will be bigger than ever, more valuable than ever, and more volatile than ever.”

Not surprisingly, IDC predicts that big data will be a big boon for the IT industry. “Web sites that gather significant data need to find ways to monetise this asset. Data scientists must be absolutely sure that the intersection of disparate data sets yields repeatable results if new businesses are going to emerge and thrive. Further, companies that deliver the most creative and meaningful ways to display the results of big data analytics will be coveted and sought after.”

Among the report’s main findings:

 From 2005 to 2020, the digital universe will grow by a factor of 300, to more than 5,200 gigabytes for every man, woman, and child in 2020. From now until 2020, the digital universe will roughly double every two years.

 The investment in managing, containing, studying and storing the bits in the digital universe will only grow by 40 per cent between 2012 and 2020. As a result, the investment per gigabyte during that same period will drop from $2.00 to $0.20.

 Between 2012 and 2020, emerging markets’ share of the expanding digital universe will grow from 36 per cent to 62 per cent.

 A majority of the information in the digital universe, 68 per cent in 2012, is created and consumed by consumers – watching digital TV, interacting with social media, sending camera phone images and videos between devices and around the Internet, and so on. Yet enterprises have liability or responsibility for nearly 80 per cent of the information in the digital universe.

 Only a small fraction of the digital universe has been explored for analytic value. IDC estimates that by 2020, as much as 33 per cent of the digital universe will contain information that might be valuable if analysed, compared to 23 per cent today.

 By 2020, nearly 40 per cent of the information in the digital universe will be “touched” by cloud computing providers – meaning that a byte will be stored or processed in a cloud somewhere in its journey from originator to disposal.

 The proportion of data in the digital universe that requires protection is growing faster than the digital universe itself, from less than a third in 2010 to more than 40 per cent in 2020.

 The amount of information individuals create themselves – for example writing documents, taking pictures, downloading music – is far less than the amount of information being created about them in the digital universe.
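
The growth and cost figures above can be cross-checked with some simple arithmetic. The Python sketch below is illustrative only: the function names are hypothetical, and it assumes that total investment equals cost per gigabyte multiplied by data volume, which the report does not state explicitly. Doubling every two years from 2012 to 2020 implies roughly 16-fold growth, while the investment figures imply about 14-fold, so the two findings are broadly consistent.

```python
def growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth multiple after `years` for a quantity that doubles
    every `doubling_period_years` (hypothetical helper)."""
    return 2.0 ** (years / doubling_period_years)


def implied_volume_growth(spend_multiple: float,
                          cost_per_gb_start: float,
                          cost_per_gb_end: float) -> float:
    """Data-volume multiple implied by a change in total spend and in
    cost per gigabyte.

    Assumption (not stated in the report): spend = cost_per_gb * volume,
    so volume_end / volume_start = spend_multiple * cost_start / cost_end.
    """
    return spend_multiple * cost_per_gb_start / cost_per_gb_end


# "Double every two years" between 2012 and 2020: four doublings.
doubling_2012_2020 = growth_factor(2020 - 2012)

# Investment grows 40 per cent; cost per gigabyte falls from $2.00 to $0.20.
implied_2012_2020 = implied_volume_growth(1.40, 2.00, 0.20)

print(f"Growth from doubling every two years: {doubling_2012_2020:.1f}x")
print(f"Growth implied by investment figures: {implied_2012_2020:.1f}x")
```

The small difference between the two multiples is what one would expect given that the report says the digital universe will “about” double every two years.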

Copyright The Financial Times Limited 2014.
