The “digital universe” continues to expand rapidly and will double every two years through 2020, but only a small fraction of this ‘big data’ is being analysed, which leaves a huge untapped opportunity for companies and other organisations.
According to a report published on Tuesday by IDC, the market research firm, a large ‘big data gap’ is opening up between the rapidly expanding digital universe, a measure of all the digital data created, replicated and consumed in a single year, and the portion of it that is actually being analysed. That data holds potential analytic value that could have major business, health and societal impact.
Only 0.5 per cent of that data is actually being analysed.
The reason, IDC suggests, is that the majority of new data being generated is unstructured – for example video surveillance footage, audio files or social media – meaning we know little about the data until it is characterised and tagged. The industry currently faces a shortage of the talent and technology needed to tag and analyse unstructured data.
IDC’s sixth annual report on the “Digital Universe,” sponsored by EMC, the storage market leader, notes that today the digital universe comprises “the images and videos on mobile phones uploaded to YouTube, digital movies populating the pixels of our high-definition TVs, banking data swiped in an ATM, security footage at airports and major events such as the Olympic Games, subatomic collisions recorded by the Large Hadron Collider at CERN, transponders recording highway tolls, voice calls zipping through digital phone lines, and texting as a widespread means of communications.”
“With the rise of big data awareness and analytics technology, the digital universe in 2012 has taken on the feel of a tangible geography,” notes the report, “vast, barely charted place full of promise and danger.”
This untapped value could be found in patterns in social media usage, correlations in scientific data from discrete studies, medical information intersected with sociological data, faces in security footage, and so on.
However, even on a generous estimate, only about 3 per cent of the information in the digital universe in 2012 is “tagged”, and just half a per cent is analysed. “Herein is the promise of ‘big data’ technology – the extraction of value from the large untapped pools of data in the digital universe,” says the report.
The report’s authors conclude that “our digital universe in 2020 will be bigger than ever, more valuable than ever, and more volatile than ever.”
Not surprisingly, IDC predicts that big data will be a big boon for the IT industry. “Web sites that gather significant data need to find ways to monetise this asset. Data scientists must be absolutely sure that the intersection of disparate data sets yields repeatable results if new businesses are going to emerge and thrive. Further, companies that deliver the most creative and meaningful ways to display the results of big data analytics will be coveted and sought after.”