For Hasso Plattner, co-founder and chairman of SAP, the world’s largest business software maker by sales, unlocking the full potential of big data is all about speed.
Mr Plattner believes that complex business analysis involving both structured data (such as point-of-sale data) and unstructured data (such as social media streams and video), which today can take a day or more to process, will soon run in less than a second thanks to in-memory computing, a technology he helped to invent.
Mr Plattner and SAP have pioneered the concept of in-memory computing – storing data in main memory that is directly accessible by the processor or central processing unit. Most databases use much slower disk storage.
In SAP’s case, the database technology and tools behind in-memory computing are known as Hana (Hasso’s New Architecture) and, in the face of some doubters, SAP is betting it can remake its business around it.
“I came to the conclusion a while ago that when we build next-generation enterprise systems (running against both structured and unstructured data), they have to have a response time of around a second,” said Mr Plattner in a recent interview.
He explains: “We have customers that have [analytics] programs that [take] longer than a day [to run]. In most cases, we could cut this down to under 10 seconds, and with linear extrapolation [of data], we can see that in the next three years we might be able to achieve every single thing we do today in around a second.
“That sounds ridiculous, but it’s technically possible. Basically, for a human eye or for a human reaction, this is real time. It is a question of investment, but it’s not outrageous… several million, or probably two-digit millions for a very large company, but much less than what they pay for their computers today.”
While in-memory computing is technically challenging, the concept behind it is relatively simple. As Alan Bowling, chairman of the independent UK & Ireland SAP User Group, explained recently to Computer Weekly: “Traditionally, data will be placed in storage then, when needed, will be accessed and acted upon in the computer’s memory.
“This results in a natural bottleneck that reduces speed – even with the fastest SSD [solid-state] hard drives, there will still be a gap where data must be accessed, transferred to memory and then returned so the next batch of data can be used.”
As volumes of data increase, so the time needed simply for access, let alone analysis, increases too. In-memory computing takes advantage of a better understanding of how data are shaped and stored, the constantly falling price of memory and the related greater affordability of faster solid-state memory to do away with the traditional concept of storage.
“Instead, data [are] stored directly in the computer’s memory. As a result, when [they] need to be analysed [they are] already available and can be accessed near-instantaneously,” Mr Bowling said.
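The contrast Mr Bowling describes can be illustrated with a small Python sketch. It is only an illustration of the general idea, not a description of how Hana works: the CSV file stands in for disk storage, and the sample rows and the names `total_from_disk`, `InMemoryStore` and `demo` are hypothetical, chosen for this example.

```python
import csv
import os
import tempfile


def write_sample_sales(path, rows):
    """Write (region, amount) rows to a CSV file that stands in for disk storage."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(rows)


def total_from_disk(path, region):
    """Traditional pattern: go back to storage and re-read the data for every query."""
    total = 0
    with open(path, newline="") as f:
        for row_region, amount in csv.reader(f):
            if row_region == region:
                total += int(amount)
    return total


class InMemoryStore:
    """In-memory pattern: load the data once, then answer every query from memory."""

    def __init__(self, path):
        self.amounts_by_region = {}
        with open(path, newline="") as f:
            for row_region, amount in csv.reader(f):
                self.amounts_by_region.setdefault(row_region, []).append(int(amount))

    def total(self, region):
        # No file access here: the data are already held in memory.
        return sum(self.amounts_by_region.get(region, []))


# Hypothetical sample data, invented for this sketch.
SAMPLE_ROWS = [("north", 120), ("south", 80), ("north", 45), ("east", 60)]


def demo():
    """Run the same query both ways and return the two answers."""
    fd, path = tempfile.mkstemp(suffix=".csv")
    os.close(fd)
    try:
        write_sample_sales(path, SAMPLE_ROWS)
        store = InMemoryStore(path)
        return total_from_disk(path, "north"), store.total("north")
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

Both approaches return the same answer. The difference is that the first pays the cost of reaching storage on every query, while the second pays it once and serves every later query from memory.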
That means instead of analysing data that may be days or weeks old, or settling for a subset of the full data set because of the time it would take to analyse the full set, companies can analyse and make decisions almost immediately.
The power of in-memory computing also allows companies to process much more complex data, including unstructured data, rather than having first to reformat them into a structured database.
The need for this type of computing power and performance was highlighted in a recent report from Aberdeen Research, which noted that business data were growing at an average of 36 per cent a year. “Organisations of all sizes, across all countries and industries, have to face the challenge of scaling up their data infrastructure to meet this new pressure,” Aberdeen noted. In a 2011 study on the state of big data, Aberdeen’s research showed that organisations that had adopted in-memory computing were not only able to analyse larger amounts of data at faster speeds than their competitors – they were also literally orders of magnitude better.
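To give a sense of what growth of 36 per cent a year implies, the short Python sketch below works through the arithmetic. It assumes, purely for illustration, that the rate stays constant and compounds annually; Aberdeen's figure is an average, so real growth would vary. The function names are hypothetical.

```python
import math


def growth_factor(annual_rate, years):
    """How many times larger the data set is after `years` of compound growth."""
    return (1 + annual_rate) ** years


def doubling_time(annual_rate):
    """Years needed for the data set to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + annual_rate)


if __name__ == "__main__":
    rate = 0.36  # the 36 per cent average annual growth cited by Aberdeen
    print(f"Growth factor after 5 years: {growth_factor(rate, 5):.2f}")
    print(f"Doubling time in years: {doubling_time(rate):.2f}")
```

Under those assumptions, data volumes would more than double in under two and a half years and more than quadruple in five.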
“This technology is crucial for combating big data issues that require advanced data manipulation around increasingly performance-intensive applications,” says Charlotte Dunlap, a senior analyst at Current Analysis, an analytics provider.
SAP says its customers are finding a range of uses for the technology, from real-time analysis of the human genome in cancer treatment and the mapping of stars to monitoring changing customer sentiment on social media and undertaking predictive analysis.
For example, Colgate-Palmolive, the toiletries company, is using Hana to improve profitability analysis reporting, enabling the company to complete intensive calculations on growing product data volumes in record time.
Similarly, Centrica, the UK energy group, is using Hana to perform complex smart energy meter data analysis and to help the company’s customers become more energy efficient.
One of Mr Plattner’s favourite examples is Nongfu Spring, a Chinese spring-water distributor. “China has a huge transportation problem because it is as big as Europe,” he explains. “Nongfu’s transportation optimisation program used to run for several hours. It’s now down to 3.5 seconds. Nongfu [has] saved 35 per cent [of its] transportation costs.”
The leap in speed and performance claimed for Hana has made it one of SAP’s fastest-growing new products in years and has led other enterprise software makers, including Oracle, IBM, Software AG and, most recently, Microsoft with its Hekaton project, to follow suit.
It seems in-memory computing is here to stay and will form an increasingly important part of companies’ efforts to take advantage of big data analytics.