In the 1940s, science-fiction author Isaac Asimov began writing his Foundation series of stories. At the heart of these is an early vision of what we now call data analytics: his invented science of “psychohistory”. This is the notion that scientists, using sophisticated mathematics, history and sociology, might accurately predict human events — wars, revolutions, election results — and their outcomes based on the behaviour patterns of large groups of people.
Might the day be dawning when such techniques can be used to fulfil the dream of every marketing guru and political pundit, and accurately predict how large swaths of humanity will behave by analysing social media?
SC2, a Florida-based company, has already worked with IBM, HP Enterprise and Belgium’s Luciad to develop software that monitors social media feeds for signs of potential conflict.
Robert Guidry, a former officer in the US joint special operations command and SC2’s chief executive, says that he is increasingly seeing conflicts and attacks being forecast on social media by the people planning them. He adds that some insurgent groups, such as Isis, are even “unashamed to declare what they are going to do” on social media.
SC2 says its clients are governments and large commercial organisations. Military scenarios feature heavily in videos on the company’s website that demonstrate how its technology is used.
“We can develop models so that when an idea or concept [based on a calculable set of conditions] reaches a particular threshold of volume, or sentiment, or geographic concentration of an idea or theme, we can send an alert,” Mr Guidry claims.
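The kind of threshold alert Mr Guidry describes can be illustrated with a minimal sketch. It does not represent SC2's software: the `Post` fields, the `check_alert` function and every threshold value are hypothetical, and the sketch assumes each post has already been tagged with a theme, a sentiment score and a region by some upstream step.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Post:
    theme: str        # topic label assigned upstream (hypothetical)
    sentiment: float  # assumed scale: -1.0 (hostile) to 1.0 (positive)
    region: str       # coarse location tag, assumed to be available


def check_alert(posts, theme, min_volume=100, max_sentiment=-0.5,
                min_concentration=0.6):
    """Return the reasons an alert would fire for one theme (illustrative thresholds)."""
    matching = [p for p in posts if p.theme == theme]
    if not matching:
        return []

    reasons = []

    # Volume: enough posts mention the theme.
    if len(matching) >= min_volume:
        reasons.append("volume")

    # Sentiment: the average tone is at or below the hostile cut-off.
    mean_sentiment = sum(p.sentiment for p in matching) / len(matching)
    if mean_sentiment <= max_sentiment:
        reasons.append("sentiment")

    # Geographic concentration: share of posts coming from the single busiest region.
    top_region_count = Counter(p.region for p in matching).most_common(1)[0][1]
    if top_region_count / len(matching) >= min_concentration:
        reasons.append("geographic concentration")

    return reasons


# Made-up sample data, not real posts.
sample = [Post("protest", -0.8, "north")] * 3 + [Post("protest", -0.6, "south")]
print(check_alert(sample, "protest", min_volume=4))
```

In this sketch any single crossed threshold is reported; a real system would presumably tune the cut-offs and combine the signals in a more sophisticated way.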
He also says that the vast cascade of social media commentary on the 2016 US presidential election is beginning to yield more accurate predictions of who is going to win. “Polling is dead and just doesn’t know it yet,” he says. “People provide more guarded responses in polling than they do in social media. In social media, people just spout without thinking.”
Not everyone in the data analysis field agrees. Jake Hofman, senior research scientist with Microsoft Research, says his studies suggest that insights derived from social media might not be the best source of information for predicting future human behaviour, and particularly not the results of elections.
“Things are less predictable in social [media] systems than we thought they might be,” he says.
Earlier this year, he and three colleagues published research comparing views expressed on Twitter during the 2012 US presidential election with traditional polling, and found the two were not comparable.
“Most existing research counts each of these [social media] engagements independently, ignoring user identity information. If this were a survey, it would be the equivalent of allowing users to respond as many times as they want,” they concluded.
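The researchers' point about counting engagements independently can be shown with a small worked example. This is a sketch under stated assumptions, not the method used in the study: the function names and the sample data are invented, and it treats a mention of a candidate as support, which real analysis could not take for granted.

```python
from collections import Counter, defaultdict


def share_by_engagement(tweets):
    """Count every tweet separately, ignoring who wrote it."""
    counts = Counter(candidate for _user, candidate in tweets)
    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


def share_by_user(tweets):
    """Give each user one 'vote': the candidate they mention most often."""
    per_user = defaultdict(Counter)
    for user, candidate in tweets:
        per_user[user][candidate] += 1

    counts = Counter()
    for user_counts in per_user.values():
        # Ties go to whichever candidate the user mentioned first (a simplification).
        counts[user_counts.most_common(1)[0][0]] += 1

    total = sum(counts.values())
    return {c: n / total for c, n in counts.items()}


# Made-up data: one prolific user backs A, two quieter users back B.
tweets = [("u1", "A")] * 8 + [("u2", "B"), ("u3", "B")]
print(share_by_engagement(tweets))  # A leads on raw tweet volume
print(share_by_user(tweets))        # B leads once each user counts once
```

With this sample, candidate A takes 80 per cent of tweets but only one of the three users, which is the survey equivalent of letting one respondent answer eight times.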
PlaceSpeak is an online public consultancy based in Vancouver which operates across Canada and the US. Colleen Hardwick, its chief executive, says the key to harvesting accurate opinions on politics and public policy is to verify the identity and location of the people whose social media views you want to gather.
Without robust authentication tools, she says, people trying to use analysis of such information to make accurate predictions could well see their results influenced by online behaviours that are “designed to skew and distort public opinion”. Such activities, she says, include: trolling (threatening behaviour used against opponents to deter them from expressing conflicting views); “sock puppets” (accounts set up to hide users’ identities); and “astroturfing” (orchestrated political campaigns that appear to be from genuine members of the public). These could be used to create an impression of wide grassroots support for views on social media that might not reflect reality.
Ms Hardwick also cites the instant polls on some news websites that followed the recent US presidential debates as an example of how not to find an accurate reflection of public opinion.
What is essential, she says, are systems that allow citizens to prove their identity while ensuring their private data cannot be revealed, so that no one’s privacy is jeopardised by taking part in research.
Paul Russell, director of analytical solutions at information services provider Experian, says turning social media insights into useful actions is no simple task. “Data can help, and is helping, inform our approach to some of society’s biggest problems: famine, disease, poverty and ineffective education. And it is providing useful input to the political processes that are focused on solving these problems.”
However, he says, the answers to such problems lie in the hands of governments rather than data analysis. “Data solutions to these problems are some way off.”