How smart do you want your smartphone to be? In designing Cortana, the voice-activated “virtual assistant” built into its mobile software, Microsoft is betting that most people are not yet ready to hand too much control of their lives to an artificial brain.
A soft-voiced presence with a slightly sassy attitude drawn from a video game character, Cortana is quite capable of reading your email to see if you have a flight coming up, then using the information to tell you when it is time to leave for the airport.
But Microsoft will not let “her” take the liberty. Instead, the system asks permission, like a discreet human assistant who does not want to assume too much — a step that also helps to confirm the software is on the right track in anticipating your wishes.
“At the moment it’s progressive intelligence, not autonomous intelligence,” says Marcus Ash, group program manager for Cortana, which is enabled on phones with the Windows operating system, including Microsoft’s Lumia devices. People do not want to be surprised by how much their phones are starting to take over, he says: “We made an explicit decision to be a little less ‘magical’ and a little more transparent.”
Niceties like this could soon be a thing of the past. The race is on between some of the biggest tech companies to come up with omniscient guides capable of filtering the complex digital world.
Like the browser wars of the 1990s, the outcome will help to set the balance of power in the next phase of the internet. By channelling attention and making decisions on behalf of their users, virtual assistants will have enormous power to make or break many other businesses. Many companies — from carmakers to entertainment concerns — aim to develop voice-powered assistants of their own to keep their customers loyal. But the future may belong instead to a handful of all-knowing assistants, much as Google’s search engine managed to suck in so many of the world’s queries on the web.
Though it has not reached the point of mass adoption yet, the potential of this new form of artificial intelligence has all the tech companies scrambling.
Along with Microsoft, they include Apple, with the Siri question-and-answer service on its iPhones and iPads; Google, whose Google Now service tries to anticipate its users’ information needs; and Facebook, Amazon and Baidu, which are experimenting with mobile and desktop applications.
The tech world is in the grip of an “AI spring”, says Oren Etzioni, head of the artificial intelligence research institute backed by Microsoft co-founder Paul Allen. Of all the new fronts this has opened up, the development of virtual assistants has become “one of the most exciting,” he says.
The consequences will stretch far beyond the ability to speak a question into your smartphone and get an immediate answer, or to have warnings about heavy traffic “pushed” to you at the time you normally leave for work. If the virtual assistants catch on, they are likely to take over many of the decisions that litter everyday life as they try to anticipate their users’ needs. And they are likely to have minds of their own.
“It’s really a meta-layer of machine intelligence that’s sitting on top of all the services,” says Mitch Lasky, a partner at Silicon Valley venture capital firm Benchmark and an investor in mobile software company Cyanogen.
“If I ask Siri to find me a taxi, it may not go to Uber — and I may not be able to force it to.” For the many companies that have come to rely on customers accessing their services over smartphones, the consequences could be significant.
Given the limitations that virtual assistants have shown up to now, it might seem premature to start worrying about a time when they suddenly take it on themselves to pre-empt their users’ needs and desires.
Apple’s launch of Siri in 2011 was a catchy demonstration of advances in speech recognition software and question-and-answer systems. But it also served as a demonstration of the limits of the technology, as users quickly found that the system could only handle certain types of query.
“The industry is still suffering from the hangover from the Siri let-down,” says Tim Tuttle, chief executive of Expect Labs, a company that makes a voice-activated assistant that can be used to power other companies’ applications. He and other experts in the field say that the technology has come a long way in a short space of time.
Virtual assistants rely on a number of different technologies. One involves speech recognition. Google says it has reduced the error rate for recognising words in its own mobile app to less than 8 per cent — a level at which it says the service has become a practical alternative to entering text.
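To make the error-rate figure concrete, here is a minimal sketch of how a word error rate is commonly computed, assuming Google's number refers to the standard metric: the word-level edit distance between what was said and what was recognised, divided by the number of words spoken. The function name and the sample sentences are illustrative assumptions, not anything Google has published.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word was dropped
            insertion = curr[j - 1] + 1  # extra word was recognised
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One wrong word out of six spoken (hypothetical example sentences).
rate = word_error_rate("book me a table for three", "book me a table for tree")
print(f"{rate:.1%}")
```

On this measure, an error rate below 8 per cent means fewer than one word in twelve needs correcting.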
Apple has also made considerable headway in overcoming the early disappointment by adding more capabilities to Siri and helping users better understand what it can and cannot do.
A second front has involved the predictive technology for anticipating what a user will want to know next. This draws on contextual data — aspects such as the location and time of day — as well as personal information. Knowing what other people have found useful is also valuable, says Aparna Chennapragada, a director of product management for Google’s mobile app: “For most people in your situation, what are the top five things you need?”
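The question Ms Chennapragada poses can be sketched as a simple counting exercise. The example below is only an illustration under loud assumptions: the usage log, the context fields (place and time of day) and the function name are all hypothetical, and real systems are certainly more elaborate.

```python
from collections import Counter

# Hypothetical log of what other users found useful: (place, time of day, item).
USAGE_LOG = [
    ("airport", "morning", "flight status"),
    ("airport", "morning", "flight status"),
    ("airport", "morning", "flight status"),
    ("airport", "morning", "gate map"),
    ("airport", "morning", "gate map"),
    ("airport", "morning", "taxi booking"),
    ("office", "morning", "calendar"),
]


def top_items(place, time_of_day, log, limit=5):
    """Return the items most often used by people in the same context."""
    counts = Counter(
        item
        for logged_place, logged_time, item in log
        if logged_place == place and logged_time == time_of_day
    )
    return [item for item, _ in counts.most_common(limit)]


print(top_items("airport", "morning", USAGE_LOG))
```

Personal information, which the sketch leaves out, would then be used to reorder or filter this shared list for an individual user.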
The tech companies do not release much information about the extent to which the virtual assistants are catching on. But what there is suggests that users are starting to show interest — even though, as Mr Etzioni cautions, it is probably too early to call it “an inflection point” for the technology.
Chinese search company Baidu, for instance, puts the proportion of its searches that are conducted by voice at nearly 10 per cent. And Google says the number of times its users spoke a search query into their smartphones doubled last year, though it does not disclose a figure.
The pick-up in use has some predicting that it will not be long before talking to gadgets becomes second nature — particularly as computer intelligence spreads to many new types of device where entering text is impractical, from smartwatches to televisions and cars.
Voice interactions like these “will be the gateway for all applications and the way into everything,” says Mr Tuttle. If virtual assistants live up to the industry’s hopes, then whoever has control of this new gateway will end up with considerable power.
Bespoke or all-round assistance?
A key question determining the outcome will be whether the general-purpose digital assistants of the tech giants become dominant in steering people through this new world, or whether users will gravitate to more narrowly focused intelligent guides to help them navigate specialist applications.
The big tech companies “want one intelligent interface to rule them all”, says Mr Tuttle. But that “is not what all the other app makers and device makers want at all.”
Some companies are already starting to experiment with embedding virtual assistants inside their own mobile apps. In the US, pizza chain Domino’s has even run a television advertising campaign to promote “Dom”, a digital assistant that guides customers through the process of ordering a pizza on their smartphones.
Like others trying out the technology, Domino’s hopes that personality will play a part in drawing in customers: “He’s fun, but very focused on the pizza ordering experience,” Dennis Maloney, head of multimedia marketing, said when Dom was unveiled to the world.
“Consumers want to interact with brands on their phones,” he predicted, and that would lead general-purpose assistants such as Siri to “redirect” users to specialist guides like Dom when they want to deal with a particular company.
“The individual app makers can always understand and navigate content in their own apps better,” adds Mr Tuttle. His company’s technology is already being used to help TV viewers search through video-on-demand services and will soon be embedded in cars, whose makers he describes as loath to surrender their customers to Google and Apple.
But as ways of interacting with smartphones and the many new devices of the “internet of things” change, that kind of brand loyalty may start to erode.
Mr Etzioni describes the kind of instruction that the next generation of personal assistants will be asked to handle: “Book me a table for three in an excellent restaurant downtown with easy parking.”
To fulfil that desire, the technology would need not only a high level of natural language recognition, but the ability to draw information from different services (a restaurant review site, a maps app) and issue instructions through a reservation system.
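The composition step can be sketched in a few lines, leaving the language understanding aside. Everything here is an assumption for illustration: the dictionaries stand in for a review site and a maps app, `reserve` stands in for a reservation system, and the restaurant names and scores are invented.

```python
from typing import Optional

# Hypothetical stand-in for a restaurant review site: name -> score out of 5.
REVIEW_SCORES = {"Luigi's": 4.7, "Noodle Bar": 4.8, "Grill House": 3.9}

# Hypothetical stand-in for a maps app: name -> location details.
PLACE_INFO = {
    "Luigi's": {"area": "downtown", "easy_parking": True},
    "Noodle Bar": {"area": "downtown", "easy_parking": False},
    "Grill House": {"area": "downtown", "easy_parking": True},
}


def reserve(name: str, party_size: int) -> str:
    """Hypothetical stand-in for a reservation system."""
    return f"Reserved a table for {party_size} at {name}"


def book_table(area: str, party_size: int, min_score: float = 4.5) -> Optional[str]:
    """Combine the three services to handle one spoken request."""
    candidates = [
        name
        for name, info in PLACE_INFO.items()
        if info["area"] == area
        and info["easy_parking"]
        and REVIEW_SCORES.get(name, 0.0) >= min_score
    ]
    if not candidates:
        return None  # nothing satisfies every part of the request
    best = max(candidates, key=lambda name: REVIEW_SCORES[name])
    return reserve(best, party_size)


print(book_table("downtown", 3))
```

The user sees only the confirmation. Which review site, maps app and reservation system were consulted is decided entirely inside `book_table`, which is where the company running the assistant would hold its sway.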
This kind of “more sophisticated service composition” could start to appear on smartphones in a little more than a year from now, he adds. This points to a post-app world, where pieces of information and service components that currently reside in standalone apps are stitched together to resolve life’s many small, daily problems.
For users, it would happen invisibly: they might not know, or care, which services had been called on to handle their requests — though the company determining which apps to call on to fulfil their needs would hold considerable sway.
“The acute need to find the right app will become secondary to the need to complete the task,” says Dag Kittlaus, one of the creators of Siri. Mr Kittlaus is now co-founder and chief executive of Viv Labs, a start-up that is trying to build an assistant to handle tasks like this.
“Intelligent assistants will likewise break free of the notion of an app,” Mr Kittlaus predicts. “Your car can be intelligent. Your refrigerator can be intelligent. Your phone can be intelligent.” As a result, there will be no need to find an app, tap to open it and search inside for what you need: “If you need something, you will simply ask for it.”
Like many ideas in tech, combining services on the fly like this is not new. At the height of the last tech boom, Microsoft imagined achieving something similar with a proprietary technology it called Hailstorm. The idea, coming at the peak of the company’s power, terrified many in the tech world, who saw it as a way to force others to accept Microsoft’s terms of trade for all digital interactions.
Outsiders such as Mr Etzioni now credit the software company with having developed a more open view of the world. “It’s an underdog thing — we’re a little bit more humbled than Microsoft was in the past,” says Mr Ash. “Cortana is not disruptive from the point of view of services,” he adds, but is designed to act as an open platform.
Pulling in data
The terms on which app makers would allow their services to interact have yet to be determined, but the first elements of the technology to make this possible are starting to emerge. Last month, Google released an experimental API that lets other companies insert their own content into its Google Now service, so that their information can also be “pushed” at users when it is most likely to be needed.
Most of the big tech companies are also working on methods for “deep linking” content inside standalone apps to make it possible to automatically find — and use — the information in other contexts. This may not turn the world of closed apps into the wide-open world of the hyperlinked web, but it still points to a time when the walls between apps will start to break down.
In this new world, information would flow continuously, reflecting the nature of computing when users are surrounded by always-connected devices, says Google’s Ms Chennapragada: the idea of opening a web page to read something would feel very old-fashioned.
Feeding the process would be new types of real-time and “actionable” information — the kind that helps virtual assistants make quick decisions as they assume more of their users’ everyday tasks. Offering up the right nugget of information or the right offer at the right time will be key to oiling the wheels of this new economy.
For many publishers and ecommerce companies this would be a very different world, requiring entirely new methods to reach customers. But if the Cortanas and Siris end up ruling the digital roost, those companies are likely to have little choice.