As a teenager in the 1990s, Oriol Vinyals became Spain’s national champion at StarCraft. He liked the sci-fi strategy game — set in the 26th century in the depths of the Milky Way — because it requires more thought than most other shoot-’em-ups. “It’s not only about playing a lot of hours, it’s about strategic thinking,” says Mr Vinyals. “Even pre-university, it made me a bit more strategic about certain things in life.”
His strategic thinking paid off: after studying telecom engineering and maths in Barcelona, Mr Vinyals did a stint at Microsoft Research, gained a PhD in computer science at Berkeley, and began working for Google on artificial intelligence.
He has picked up StarCraft again, not as a player but to teach robots to play. It is the next gaming challenge for AI after a self-taught computer became the world’s best player of Go.
Google’s DeepMind unit is teaming up with Blizzard Entertainment, StarCraft’s developer, to allow AI researchers to learn from millions of replays of previous battles.
One objective is to create an AI system good enough to beat the best human players, just as Lee Se-dol, the Go world champion, was bested by DeepMind’s AlphaGo last year. Yet the ultimate goal is to deploy the learning behind that feat into real-world applications rather closer to home than the Koprulu sector in StarCraft.
“We are trying to understand people and how our brains work,” says Jacob Repp, a principal software engineer at Blizzard. “If we can get this high-quality data stream — the raw human input that someone did this and it created this result — that’s really useful data for people in behavioural research.”
StarCraft II, the “real-time strategy” game on which the research is being conducted, presents an appealing challenge to AI researchers.
Unlike chess or Go players, StarCraft players have imperfect information. This “fog of war” means players (real or virtual) must plan, make decisions or respond to actions that will only have consequences minutes later. The result, as DeepMind’s researchers put it, is a “rich set of challenges in temporal credit assignment and exploration”.
Blizzard already uses neural networks to assess a player’s skill, based on inputs from their keyboard and mouse, how they marshal the hundreds of units in their armies, and how efficiently they play. These signals can be used to make the game itself more fun or balance opponents evenly.
For the AI to play StarCraft, though, it has to be able to “see” the 3D map inside the game and interpret it quickly and accurately.
DeepMind’s first test involved taking neural networks and AI agents trained on simpler Atari games and dropping them into StarCraft. Even without many further instructions, the Atari-trained AI could click around the map, move the camera and deploy units. “It does work to some extent,” says Mr Vinyals.
Before moving to DeepMind, Mr Vinyals developed features for image search and Gmail’s “smart reply”, which suggests relevant responses based on the content of any given email. That team also worked on voice recognition, which involves the AI remembering how different people talk so that it can recognise the sounds next time.
“In StarCraft you have these problems as well,” Mr Vinyals says. A player might see one of an opponent’s scouts before losing sight of it again. For AI, remembering that encounter and understanding that it might indicate where the enemy is building a base involves neural networks with “long short-term memory”. Computers have been able to remember data for decades, Mr Vinyals explains, but this kind of memory requires not just storing but acting on the information later.
“In StarCraft, this is critical but it’s very subtle, connecting the past with the future,” he says. “The cause and effect is very hard because there are many things happening in the game.”
It is still early days for StarCraft’s robotic players. At a recent contest in Seoul, Song Byung-gu, a professional StarCraft player, easily beat four AI bots in less than half an hour. Mr Song told the MIT Technology Review that the bots took a different approach to the game but conceded that their defensive play was “stunning at some points”.
Despite his experience with StarCraft, Mr Vinyals says DeepMind’s research assumes no prior knowledge. With so-called reinforcement learning, neural networks simply interpret the raw signals they are fed: in this case, the replays of StarCraft battles.
Despite not playing much for the best part of two decades, the former Spanish champion is confident about his StarCraft skills. “Could the AI beat me yet? I don’t think so,” he laughs.