Google has unveiled a system that attempts to pinpoint where a photograph was taken by analysing the image, as the internet group continues to experiment with advanced “machine learning” technologies.
Though still at an early stage, the Californian company’s system is another example of how Silicon Valley groups are making giant strides in artificial intelligence, using the ability to crunch huge amounts of data and spot patterns to develop capabilities far beyond those of the human brain.
Facebook, IBM and a number of start-ups are working on AI features to power devices such as digital assistants, healthcare apps that can diagnose medical conditions and the software that can power driverless cars.
Google’s latest experiment attempts to solve a task that most humans find difficult: looking at a random picture and working out where it was taken.
Humans are able to make rough guesses about where a shot was taken based on clues in the picture, such as the type of trees in the background and the architectural style of buildings. This task has proven beyond most computer systems.
This week, Tobias Weyand, a computer vision specialist at Google, unveiled a system called PlaNet that is able to work out where a photograph was taken by analysing the pixels it contains.
“We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learnt subtle cues of different scenes that are even hard for a well-travelled human to distinguish,” Mr Weyand told MIT Technology Review, which first reported the news.
His team divided the world into a grid containing 26,000 squares — each one representing a specific geographical area.
For every square, the scientists created a database of images taken from the internet that could be identified by their “geolocation”, the digital signatures that show where many photographs were taken. This database was made up of 126m images.
Using this information, the team taught a neural network, a computer system modelled on how layers of neurons in the brain interact, to assign each image to a specific square in the grid.
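To illustrate how a geotag can become a label that a neural network learns to predict, here is a minimal sketch in Python. It assumes a uniform grid of 130 rows by 200 columns (26,000 cells of equal angular size), which is purely illustrative and is not claimed to be how Google's team divided the world. The names `cell_index` and `cell_centre` are hypothetical.

```python
# Illustrative only: a uniform latitude/longitude grid, NOT the PlaNet partitioning.
ROWS, COLS = 130, 200  # 130 * 200 = 26,000 cells (assumed split)


def cell_index(lat, lon, rows=ROWS, cols=COLS):
    """Map a latitude/longitude in degrees to a single grid-cell label."""
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        raise ValueError("coordinates out of range")
    # Scale to [0, rows) and [0, cols); clamp so lat=90 and lon=180 stay in range.
    row = min(int((lat + 90.0) / 180.0 * rows), rows - 1)
    col = min(int((lon + 180.0) / 360.0 * cols), cols - 1)
    return row * cols + col


def cell_centre(index, rows=ROWS, cols=COLS):
    """Return the (lat, lon) at the centre of a cell, usable as a location guess."""
    row, col = divmod(index, cols)
    lat = (row + 0.5) * 180.0 / rows - 90.0
    lon = (col + 0.5) * 360.0 / cols - 180.0
    return lat, lon
```

Under this framing, each training photograph's geotag is converted to a cell label, and a predicted cell can be turned back into a rough location by taking the cell's centre.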
Mr Weyand’s team then fed 2.3m geotagged images from Flickr, the online photo library, into the system to see whether it could correctly determine their location.
Though the system is far from perfect, its performance is far better than that of humans. According to the team’s findings, the “median human localisation error” (the median distance between where a person guessed a picture was taken and where it was actually taken) is 2,320.75km. PlaNet’s median localisation error is 1,131.7km.
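A median localisation error of this kind could be computed roughly as in the Python sketch below. It assumes distances are measured along the Earth's surface with the haversine formula and a mean Earth radius of 6,371km. The article does not say how the team measured distance, and the function names are hypothetical.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (assumed)


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    # Clamp to 1.0 to guard against tiny floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def median_localisation_error_km(guesses, truths):
    """Median distance between guessed and true (lat, lon) pairs."""
    return median(
        haversine_km(g[0], g[1], t[0], t[1]) for g, t in zip(guesses, truths)
    )
```

The median is used rather than the mean so that a handful of guesses on the wrong side of the world do not dominate the figure.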
The news came as Google’s DeepMind group, its artificial intelligence arm based in the UK, announced its first push into medical technology on Wednesday.
The company has created the healthcare project following the acquisition of Hark, a health tech start-up which has created digital tools for patients.
One of its first ventures is “Streams”, an app that aims to give nurses and doctors timely information about patients to help detect cases of acute kidney injury.