Google engineers have built a photo recognition system that can outperform the most well-travelled humans
Humans typically struggle to determine where generic photos were taken just by looking at them. If shown a picture of a white sandy beach, for example, they might assume it was taken in the Caribbean when in fact it was taken in the Maldives.
While many humans need a landmark to refer to - such as the Statue of Liberty or Machu Picchu - before they can pinpoint a location, Google's PlaNet system, which is still in its early stages, does not have this problem.
Tobias Weyand and James Philbin, a pair of software engineers at Google, teamed up with developer Ilya Kostrikov to build the PlaNet system. "We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learnt subtle cues of different scenes that are even hard for a well-travelled human to distinguish," Weyand told MIT Technology Review.
Weyand's team divided the world into a grid made up of 26,000 squares of varying size, depending on the number of images taken in that location. Each square represented a specific geographical area.
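To make the idea of variable-sized squares concrete, here is a minimal sketch in Python. It is an illustration only, not the team's actual partitioning code: it assumes a plain latitude/longitude box that is split into four quarters whenever it holds more than `max_photos` images, and the names `subdivide`, `in_cell`, `max_photos` and `max_depth` are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]               # (latitude, longitude) in degrees
Cell = Tuple[float, float, float, float]  # (lat_min, lat_max, lon_min, lon_max)

WORLD: Cell = (-90.0, 90.0, -180.0, 180.0)


def in_cell(point: Point, cell: Cell) -> bool:
    """True if the point lies in the half-open box covered by the cell."""
    lat, lon = point
    lat_min, lat_max, lon_min, lon_max = cell
    return lat_min <= lat < lat_max and lon_min <= lon < lon_max


def subdivide(points: List[Point], cell: Cell = WORLD,
              max_photos: int = 2, max_depth: int = 10) -> List[Cell]:
    """Split a cell into quarters until each holds at most max_photos points.

    Assumption for illustration: a simple lat/lon quadtree. Busy regions end
    up with many small cells, sparse regions with a few large ones.
    """
    inside = [p for p in points if in_cell(p, cell)]
    if len(inside) <= max_photos or max_depth == 0:
        return [cell]

    lat_min, lat_max, lon_min, lon_max = cell
    lat_mid = (lat_min + lat_max) / 2
    lon_mid = (lon_min + lon_max) / 2
    quarters = [
        (lat_min, lat_mid, lon_min, lon_mid),
        (lat_min, lat_mid, lon_mid, lon_max),
        (lat_mid, lat_max, lon_min, lon_mid),
        (lat_mid, lat_max, lon_mid, lon_max),
    ]
    cells: List[Cell] = []
    for quarter in quarters:
        cells.extend(subdivide(inside, quarter, max_photos, max_depth - 1))
    return cells


# Three photos close together and one far away.
photos = [(10.0, 10.0), (11.0, 11.0), (12.0, 12.0), (-50.0, -100.0)]
grid = subdivide(photos, max_photos=2)
```

In this toy example the cluster of three photos forces repeated splits around it, while the lone photo stays in one large cell covering a quarter of the map.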
The team then built a database of geolocated images from the internet and used each image's location data to determine the grid square in which it was taken. Overall, 126 million images were used.
Weyand and his team took 91 million of these images to teach a powerful neural network - a computer system modelled on the human brain - to work out the grid location using only the image itself. Ultimately they want to be able to put an image into the neural net and get out a particular grid location or at least a set of likely candidates. The neural network was validated with the remaining 34 million photos in the data set.
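The "set of likely candidates" can be pictured as a ranked list of grid squares with a probability attached to each. The sketch below assumes the network produces one raw score per square and that a softmax turns those scores into probabilities; the article does not describe the model's output layer, so this is an illustrative assumption, and `softmax` and `top_candidates` are hypothetical helper names.

```python
import math
from typing import List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_candidates(scores: List[float], k: int = 3) -> List[Tuple[int, float]]:
    """Return the k most probable grid squares as (index, probability) pairs."""
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Made-up scores for four grid squares, purely for illustration.
example_scores = [0.1, 2.0, -1.0, 1.5]
candidates = top_candidates(example_scores, k=2)
```

Here the single best guess is the first entry in `candidates`, and the rest of the list gives the runner-up squares.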
In order to test PlaNet, the Google team took 2.3 million geotagged images from online photo library Flickr and asked PlaNet to identify their location.
"PlaNet is able to localise 3.6% of the images at street-level accuracy and 10.1% at city-level accuracy," Weyand's team wrote in their academic paper.
The results weren't perfect but PlaNet still outperformed some of the most well-travelled humans on a Google Street View test.
PlaNet's median error in guessing where a photo was taken was 1,131.7km, while the 10 well-travelled humans it was tested against had a median error of 2,320.75km.
"In total, PlaNet won 28 of the 50 rounds with a median localisation error of 1131.7 km, while the median human localisation error was 2320.75 km," Weyand's team wrote. "[This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes."
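A median localisation error of this kind can be computed by measuring the distance between each guess and the true location, then taking the middle value. The sketch below assumes great-circle (haversine) distance on a spherical Earth with a radius of 6,371km; the article does not say which distance formula the team used, and `haversine_km` and `median_error_km` are hypothetical names.

```python
import math
from statistics import median
from typing import List, Tuple

Point = Tuple[float, float]  # (latitude, longitude) in degrees

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an assumption of this sketch


def haversine_km(a: Point, b: Point) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point rounding.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def median_error_km(guesses: List[Point], truths: List[Point]) -> float:
    """Median distance between each guess and the matching true location."""
    errors = [haversine_km(g, t) for g, t in zip(guesses, truths)]
    return median(errors)


# Three made-up rounds: one exact guess and two misses along the equator.
guesses = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]
truths = [(0.0, 0.0), (0.0, 90.0), (0.0, 180.0)]
result = median_error_km(guesses, truths)
```

The median is less sensitive than the mean to a handful of wildly wrong guesses, which is one reason it suits a task where a single miss can be off by thousands of kilometres.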