Google engineers have built a photo recognition system that can outperform the most well-travelled humans
Humans typically struggle to determine where generic photos were taken just by looking at them. If shown a picture of a white sandy beach, for example, they might assume it was taken in the Caribbean when in fact it was taken in the Maldives.
While many humans need a landmark to refer to - such as the Statue of Liberty or Machu Picchu - before they can pinpoint a location, Google's PlaNet system, which is still in its early stages, does not have this problem.
Tobias Weyand and James Philbin, a pair of software engineers at Google, teamed up with developer Ilya Kostrikov to build the PlaNet system. "We think PlaNet has an advantage over humans because it has seen many more places than any human can ever visit and has learnt subtle cues of different scenes that are even hard for a well-travelled human to distinguish," Weyand told MIT Technology Review.
Weyand's team divided the world into a grid made up of 26,000 squares of varying size, depending on the number of images taken in that location. Each square represented a specific geographical area.
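To make the idea of squares of varying size concrete, here is a minimal sketch of one way to build such a grid: start with a single cell covering the globe and keep quartering any cell that holds too many photos. The latitude/longitude quadtree, the `Cell` class, `build_grid` and the `max_photos_per_cell` limit are all assumptions made for illustration; the article does not say how PlaNet actually partitions the globe.

```python
from dataclasses import dataclass, field


@dataclass
class Cell:
    """A rectangular latitude/longitude cell (hypothetical helper type)."""
    south: float
    west: float
    north: float
    east: float
    photos: list = field(default_factory=list)

    def contains(self, lat, lon):
        # Half-open bounds so that every point belongs to exactly one child.
        return self.south <= lat < self.north and self.west <= lon < self.east


def split(cell):
    """Quarter a cell and hand each photo to the child that contains it."""
    mid_lat = (cell.south + cell.north) / 2
    mid_lon = (cell.west + cell.east) / 2
    children = [
        Cell(cell.south, cell.west, mid_lat, mid_lon),
        Cell(cell.south, mid_lon, mid_lat, cell.east),
        Cell(mid_lat, cell.west, cell.north, mid_lon),
        Cell(mid_lat, mid_lon, cell.north, cell.east),
    ]
    for lat, lon in cell.photos:
        for child in children:
            if child.contains(lat, lon):
                child.photos.append((lat, lon))
                break
    return children


def build_grid(photos, max_photos_per_cell, max_depth=12):
    """Return the non-empty leaf cells of an adaptive grid.

    Assumes -90 <= lat < 90 and -180 <= lon < 180. max_depth stops the
    subdivision even if many photos share the same coordinates.
    """
    root = Cell(-90.0, -180.0, 90.0, 180.0, list(photos))
    leaves = []
    stack = [(root, 0)]
    while stack:
        cell, depth = stack.pop()
        if len(cell.photos) > max_photos_per_cell and depth < max_depth:
            stack.extend((child, depth + 1) for child in split(cell))
        elif cell.photos:
            # Dropping empty cells is a choice made for this sketch.
            leaves.append(cell)
    return leaves


# Illustrative coordinates only: five photos in one dense region, one far away.
photos = [(48.0, 2.0), (49.0, 3.0), (50.0, 4.0),
          (51.0, 5.0), (52.0, 6.0), (-33.87, 151.21)]
cells = build_grid(photos, max_photos_per_cell=2)
```

With this scheme, heavily photographed regions end up covered by many small cells while sparsely photographed regions keep a few large ones, which matches the behaviour the article describes.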
The team then created a database of geolocated images from the internet to determine the grid square in which each image was taken. Overall, 126 million images were used.
Weyand and his team took 91 million of these images to teach a powerful neural network - a computer system modelled on the human brain - to work out the grid location using only the image itself. Ultimately they want to be able to put an image into the neural net and get out a particular grid location or at least a set of likely candidates. The neural network was validated with a further 34 million photos from the data set.
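Getting "a set of likely candidates" out of a network usually means treating the problem as classification over grid squares. The sketch below assumes the network emits one raw score per square, which the article does not spell out; `softmax` and `top_candidates` are hypothetical helper names, and the scores are made up.

```python
import math


def softmax(scores):
    """Turn raw per-square scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_candidates(scores, k=3):
    """Return the k most likely grid squares as (index, probability) pairs."""
    probs = softmax(scores)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(i, probs[i]) for i in ranked[:k]]


# Made-up scores for a toy grid of four squares.
candidates = top_candidates([2.0, 0.5, 3.0, -1.0], k=2)
```

The first entry is the single best guess; the rest form the shortlist of alternatives.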
In order to test PlaNet, the Google team took 2.3 million geotagged images from online photo library Flickr and asked PlaNet to identify their location.
"PlaNet is able to localise 3.6% of the images at street-level accuracy and 10.1% at city-level accuracy," Weyand's team wrote in their academic paper.
The results weren't perfect but PlaNet still outperformed some of the most well-travelled humans on a Google Street View test.
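Figures such as street-level and city-level accuracy are fractions of test photos whose predicted location falls within some distance of the true one. A minimal sketch of that calculation follows, using the haversine great-circle distance. The function names are hypothetical, and the threshold is passed in as a parameter because the article does not define what counts as street level or city level.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fraction_within(predicted, actual, threshold_km):
    """Share of photos whose predicted location is within threshold_km."""
    hits = sum(
        1
        for (plat, plon), (alat, alon) in zip(predicted, actual)
        if haversine_km(plat, plon, alat, alon) <= threshold_km
    )
    return hits / len(actual)


# Illustrative data: one exact guess, one guess on the wrong continent.
actual = [(51.5074, -0.1278), (48.8566, 2.3522)]
predicted = [(51.5074, -0.1278), (40.7128, -74.0060)]
share = fraction_within(predicted, actual, threshold_km=25.0)  # 25 km is an assumed value
```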
PlaNet's median localisation error was 1,131.7km, while the median error for the 10 well-travelled humans it played against was 2,320.75km.
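A head-to-head comparison of this kind reduces to a list of per-round errors for each side. The sketch below counts the rounds in which the model's error is lower and computes both medians. It assumes a round is won by whoever has the smaller error; `compare_rounds` is a hypothetical name, and the numbers are invented for illustration, not taken from the experiment.

```python
from statistics import median


def compare_rounds(model_errors_km, human_errors_km):
    """Summarise a head-to-head test from per-round errors in kilometres."""
    if len(model_errors_km) != len(human_errors_km):
        raise ValueError("both sides need one error per round")
    model_wins = sum(
        1 for m, h in zip(model_errors_km, human_errors_km) if m < h
    )
    return {
        "rounds": len(model_errors_km),
        "model_wins": model_wins,
        "model_median_km": median(model_errors_km),
        "human_median_km": median(human_errors_km),
    }


# Invented errors for four rounds.
summary = compare_rounds([100, 2000, 50, 900], [300, 1500, 800, 1000])
```

The median is less sensitive than the mean to a handful of wildly wrong guesses, which makes it a sensible summary when a single error can span half the globe.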
"In total, PlaNet won 28 of the 50 rounds with a median localisation error of 1131.7 km, while the median human localisation error was 2320.75 km," Weyand's team wrote. "[This] small-scale experiment shows that PlaNet reaches superhuman performance at the task of geolocating Street View scenes."