Researchers can work out your location based on who you talk to on Twitter
A paper from Ryan Compton, David Jurgens and David Allen explains a new method for tracking the location of Twitter users to within around 6km based on who they interact with. Using the method, the researchers say, they're able to "geotag over 80% of public tweets."
Using the small subset of Twitter users who do provide GPS data or unambiguous location information, the algorithm is then able to "assign a location to a user based on the location of their friends," building up a geotagged map of users across the social network.
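The idea of locating a user from their friends can be illustrated with a short Python sketch. This is a simplified illustration, not the researchers' actual algorithm: it assumes a hypothetical `mentions` mapping of who interacts with whom and a `known` mapping of users with GPS coordinates, and it places each unlocated user at the most central of their located friends. All function and variable names here are made up for the example.

```python
import math

def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))

def medoid(points):
    """Return the point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(haversine_km(p, q) for q in points))

def propagate_locations(mentions, known, rounds=3):
    """Spread locations outward from users with known coordinates.

    mentions: dict of user -> set of users they interact with (assumed input).
    known:    dict of user -> (lat, lon) for users who share GPS data.
    """
    located = dict(known)
    for _ in range(rounds):
        updates = {}
        for user, friends in mentions.items():
            if user in located:
                continue
            friend_points = [located[f] for f in friends if f in located]
            if friend_points:
                # A single far-away friend does not drag the estimate away
                # from the cluster where most friends are.
                updates[user] = medoid(friend_points)
        if not updates:
            break
        located.update(updates)
    return located

# Two friends in London, one in New York: "dana" is placed in London,
# and "erin", who only talks to "dana", inherits that estimate a round later.
known = {
    "alice": (51.5074, -0.1278),
    "bob": (51.5155, -0.0922),
    "carol": (40.7128, -74.0060),
}
mentions = {"dana": {"alice", "bob", "carol"}, "erin": {"dana"}}
result = propagate_locations(mentions, known)
```

Picking the most central friend rather than averaging coordinates keeps the estimate on a real place: averaging London and New York would put the user in the Atlantic.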
The method isn't perfect. It's based on the principle that the "vast majority of Twitter users @mention with geographically close users," so it cannot generate accurate results for people with more global networks of connections. But the researchers say their method was "accurate to city-resolution" for 89.7% of test users.
The researchers point to an array of sociological and scientific benefits of being able to track Twitter users who don't provide location data. These range from "understanding regional flu trends, linguistic patterns, election forecasting, [and] social unrest," to helping plan "disaster response."
There are also obvious commercial benefits to the research. It's a boon to advertisers: Local businesses will be able to target ads far more effectively at people identified as in the area.
But there's a more sinister side to the findings too. They show how it's possible to accurately track a user's home location based only on who they interact with, even if they've expressly opted out of having their location tracked.
This is a testament to the power of "metadata," the additional information associated with communications beyond the content itself. (For example, an email's metadata would include the sender and addressee, and the time it was sent.) While previous studies on Twitter user location have tried to analyse the language used for clues to location, the researchers ignored the content of the tweets altogether.
"Language-based geotagging models often rely on sophisticated language-specific natural language processing," they write, "and are thus difficult to extend worldwide."
The researchers believe they have produced "the largest and most accurate dataset of Twitter user locations" ever, and they've done so by relying on the interactions between users alone to build networks. But combining their metadata approach with linguistic analysis could open the door to ever more invasive location tracking.