An intern at the Trump campaign data firm, Cambridge Analytica, left sensitive voter targeting tools online for nearly a year

AP Photo/Alex Brandon

President Donald Trump boards Air Force One after saluting the Air Force members, as he departs Wednesday, Oct. 11, 2017, at Andrews Air Force Base, Md.

An intern at the data mining and analysis firm Cambridge Analytica left what appear to be programming instructions for the company's voter targeting tools online for nearly a year, raising questions about who could have accessed the tools, which the company used around the time of the election, and to what end.

Social media analyst and data scientist Jonathan Albright discovered the election data processing scripts - or programming instructions - on what he said was the intern's personal GitHub account. GitHub, a "Facebook for programmers," is an internet hosting service mostly used for code.

The account was scrubbed after Albright published his findings on Medium, but the scripts had already been archived.

Cambridge Analytica, which mined and analyzed voter data for the Trump campaign last year, did not immediately respond to a request for comment. A LinkedIn account that appears to belong to the intern identified by Albright lists him as a "Data Science Intern" for Cambridge Analytica between March and June of 2016.

The tools the intern appears to have extracted facilitated geolocation targeting, to be used in enriching voter files with GPS coordinates, and Twitter sentiment analysis - essentially, the process of determining someone's position on an issue by analyzing tweets and pulling data from users discussing certain topics.

The Twitter tool was used to find and group people who talked about, or responded to, specific keywords in tweets and retweets.
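
To make that description concrete, here is a minimal Python sketch of keyword-based grouping. It is not Cambridge Analytica's code: the sample tweets, the `group_users_by_keyword` function, and the keyword list are all hypothetical, and the sketch works on a hard-coded list rather than live data from Twitter.

```python
import re
from collections import defaultdict

# Hypothetical sample data standing in for tweets pulled from Twitter.
SAMPLE_TWEETS = [
    {"user": "alice", "text": "RT @news: New debate on guns tonight"},
    {"user": "bob", "text": "Thoughts on citizenship rules?"},
    {"user": "carol", "text": "RT @bob: Guns and citizenship are on the ballot"},
    {"user": "dave", "text": "Nice weather today"},
]


def group_users_by_keyword(tweets, keywords):
    """Map each keyword to the set of users whose tweets mention it."""
    groups = defaultdict(set)
    for tweet in tweets:
        # Lowercase and split into words so matching is case-insensitive
        # and only whole words count.
        words = set(re.findall(r"[a-z']+", tweet["text"].lower()))
        for keyword in keywords:
            if keyword.lower() in words:
                groups[keyword].add(tweet["user"])
    return dict(groups)


groups = group_users_by_keyword(SAMPLE_TWEETS, ["guns", "citizenship"])
```

In this toy data, a user who mentions both topics lands in both groups, and a user who mentions neither is left out entirely.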

Albright, who heads Columbia's Tow Center for Digital Journalism and recently published extensive research on Russia's use of Facebook during the election, said Cambridge Analytica's real-time social media mining tool was not necessarily complex or novel in and of itself.

What is more interesting, he said, is how the tool appeared to retrieve people's recent tweets and favorites to "expand" Cambridge Analytica's body of keywords "around specific objects of election 'outrage' sentiment" - like abortion, citizenship, naturalization, guns, and Planned Parenthood.
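
One simple way such keyword "expansion" could work is to count which other words show up alongside the seed keywords and promote the most frequent ones. The Python sketch below illustrates that idea only. The co-occurrence approach, the `expand_keywords` function, the stopword list, and the sample tweets are assumptions made for illustration, not a reconstruction of the archived scripts.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "and", "are", "at", "i", "in", "is", "of", "on", "the", "to"}

# Hypothetical tweet texts.
SAMPLE_TEXTS = [
    "Guns and the border are the issue",
    "Guns at the border again",
    "Border security and guns and immigration",
    "I like pizza",
    "Immigration and guns",
]


def expand_keywords(texts, seed_keywords, top_n=2):
    """Return the top_n words that most often co-occur with any seed keyword."""
    seeds = {keyword.lower() for keyword in seed_keywords}
    counts = Counter()
    for text in texts:
        words = re.findall(r"[a-z]+", text.lower())
        # Only tweets that mention a seed keyword contribute candidate words.
        if seeds & set(words):
            counts.update(w for w in words if w not in seeds and w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


new_keywords = expand_keywords(SAMPLE_TEXTS, ["guns"])
```

Each round of expansion widens the net: the new keywords can be fed back in as seeds to find more users and more related terms.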

Recent reporting has revealed that Russia harnessed and harvested "outrage" sentiment in an attempt to galvanize and sway voters during the campaign. Accounts linked to Russia bought $100,000 worth of Facebook ads between 2015 and 2016, many of which promoted outsider candidates and exploited racial tensions. Similar methods were deployed on Twitter, Google, Instagram, Pinterest - and even Pokemon Go, as CNN reported earlier this week.

Additionally, the intern appeared to have left Cambridge Analytica's Twitter API key online when he uploaded the scripts. Albright said the intern left what amounts to the account username and password that companies and developers use to search and pull tweets and user profile information from Twitter. The keys were removed in February, Albright said.
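
Keys typically end up in public repositories when they are typed directly into a script that is later uploaded. A common way to avoid that is to have the script read credentials from its environment at run time, so the uploaded file contains no secrets. The Python sketch below shows the pattern. The variable names `TWITTER_API_KEY` and `TWITTER_API_SECRET` and the `load_twitter_credentials` function are hypothetical and are not taken from the scripts Albright found.

```python
import os


def load_twitter_credentials(env=None):
    """Read API credentials from environment variables instead of source code.

    `env` defaults to the process environment; a plain dict can be passed
    for testing. The variable names are illustrative assumptions.
    """
    env = os.environ if env is None else env
    names = ("TWITTER_API_KEY", "TWITTER_API_SECRET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail loudly rather than falling back to a hard-coded value.
        raise RuntimeError("Missing credentials: " + ", ".join(missing))
    return env["TWITTER_API_KEY"], env["TWITTER_API_SECRET"]


# Example with a stand-in environment; real values would be set outside the script.
key, secret = load_twitter_credentials(
    {"TWITTER_API_KEY": "example-key", "TWITTER_API_SECRET": "example-secret"}
)
```

With this pattern, anyone who copies the script gets the logic but not the access.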

Albright said the code for the tools was "sitting right on Github for almost a year: from March 2016 to February 2017 - the last 8 months of the US election."

"That's a security issue, in my opinion," Albright added. "Could Russia find this and use it? Absolutely."

Only Twitter would be able to definitively reveal whether the accidentally copied-and-pasted API key belonged to Cambridge Analytica, according to Albright.

But because of the social media company's terms of service and privacy protections for developers, the information could likely only be obtained via a subpoena. The House Intelligence Committee is scrutinizing Cambridge Analytica as part of its investigation into whether any collusion occurred between the campaign and Russia, The Daily Beast reported last week.

Still, Albright said, "showing the actual code in two of their scripts is one of the few pieces of evidence that can break through the noise and puffery around Cambridge Analytica. While code is not a person, it's the ultimate journalistic source for a CA-related election story."

Albright argued in a post on Medium that the question of Cambridge Analytica's ownership - "a foreign business previously registered in the United States as a foreign corporation" - is now more relevant than ever.

"Foreign influence - sound familiar?" he wrote.

Twitter last week gave the Senate Intelligence Committee the profile names, or "handles," of 201 accounts it believes were operating out of Russia during the election. But Politico reported Friday that much of the data that could be useful in examining the extent of Russia's Twitter operation was deleted by the company.