SAN FRANCISCO (WASHINGTON POST) - Facebook said on Wednesday (April 4) that most of its two billion users likely have had their public profiles scraped by outsiders without the users’ explicit permission, dramatically raising the stakes in a privacy controversy that has dogged the company for weeks, spurred investigations in the United States and Europe, and sent the company’s stock price tumbling.
Data scraping is automated data extraction or harvesting. Modern tools allow any data publicly available on the Internet to be collected and presented in organised formats such as spreadsheets for analysis.
For instance, data scraping tools can mine for people's names, phone numbers and ages on web pages. The tools can also harvest data on price changes and public sentiments on a certain topic for analysis later.
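As a rough illustration of that process, the sketch below pulls names, phone numbers and ages out of a small piece of web-page markup and writes them out as spreadsheet-style CSV rows. It is only a minimal example under stated assumptions: the page content, the class names (`person`, `name`, `phone`, `age`) and the helper `scrape_to_csv` are all invented for illustration, and nothing here reflects how any particular scraping tool or Facebook feature actually works.

```python
import csv
import io
from html.parser import HTMLParser

# Hypothetical page content; the names and numbers are invented for illustration.
SAMPLE_HTML = """
<ul>
  <li class="person"><span class="name">Alex Example</span>
      <span class="phone">555-0100</span> <span class="age">34</span></li>
  <li class="person"><span class="name">Sam Sample</span>
      <span class="phone">555-0101</span> <span class="age">29</span></li>
</ul>
"""

# Assumed field names, matching the class attributes in the sample markup.
FIELDS = ("name", "phone", "age")


class PersonParser(HTMLParser):
    """Collects one dict per <li class="person"> element."""

    def __init__(self):
        super().__init__()
        self.rows = []        # finished records
        self._current = None  # record being filled in, if any
        self._field = None    # field whose text we are currently inside

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class")
        if tag == "li" and cls == "person":
            self._current = {}
        elif tag == "span" and cls in FIELDS and self._current is not None:
            self._field = cls

    def handle_data(self, data):
        # Only keep text that appears inside one of the labelled spans.
        if self._field is not None:
            self._current[self._field] = data.strip()

    def handle_endtag(self, tag):
        if tag == "span":
            self._field = None
        elif tag == "li" and self._current is not None:
            self.rows.append(self._current)
            self._current = None


def scrape_to_csv(html):
    """Parse the markup and return the extracted records as CSV text."""
    parser = PersonParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(parser.rows)
    return out.getvalue()


print(scrape_to_csv(SAMPLE_HTML))
```

Run over many pages rather than one inline string, the same pattern of extracting labelled fields and tabulating them is what lets publicly visible details be compiled at scale.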
Facebook's acknowledgement was part of a broader disclosure by the social media giant on Wednesday about the ways in which various levels of user data have been taken by everyone from malicious actors to ordinary app developers.
“We’re an idealistic and optimistic company, and for the first decade, we were really focused on all the good that connecting people brings,” chief executive Mark Zuckerberg said on a call with reporters on Wednesday afternoon.
“But it’s clear now that we didn’t focus enough on preventing abuse and thinking about how people could use these tools for harm as well.”
As part of the disclosure, Facebook for the first time detailed the scale of the improper data collection for Cambridge Analytica, a political data consultancy hired by President Donald Trump and other Republican candidates in the last two federal election cycles.
The political consultancy gained access to Facebook information of up to 87 million users, 71 million of whom are Americans, Facebook said.
Cambridge Analytica obtained the data to build “psychographic” profiles that would help deliver targeted messages intended to shape voter behaviour in a wide range of US elections.
But in research sparked by revelations from a Cambridge Analytica whistleblower last month, Facebook determined that the problem of third-party collection of user data was far larger still and, with the company’s massive user base, likely affected a large cross-section of people in the developed world.
“Given the scale and sophistication of the activity we’ve seen, we believe most people on Facebook could have had their public profile scraped,” the company wrote in its blog post.
The scraping by malicious actors typically involved gathering public profile information – including names, e-mail addresses and phone numbers, according to Facebook – by using a “search and account recovery” function that Facebook said it has now disabled.
The data obtained by Cambridge Analytica was more detailed and extensive, including the names, home towns, work and educational histories, religious affiliations and Facebook “likes” of users, among other data. Other users affected were in countries including the Philippines, Indonesia, Britain, Canada and Mexico.
Facebook had initially sought to play down the problem, saying in March only that 270,000 people had responded to a survey on an app created by a researcher in 2014.
That netted Cambridge Analytica the data on the friends of those who responded to the survey, without their permission. But Facebook declined to say at the time how many other users may have had their data collected in the process.
The whistleblower, Christopher Wylie, a former researcher for Cambridge Analytica, said the real number of affected people was at least 50 million.
Cambridge Analytica on Wednesday responded to Facebook’s announcement by saying that it had licensed data on 30 million users. Facebook banned Cambridge Analytica from its platform last month for obtaining the data under false pretences.
Facebook’s announcement, made near the bottom of a blog post on Wednesday afternoon on plans to restrict access to data in the future, underscores the severity of a data mishap that appears to have affected about one out of every four Americans and sparked widespread outrage at the carelessness of the company’s handling of information on its users. Personal data on users and their Facebook friends was easily and widely available to developers of apps before 2015.
With its moves over the past week, Facebook is embarking on a major shift in its relationship with third-party app developers that have used Facebook’s vast network to expand their businesses. What was largely an automated process will now involve developers agreeing to “strict requirements”, the company said in its blog post on Wednesday.
The 2015 policy change curtailed developers’ ability to access data about people’s friend networks but left open many loopholes that the company tightened on Wednesday.
The news quickly reverberated on Capitol Hill, where lawmakers are set to grill Zuckerberg at a series of hearings next week.
“The more we learn, the clearer it is that this was an avalanche of privacy violations that strike at the core of one of our most precious American values – the right to privacy,” said Democratic Senator Ed Markey, who serves on the Senate Commerce Committee, which has called on Zuckerberg to testify at a hearing next week.
“This latest revelation is extremely troubling and shows that Facebook still has a lot of work to do to determine how big this breach actually is,” said Democratic Representative Frank Pallone, the top Democrat on the House Energy and Commerce Committee, which will hear from Zuckerberg next Wednesday.
“I’m deeply concerned that Facebook only addresses concerns on its platform when it becomes a public crisis, and that is simply not the way you run a company that is used by over two billion people,” he said.
“We need to know how they are going to fix this problem next week at our hearing.”
Facebook announced plans on Wednesday to add new restrictions to how outsiders can gain access to this data, the latest steps in a years-long process by the company to improve its damaged reputation as a steward of the personal privacy of its users.
Developers who in the past could get access to people’s relationship status, calendar events, private Facebook posts, and much more data, will now be cut off from access or be required to endure a much stricter process for obtaining the information.
Cambridge Analytica, which collected this information with the help of Cambridge University psychologist Aleksandr Kogan, was founded with a multimillion-dollar investment by hedge-fund billionaire Robert Mercer and headed by his daughter, Rebekah Mercer, who was the company’s president, according to documents provided by Wylie.
Serving as vice-president was conservative strategist Steve Bannon, who also was the head of Breitbart News. He has since left both jobs and also his post as top White House adviser to Trump.
Until Wednesday, apps that let people input a Facebook event into their calendar could also automatically import lists of all the people who attended that event, Facebook said.
Administrators of private groups, some of which have tens of thousands of members, could also let apps scrape the Facebook posts and profiles of members of that group.
App developers who want this access will now have to prove their activities benefit the group.
Facebook will now need to approve tools that businesses use to operate Facebook pages. A business that uses an app to help it respond quickly to customer messages, for example, will not be able to do so automatically.
Developers’ access to Instagram will also be severely restricted.
Facebook is banning apps from accessing users’ information about their religious or political views, relationship status, education, work history, fitness activity, book reading habits, music listening and news reading activity, video watching and games. Data brokers and businesses collect this type of information to build profiles of their customers’ tastes.
Facebook last week said it is also shutting down access to data brokers who use their own data to target customers on Facebook.
Facebook’s broad changes to how data is used apply mostly to outsiders and third parties.
Facebook is not limiting the data the company itself can collect, nor is it restricting its ability to profile users to enable advertisers to target them with personalised messages.
One piece of data Facebook said it would stop collecting was the time of phone calls, a response to outrage from users of Facebook’s messenger service who discovered that allowing Facebook to access their phone contact list was giving the company access to their call logs.