Google Leak Reveals Thousands of Privacy Incidents

An internal Google database obtained by 404 Media shows Google recording children's voices, saving license plates from Street View, and many other self-reported incidents, large and small.
Image: Unsplash/Mitchell Luo.

Google has accidentally collected children’s voice data, leaked the trips and home addresses of carpool users, and made YouTube recommendations based on users’ deleted watch history, among thousands of other employee-reported privacy incidents, according to a copy of an internal Google database obtained by 404 Media. The database tracks six years’ worth of potential privacy and security issues.

Individually, the incidents, most of which have not previously been publicly reported, may each affect only a relatively small number of people, or were fixed quickly. Taken as a whole, though, the internal database shows how one of the most powerful and important companies in the world manages, and often mismanages, a staggering amount of personal, sensitive data on people's lives.

The data obtained by 404 Media includes privacy and security issues that Google’s own employees reported internally. These include issues with Google’s own products or data collection practices; vulnerabilities in third-party vendors that Google uses; or mistakes made by Google staff, contractors, or other people that have impacted Google systems or data. The incidents range from a single errant email containing some PII, through substantial leaks of data, right up to impending raids on Google offices. When reporting an incident, employees give it a priority rating, with P0 being the highest and P1 a step below that. The database contains thousands of reports over the course of six years, from 2013 to 2018.

In one 2016 case, a Google employee reported that Google Street View’s systems were transcribing and storing license plate numbers from photos. They explained that Google uses an algorithm to detect text in Street View imagery. “Unfortunately, the contents of license plates are also text and, apparently, have been transcribed in many cases,” the employee wrote. “As a result, our database of objects detected from Street View now inadvertently contains a database of geolocated license plate numbers and license plate number fragments.”

Do you work at Google? Do you know about any other privacy or security incidents? I would love to hear from you. Using a non-work device, you can message me securely on Signal at +44 20 8133 5190.

“I want to emphasize that this was an accident. The system that transcribes these pieces of text should have been avoiding imagery identified by our license plate detectors but, for reasons as-yet unknown, was not,” they added. The report says that the data has been purged.

Another incident involved the public exposure of more than one million users’ email addresses from a company that Google acquired. The data was viewable in the page source of the company’s website, the report says. Geolocation information and IP addresses of users were also suspected to be available. Those impacted included children. “This exposure has been addressed as part of the closing conditions for this acquisition. However, the data was exposed for > 1yr and could already have been harvested,” the report read.

In a third, a Google speech service logged all audio, including an estimated 1,000 speech utterances from children, for around an hour. “Estimated 1K child speech utterances was collected. Team deleted all logged speech data from the affected time period,” the report read.

In another incident, a customer of Google’s cloud product for government clients that need to protect sensitive data was inadvertently transitioned to a consumer-level product. “As a result of an accidental SKU migration to G Suite for Business, US data location is no longer guaranteed for this customer,” the report says.

In some cases, the reports themselves say that the issue has been fixed. After 404 Media shared the identifying codes of around 30 incidents with Google, the company said that each of them was resolved at the time. 

Other incidents in the database that are marked high priority or are otherwise notable include:

  • A filter that was supposed to stop children’s voices from being collected was not correctly applied.
  • A person modified customer accounts on AdWords, as Google’s ad platform was named at the time, to manipulate affiliate tracking codes on ads.
  • The global security team warned that it was expecting a dawn raid of a Google office in Jakarta in April 2017 (a similar incident did happen in September 2016).
  • Waze’s carpool feature leaked the trips and home addresses of other users.
  • A Google employee accessed private videos in Nintendo’s YouTube account, and leaked information ahead of Nintendo’s planned announcements. An internal interview concluded the activity was “non-intentional,” the report says.
  • Sabre, a travel agent that Google uses, was compromised and Google employee payment information was exposed.
  • A quirk in Android’s keyboard meant that children were pressing the microphone button, resulting in Google logging audio from children as part of the launch of the YouTube Kids app.
  • YouTube made recommendations based on videos users had deleted from their watch history, which was against YouTube’s own policy.
  • A YouTube blurring feature was exposing uncensored versions of pictures.
  • When iOS users of Google Drive or Docs set access controls on a file as “Anyone with the link,” Google actually treated it as a “Public” link.
  • YouTube videos uploaded as unlisted or private could appear publicly available for a short time.

Google told 404 Media in a statement: “At Google employees can quickly flag potential product issues for review by the relevant teams. When an employee submits the flag they suggest the priority level to the reviewer. The reports obtained by 404 are from over six years ago and are examples of these flags—every one was reviewed and resolved at that time. In some cases, these employee flags turned out not to be issues at all or were issues that employees found in third party services.”

404 Media obtained the large dataset from an anonymous tipster who did not provide their real name or identity. 404 Media then verified the dataset; Google also confirmed aspects of its contents.