People hired by Google have been listening to snippets of conversations users have with their voice-activated Google devices such as Google Home, but the technology giant has clarified that these snippets are not associated with user accounts.
Google said such recorded conversations are necessary to improve its software's understanding of various languages, and are non-identifiable.
A journalist with Belgian public broadcaster VRT, Mr Tim Verheyden, reported on Wednesday that he had gained access to more than 1,000 snippets from a contractor paid to review audio captured by Google devices.
Google said its contractors review a small percentage of the recordings made by customers through Google Assistant devices such as Google Home and phones using Android software.
Its privacy pages for Google Home state that it "collects data that is meant to make our services faster, smarter, more relevant and more useful to you", adding that it learns over time to provide better and more personalised suggestions and answers.
In response to queries from The Straits Times, the technology giant said: "We partner with language experts around the world to improve speech technology by transcribing a small set of queries - this work is critical to developing technology that powers products like the Google Assistant.
"Language experts only review around 0.2 per cent of all audio snippets, and these snippets are not associated with user accounts as part of the review process."
Google added that it will be taking action against reviewers who leaked the data, and that it is conducting a full review to prevent such leaks from happening again.
According to VRT, most of the snippets it had obtained were cases where Google Assistant users had asked for information or to perform tasks.
However, 153 snippets should never have been recorded at all: they were apparently captured after the device was activated incorrectly, having misheard its wake word, or activation phrase.
Google product manager for search David Monsees acknowledged this problem in a blog post on Thursday, calling it a "false accept".
He said: "This means that there was some noise or words in the background that our software interpreted to be the hot word (like 'okay, Google'). We have a number of protections in place to prevent false accepts from occurring in your home."
The support pages for Google's Home devices deny that these devices are listening in on users.
They say that no information leaves the device until its wake word is detected, but do not mention that the system can mistakenly detect it.
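The gating described in Google's support pages can be illustrated with a short sketch. Everything here - the function names, the confidence score, the threshold - is invented for illustration and is not Google's actual implementation; it simply shows how a confidence-based detector can let background noise through as a "false accept".

```python
# Hypothetical sketch of on-device wake-word gating.
# All names and thresholds are invented; this is not Google's code.

WAKE_WORD_THRESHOLD = 0.8  # assumed confidence cut-off


def wake_word_confidence(audio_chunk: bytes) -> float:
    """Stand-in for an on-device detector that scores how closely an
    audio chunk matches the wake word (0.0 to 1.0). A real detector
    would run a small neural network locally; here we fake a score
    from the chunk's content purely for demonstration."""
    return 0.95 if b"okay google" in audio_chunk.lower() else 0.3


def should_stream_to_server(audio_chunk: bytes) -> bool:
    """No audio leaves the device unless the detector's confidence
    clears the threshold. A 'false accept' occurs when background
    noise happens to score above the threshold even though no wake
    word was actually spoken."""
    return wake_word_confidence(audio_chunk) >= WAKE_WORD_THRESHOLD


print(should_stream_to_server(b"Okay Google, what's the weather?"))  # True
print(should_stream_to_server(b"random background chatter"))         # False
```

In this toy model, lowering the threshold makes the assistant more responsive but raises the chance of a false accept - the trade-off Mr Monsees' "protections" are meant to manage.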
Mr Yeo Siang Tiong, general manager in South-east Asia for cyber-security firm Kaspersky, said that voice assistants generally require little power and can run silently in the background.
Mr Joseph Gan, president and co-founder of security solutions firm V-Key, said that voice assistants often mistake other sounds or conversations for wake words, leading to private conversations being recorded.
"Even audio recorded as part of normal voice assistant usage may allow individuals to be identified and reveal sensitive personal or professional information," he said.
It is difficult, however, to avoid having humans review audio data, as artificial intelligence systems need to be trained on it.
Mr Gan said: "Tech companies need to put in place the same kinds of consumer data protections around the audio data collected as they do around other personally identifiable information."