Date: 2024-04-09

A few weeks ago, the chief technology officer of OpenAI was asked if her company had used YouTube videos to train its artificial intelligence (AI) systems. First, she gave a blank stare. Then there was a grimace. Finally, Ms Mira Murati gave an answer that sidestepped the messy and furtive world in which her company and other tech firms were operating: “Actually, I’m not sure about that.”

According to a New York Times report, OpenAI had in fact trained its AI on “more than one million hours of YouTube videos”, transcribed with a speech recognition tool called Whisper. All of the conversational text from those transcripts was used to train GPT-4, the flagship large language model that underpins ChatGPT.