Is ChatGPT just a copycat?

As ChatGPT turns one year old, there are growing questions about the way it draws upon creative works to compete with the authors of those very same works.

A photo taken on October 4, 2023 in Manta, near Turin, shows a smartphone and a laptop displaying the logos of the artificial intelligence research laboratory OpenAI and its ChatGPT chatbot. (Photo by MARCO BERTORELLO / AFP)

A key question is whether using data to train models, which then produce works that may compete with the creators of those data, constitutes fair use.

Simon Chesterman

Artificial intelligence (AI) has always depended on access to data.

Today’s large language models (LLMs) are trained on, essentially, the entire Internet. Much of that is public domain material outside the realm of copyright. It also includes pirated works that should not be there and material that was shared to be read but not copied.