Artificial intelligence (AI) has always depended on access to data.
Today’s large language models (LLMs) are trained on, essentially, the entire Internet. Much of that is public domain material outside the realm of copyright. It also includes pirated works that should not be there and material that was shared to be read but not copied.