Forget DeepSeek. Large language models are getting cheaper still

A US$6 million LLM isn’t cool. A US$6 one is.

FILE PHOTO: The DeepSeek logo, a keyboard, and robot hands are seen in this illustration taken January 27, 2025. REUTERS/Dado Ruvic/Illustration/File Photo

The Economist

As recently as 2022, just building a large language model (LLM) was a feat at the cutting edge of artificial intelligence (AI) engineering. Three years on, experts are harder to impress. To really stand out in the crowded marketplace, an AI lab needs not just to build a high-quality model, but to build it cheaply.

In December, a Chinese firm, DeepSeek, earned itself headlines for cutting the dollar cost of training a frontier model down from US$61.6 million (S$83 million) – the cost of Llama 3.1, an LLM produced by Meta, a technology company – to just US$6 million. In a preprint posted online in February, researchers at Stanford University and the University of Washington claim to have done several orders of magnitude better, training their s1 LLM for just US$6. Put another way, DeepSeek's model took 2.7 million hours of computer time to train; s1 took just under seven hours.
