Jack Ma-backed Ant Group touts AI breakthrough built on Chinese chips



Ant is still using Nvidia chips for AI development but now relies mostly on alternatives, including chips from Advanced Micro Devices and Chinese suppliers, for its latest models.

PHOTO: REUTERS


BEIJING – Jack Ma-backed Ant Group used Chinese-made semiconductors to develop techniques for training artificial intelligence (AI) models that would cut costs by 20 per cent, according to people familiar with the matter.

Ant used domestic chips, including from affiliate Alibaba Group Holding and Huawei Technologies, to train models using the so-called “mixture of experts”, or MoE, machine learning approach, the people said.

The approach produced results similar to those from Nvidia Corp chips such as the H800, the people said, asking not to be named as the information is not public.

Ant is still using Nvidia for AI development, but is now relying mostly on alternatives, including chips from Advanced Micro Devices and Chinese suppliers, for its latest models, one of the people said.

The models mark Ant’s entry into a race between Chinese and US companies that has accelerated since Hangzhou start-up DeepSeek demonstrated how capable models can be trained for far less than the billions invested by OpenAI and Alphabet’s Google.

It underscores how Chinese companies are trying to use local alternatives to the most advanced Nvidia semiconductors. While not Nvidia's top-of-the-line chip, the H800 is a relatively powerful processor that the US currently bars from sale to China.

The company published a research paper in March claiming that its models at times outperformed those of Meta Platforms on certain benchmarks, a claim Bloomberg News has not independently verified.

But if they work as advertised, Ant's platforms could mark another step forward for Chinese AI development by slashing the cost of inferencing and supporting AI services.

As companies pour significant money into AI, MoE models have emerged as a popular option, gaining recognition for their use by Google and DeepSeek, among others.

The technique divides an AI model into groups of specialised sub-networks, or "experts", and activates only the ones relevant to each input, much like a team of specialists who each focus on a segment of a job, making the process more efficient.

Ant declined to comment in an e-mailed statement.

However, the training of MoE models typically relies on high-performing chips like the graphics processing units (GPUs) that Nvidia sells. The cost has to date been prohibitive for many small firms and limited broader adoption.
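For readers who want a concrete picture of the routing idea described above, here is a minimal, hypothetical sketch of an MoE layer in PyTorch. The names and sizes (MoELayer, dim, num_experts, top_k) are illustrative assumptions, and the code does not reflect Ant's actual implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy mixture-of-experts layer: a router sends each token to its top-k experts."""

    def __init__(self, dim: int, num_experts: int = 8, top_k: int = 2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(dim, num_experts)  # scores every expert for each token
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (tokens, dim). Only top_k of num_experts run per token,
        # which is what makes MoE cheaper than a dense model of equal size.
        scores = F.softmax(self.router(x), dim=-1)        # (tokens, num_experts)
        top_w, top_idx = scores.topk(self.top_k, dim=-1)  # best experts per token
        top_w = top_w / top_w.sum(dim=-1, keepdim=True)   # renormalise routing weights
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = top_idx[:, slot] == e              # tokens routed to expert e
                if mask.any():
                    out[mask] += top_w[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out

layer = MoELayer(dim=64)
print(layer(torch.randn(10, 64)).shape)  # torch.Size([10, 64])
```

The sparsity is the point: each token touches only two of the eight expert networks here, so compute per token stays roughly constant even as experts, and total parameters, are added.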

Ant has been working on ways to train large language models (LLMs) more efficiently and eliminate that constraint. Its paper title makes the goal explicit: to scale a model "without premium GPUs".

That runs counter to Nvidia's strategy. Chief executive officer Jensen Huang has argued that computation demand will grow even with the advent of more efficient models like DeepSeek's R1, positing that companies will need better chips to generate more revenue, not cheaper ones to cut costs.

He has stuck to a strategy of building big GPUs with more processing cores, transistors and increased memory capacity.

“Ant Group’s paper highlights the rising innovation and accelerating pace of technological progress in China’s AI sector. The firm’s claim, if confirmed, highlights China is well on the way to becoming self-sufficient in AI as the country turns to lower-cost, computationally efficient models to work around the export controls on Nvidia chips,” said senior Bloomberg Intelligence analyst Robert Lea.

Ant said it cost about 6.35 million yuan (S$1.17 million) to train one trillion tokens using high-performance hardware, but its optimised approach would cut that down to 5.1 million yuan using lower-specification hardware.

Tokens are the units of information that a model ingests in order to learn about the world and deliver useful responses to user queries.
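Taken together, the two training figures line up with the roughly 20 per cent saving cited earlier, as a quick check using only the numbers above shows:

```python
# Figures from Ant's paper: cost to train one trillion tokens, in millions of yuan
baseline, optimised = 6.35, 5.10
saving = (baseline - optimised) / baseline
print(f"{saving:.1%}")  # 19.7%, in line with the ~20 per cent cut cited above
```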

The company plans to leverage the recent breakthrough in the LLMs it has developed, Ling-Plus and Ling-Lite, for industrial AI solutions including healthcare and finance, the people said.

Ant bought Chinese online platform Haodf.com in 2025 to beef up its AI services in healthcare. It also has an AI “life assistant” app called Zhixiaobao and a financial advisory AI service, Maxiaocai.

On English-language understanding, Ant said in its paper that the Ling-Lite model outperformed one of Meta's Llama models on a key benchmark. Both the Ling-Lite and Ling-Plus models outperformed DeepSeek's equivalents on Chinese-language benchmarks.

“If you find one point of attack to beat the world’s best gongfu master, you can still say you beat them, which is why real-world application is important,” said Mr Robin Yu, chief technology officer of Beijing-based AI solution provider Shengshang Tech.

Ant has made the Ling models open source. Ling-Lite contains 16.8 billion parameters, which are the adjustable settings that work like knobs and dials to direct the model’s performance.

Ling-Plus has 290 billion parameters, which is considered relatively large in the realm of language models. For comparison, experts estimate that ChatGPT’s GPT-4.5 has 1.8 trillion parameters, according to the MIT Technology Review. DeepSeek-R1 has 671 billion.
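To make the "knobs and dials" analogy concrete, parameters are simply a network's learned weights, and counting them is straightforward. The toy PyTorch model below is purely illustrative and unrelated to the Ling models:

```python
import torch.nn as nn

# A toy two-layer network; every weight and bias is one "parameter"
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 8))
n_params = sum(p.numel() for p in model.parameters())
print(n_params)  # 808 = (16*32 + 32) + (32*8 + 8)
```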

Ant faced challenges in some areas of the training, including stability. Even small changes in the hardware or the model’s structure led to problems, including jumps in the models’ error rate, it said in the paper. BLOOMBERG
