
Ant Group debuts Ling-1T, a trillion-parameter open-source model that scores 70.42% on AIME and matches top reasoning efficiency

DATE: 10/16/2025 · STATUS: LIVE

Ant Group’s Ling-1T pairs trillion-parameter scale with strong results on the AIME mathematics benchmark, part of a broader open-source push spanning multiple model architectures.


Ant Group has moved into the trillion-parameter AI model space with Ling-1T, an open-source language model the Chinese fintech company presents as a breakthrough that balances computational efficiency and advanced reasoning. The October 9 announcement marks a notable step for the Alipay operator as it expands artificial intelligence work across multiple model architectures.

The trillion-parameter system shows competitive results on difficult math problems, recording 70.42% accuracy on the 2025 American Invitational Mathematics Examination (AIME) benchmark. That exam is commonly used to measure an AI system’s problem-solving skill on challenging, timed questions.

Ant Group’s technical documentation says Ling-1T holds that level of performance while producing an average of more than 4,000 output tokens per problem, a figure the company cites when placing the model alongside what it calls “best-in-class AI models” for output quality.

The Ling-1T release arrived together with dInfer, a specialised inference framework designed for diffusion language models. Ant Group framed the two launches as complementary efforts, reflecting a strategy that explores multiple technological directions rather than a single dominant architecture.

Diffusion language models differ in architecture from the autoregressive systems that power popular chatbots such as ChatGPT. Autoregressive models generate text in a sequence, one token at a time. Diffusion approaches create output in parallel, a method common in image and video generators but less often applied to text processing.
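The contrast between the two generation styles can be shown with a toy sketch. This is purely illustrative: the vocabulary, the random token choices, and the step counts are made-up placeholders, and a real model would score candidate tokens with a neural network rather than sample uniformly at random. The point is only the control flow: one sequential step per token versus a fixed number of parallel refinement passes over the whole sequence.

```python
import random

random.seed(0)

# Toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "cat", "sat", "on", "mat"]

def autoregressive_generate(n_tokens):
    """Autoregressive style: one token per step, each step conditioned on
    everything generated so far (here, trivially, just appended)."""
    seq = []
    for _ in range(n_tokens):  # n_tokens strictly sequential steps
        seq.append(random.choice(VOCAB))
    return seq

def diffusion_generate(n_tokens, n_steps=3):
    """Diffusion style: start from an all-masked sequence and refine every
    position in parallel over a fixed number of denoising passes."""
    seq = ["<mask>"] * n_tokens
    for step in range(n_steps):
        last = step == n_steps - 1
        # Each pass fills some still-masked positions; the final pass fills all.
        seq = [
            random.choice(VOCAB)
            if tok == "<mask>" and (last or random.random() < 0.5)
            else tok
            for tok in seq
        ]
    return seq

print("autoregressive:", autoregressive_generate(5))
print("diffusion     :", diffusion_generate(5))
```

The practical consequence is that the diffusion loop's cost scales with the number of refinement passes rather than the number of tokens, which is where frameworks like dInfer look for throughput gains.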

Ant Group provided benchmarking that highlights potential runtime advantages for diffusion-based inference. On the HumanEval coding benchmark, the company’s LLaDA-MoE diffusion model generated roughly 1,011 tokens per second under dInfer. By comparison, Nvidia’s Fast-dLLM framework ran at about 91 tokens per second in the same setting, and Alibaba’s Qwen-2.5-3B model achieved 294 tokens per second on vLLM infrastructure.

“We believe that dInfer provides both a practical toolkit and a standardised platform to accelerate research and development in the rapidly growing field of dLLMs,” researchers at Ant Group wrote in accompanying technical material.

Ling-1T is one piece of a broader suite of models the company has assembled over recent months. Ant Group’s lineup now includes three main series: Ling models aimed at standard language tasks, Ring models intended for complex reasoning (the firm previously published a Ring-1T-preview), and Ming multimodal systems that handle images, text, audio, and video.

The firm is also experimenting with LLaDA-MoE, an implementation that uses Mixture-of-Experts (MoE) architecture. MoE architectures activate only a subset of a model’s parameters for any given input, a design intended to improve compute efficiency for large-scale systems.
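The routing idea behind MoE can be sketched in a few lines. This is a minimal illustration, not Ant Group's implementation: the expert count, the top-k value, and the router scores here are invented toy values, and a real router is a learned layer that produces scores per token.

```python
import random

random.seed(42)

N_EXPERTS = 8  # total expert sub-networks in the layer (toy value)
TOP_K = 2      # experts actually activated per token (toy value)

def route(token_scores):
    """Pick the top-k experts for one token from its router scores.
    Only these experts run a forward pass, so the remaining experts'
    parameters stay inactive for this token."""
    ranked = sorted(range(N_EXPERTS), key=lambda e: token_scores[e], reverse=True)
    return ranked[:TOP_K]

scores = [random.random() for _ in range(N_EXPERTS)]
active = route(scores)
print(f"Active experts: {active} "
      f"({TOP_K}/{N_EXPERTS} = {TOP_K / N_EXPERTS:.0%} of experts per token)")
```

Because only a fraction of experts fire per token, a model can hold a very large total parameter count while keeping per-token compute close to that of a much smaller dense model, which is the efficiency argument behind designs like LLaDA-MoE.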

He Zhengyu, chief technology officer at Ant Group, spoke about the company’s broader aims. “At Ant Group, we believe Artificial General Intelligence (AGI) should be a public good—a shared milestone for humanity’s intelligent future,” He stated, adding that the open-source releases of both the trillion-parameter AI model and Ring-1T-preview represent steps toward “open and collaborative advancement.”

The timing of these launches reflects wider strategic calculations inside China’s AI sector. Restrictions on access to the most advanced semiconductors have pushed several big Chinese tech firms to prioritise innovations in algorithms and software optimisation as ways to remain competitive.

Outside Ant Group, similar experiments with diffusion language architectures have appeared. ByteDance released Seed Diffusion Preview in July and reported up to five-fold speed improvements versus comparable autoregressive models. Those announcements indicate growing interest among major developers in alternative model paradigms that may deliver efficiency gains.

The practical path for diffusion language models still faces questions. Autoregressive systems continue to dominate commercial deployments, offering proven performance in natural language understanding and generation—the core capabilities most customer-facing applications demand.

By releasing Ling-1T with dInfer as open-source projects, Ant Group is positioning itself in favor of a collaborative development model that contrasts with more closed efforts from some rivals. The approach has the potential to accelerate community-driven improvements and to shape components of shared infrastructure used by other researchers and developers.

Ant Group is also building AWorld, a framework meant to support continual learning for autonomous AI agents—software agents designed to carry out tasks on behalf of users with minimal human intervention.

Whether these combined moves establish Ant Group as a major global AI player will hinge on real-world validation of the claimed performance figures and on adoption among developers searching for alternatives to dominant platforms.

The open-source licensing of the trillion-parameter model may make it easier to verify benchmarks and to grow a user base that contributes code, evaluations, and tooling. For now, the releases show that large Chinese technology companies view the current AI environment as flexible enough to welcome new entrants that pursue multiple engineering and research avenues at once.
