
New T2 Scaling Laws: Optimize AI with Smaller, Data-Rich Models

Researchers from the University of Wisconsin-Madison and Stanford University have introduced "Train-to-Test" (T2) scaling laws, a framework for optimizing large language models (LLMs) across both training and inference.

By Redacción Tricuatro, 17 April 2026


Researchers from the University of Wisconsin-Madison and Stanford University have introduced the "Train-to-Test" (T2) scaling laws, a framework for allocating a model's compute budget across both training and inference. By accounting for inference costs rather than training expenses alone, it lets developers maximize the performance of large language models (LLMs) in real-world applications that need both efficiency and accuracy.

Until now, standard guidelines for building LLMs have optimized primarily for training cost. That is a problem for practical applications, many of which rely on inference-time scaling techniques, such as drawing multiple reasoning samples, to increase the accuracy of model responses, and so spend compute at deployment that those guidelines ignore.

The T2 scaling laws bridge this gap by jointly optimizing three factors: a model's parameter count, its training data volume, and the number of test-time inference samples.


In practice, the research finds that it is compute-optimal to train substantially smaller models on vastly more data than traditional rules prescribe. The compute saved by running a smaller model is then spent on generating multiple repeated samples at inference.
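The trade-off can be made concrete with rough compute accounting. The sketch below is not the paper's formulation. It assumes the common approximations of about 6 FLOPs per parameter per training token and about 2 FLOPs per parameter per generated token, and the two model configurations are hypothetical numbers chosen only for illustration.

```python
# Back-of-the-envelope compute accounting (illustrative, not the T2 formulation).
# Assumed approximations: ~6 FLOPs per parameter per training token,
# ~2 FLOPs per parameter per generated token at inference.

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate FLOPs to train n_params parameters on n_tokens tokens."""
    return 6.0 * n_params * n_tokens


def inference_flops_per_query(n_params: float, tokens_per_sample: float,
                              k_samples: int) -> float:
    """Approximate FLOPs to answer one query with k_samples repeated samples."""
    return 2.0 * n_params * tokens_per_sample * k_samples


# Hypothetical configurations with the same training budget.
large = {"n_params": 8e9, "n_tokens": 160e9}    # 20 tokens per parameter
small = {"n_params": 1e9, "n_tokens": 1280e9}   # 1,280 tokens per parameter

train_large = training_flops(**large)
train_small = training_flops(**small)

# One sample from the large model versus eight from the small one,
# each sample assumed to be 1,000 generated tokens.
query_large = inference_flops_per_query(large["n_params"], 1000, k_samples=1)
query_small = inference_flops_per_query(small["n_params"], 1000, k_samples=8)

print(f"training FLOPs:  large={train_large:.3e}  small={train_small:.3e}")
print(f"per-query FLOPs: large={query_large:.3e}  small={query_small:.3e}")
```

Under these assumptions the two setups cost the same to train and the same per query, but the smaller model gets eight attempts per query instead of one. Whether those eight attempts beat the single large-model answer is the empirical question the scaling laws are meant to address.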

For enterprise AI application developers who train their own models, the research offers a blueprint for maximizing return on investment. It shows that strong AI reasoning does not necessarily require spending huge amounts on frontier models.

Instead, smaller models can yield stronger performance on complex tasks while keeping per-query inference costs manageable within real-world deployment budgets.

Scaling laws are an important part of developing large language models. Pretraining scaling laws dictate the best way to allocate compute during a model's creation. Test-time scaling laws, on the other hand, guide how to allocate compute during deployment. This includes letting the model "think longer" or generating multiple reasoning samples to solve complex problems.
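To see why generating multiple samples can help, consider an idealized calculation. The sketch below assumes each sample is independently correct with the same probability and that a correct sample can be recognized when it appears (for example, by a verifier). Neither assumption comes from the article, and real samples are usually correlated. The function name `coverage_at_k` is a hypothetical helper.

```python
def coverage_at_k(p_single: float, k: int) -> float:
    """Probability that at least one of k independent samples is correct.

    Idealized: assumes independent samples with identical success
    probability p_single.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if k < 1:
        raise ValueError("k must be at least 1")
    return 1.0 - (1.0 - p_single) ** k


# A model that solves a task 20% of the time in a single attempt.
for k in (1, 4, 16):
    print(f"k={k:2d}  coverage={coverage_at_k(0.2, k):.3f}")
```

Coverage climbs quickly with the number of samples under these assumptions, which is why the per-sample cost of the model matters so much.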

The problem is that these scaling laws have been developed completely independently. Yet, they are fundamentally intertwined. A model's parameter size and training duration directly dictate both the quality and the per-query cost of its inference samples.

Currently, the industry gold standard for pretraining is the Chinchilla rule. This suggests a compute-optimal ratio of roughly 20 training tokens for every model parameter. However, creators of modern AI model families, such as Llama, Gemma, and Qwen, regularly break this rule. They intentionally overtrain their smaller models on massive amounts of data.
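As a rough illustration of what "overtraining" means relative to this rule, the sketch below computes the Chinchilla token count for a given model size and how far a training run exceeds it. The 20 tokens-per-parameter ratio is the approximate figure quoted above. The example model size and token count are hypothetical, not figures for any named model family.

```python
CHINCHILLA_TOKENS_PER_PARAM = 20.0  # approximate ratio quoted in the article


def chinchilla_tokens(n_params: float) -> float:
    """Training tokens suggested by the ~20 tokens-per-parameter rule."""
    return CHINCHILLA_TOKENS_PER_PARAM * n_params


def overtraining_factor(n_params: float, n_tokens: float) -> float:
    """How many times the Chinchilla token count a training run uses."""
    return n_tokens / chinchilla_tokens(n_params)


# Hypothetical example: a 1-billion-parameter model trained on 2 trillion tokens.
n_params = 1e9
n_tokens = 2e12
print(f"Chinchilla tokens:   {chinchilla_tokens(n_params):.2e}")
print(f"Overtraining factor: {overtraining_factor(n_params, n_tokens):.0f}x")
```

A factor of 1 means the run follows the rule, while values well above 1 describe the kind of overtrained compact model the article discusses.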

Nicholas Roberts, co-author of the paper, told VentureBeat that the traditional approach falters when building complex agentic workflows. "In my view, the inference stack breaks down when each individual inference call is expensive," he stated. "This is the case when the models are large and you need to do a lot of repeated sampling." Instead of relying on massive models, developers can use overtrained compact models, which lets them run this repeated sampling at a fraction of the cost.

Because training and test-time scaling laws were examined in isolation, there was no rigorous framework for calculating how much a model should be overtrained given the number of reasoning samples it will need to generate during deployment. The T2 laws are designed to fill that gap.
