MIT study suggests computing power – not ‘secret sauce’ – drives most AI model breakthroughs


A new study by researchers at the Massachusetts Institute of Technology suggests that the rapid improvement of large language models is driven primarily by access to vast computing power rather than by secret proprietary techniques developed by individual AI companies.

The paper, titled Is there “Secret Sauce” in Large Language Model Development?, analyzes 809 large language models released between October 2022 and March 2025 to understand which factors are responsible for improvements in AI capabilities.

The researchers examined benchmark performance and training data for the models and attempted to separate the sources of progress into four components: the amount of training compute used, shared algorithmic progress across the industry, developer-specific techniques, and model-specific design choices.
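
As a rough illustration of this kind of decomposition (a toy sketch on synthetic data, not the paper's actual method or dataset), one can fit an ordinary least-squares model of benchmark scores on compute and a shared time trend, then measure how much additional variance developer identity explains:

```python
import numpy as np

rng = np.random.default_rng(0)

def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on X (with intercept)."""
    X1 = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ coef
    return 1 - resid.var() / y.var()

# Synthetic data: a score driven mostly by log training compute,
# plus a shared industry-wide trend and a small developer offset.
n = 500
log_compute = rng.uniform(20, 27, n)          # log FLOPs, made up
release = rng.uniform(0, 1, n)                # release date, normalized
dev = rng.integers(0, 5, n)                   # 5 hypothetical developers
dev_effect = np.array([-0.4, -0.2, 0.0, 0.2, 0.4])[dev]
score = 2.0 * log_compute + 1.5 * release + dev_effect + rng.normal(0, 0.5, n)

# Compare variance explained with and without developer dummies.
base = np.column_stack([log_compute, release])
full = np.column_stack([base, np.eye(5)[dev]])
r2_base = r_squared(base, score)
r2_full = r_squared(full, score)
print(f"shared factors (compute + trend): R^2 = {r2_base:.3f}")
print(f"added by developer identity:      {r2_full - r2_base:.3f}")
```

In a setup like this, the gap between the two R² values plays the role of the "company-specific" share the study reports; in the toy data above it is small because compute dominates by construction.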

The study begins from the observation that AI systems have improved at an extraordinary pace in recent years.

“Large language models (LLMs) have experienced a period of rapid progress and benchmark scores have climbed at an astonishing rate,” the authors write.


One of the central questions behind that progress is whether leading AI companies possess a technological “secret sauce” that gives them a sustained advantage.

The analysis found some evidence of company-specific advantages, but their overall impact appears limited when examining the most advanced models. According to the researchers, “14-18 percent of LLM performance differences are explained by company-specific effects,” indicating that proprietary engineering techniques do contribute to performance improvements.

However, the study concludes that the dominant factor behind frontier-level AI performance is simply scale – the massive computing resources used to train the largest models.

At the frontier of AI development, the researchers estimate that “80-90 percent of frontier model performance is a consequence of these models’ large and growing compute.”

The results suggest that while engineering improvements and algorithmic innovations do matter, they are overshadowed by the dramatic increase in computing power devoted to training modern models.

Over the period studied, the training compute used for the most powerful models grew by roughly a factor of 5,000, far exceeding the gains produced by other factors.

At the same time, the research highlights that efficiency improvements remain important for models outside the frontier. Shared algorithmic progress across the industry improved effective compute efficiency by roughly 7.5 times, allowing developers to achieve comparable benchmark performance with far less training compute than earlier models required.
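
For a sense of scale, the quoted figures can be annualized with back-of-the-envelope arithmetic (the 2.4-year window is an approximation of October 2022 to March 2025; the paper may measure growth over a different interval):

```python
# Illustrative arithmetic from the figures quoted in the article.
study_years = 2.4        # roughly October 2022 to March 2025
efficiency_gain = 7.5    # shared algorithmic progress, effective compute
compute_growth = 5000    # growth in frontier training compute

# Geometric (compounded) annual rates over the study window.
annual_efficiency = efficiency_gain ** (1 / study_years)
annual_compute = compute_growth ** (1 / study_years)
print(f"algorithmic efficiency: ~{annual_efficiency:.1f}x per year")
print(f"frontier compute:       ~{annual_compute:.1f}x per year")
```

On these assumptions, compute scaling outpaces shared algorithmic efficiency by more than an order of magnitude per year, consistent with the study's conclusion that scale dominates at the frontier.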

In some cases, differences between developers were even more pronounced. The study found that among smaller models, certain developers were up to 61 times more compute-efficient than others when reaching comparable performance levels.

Taken together, the findings suggest that the global race to build the most advanced AI systems may ultimately depend less on hidden technological breakthroughs and more on access to large-scale computing infrastructure.

If that trend continues, the researchers note, the availability of advanced chips and data-center capacity could become the decisive factor shaping the future of artificial intelligence.