Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training
About
The computing demand of AI is growing at an unprecedented rate, but energy supply is not keeping pace. As a result, energy has become an expensive, contended resource that requires explicit management and optimization. Although recent works have made significant progress in large model training optimization, they focus only on a single aspect of energy consumption: dynamic or static energy. We find that fine-grained kernel scheduling and frequency scaling jointly and interdependently impact both dynamic and static energy consumption. Based on this finding, we design Kareus, a training system that pushes the time–energy tradeoff frontier by optimizing both aspects. Kareus decomposes the intractable joint optimization problem into local, partition-based subproblems. It then uses a multi-pass multi-objective optimization algorithm to find execution schedules that push the time–energy tradeoff frontier. Compared to the state of the art, Kareus reduces training energy by up to 28.3% at the same training time, or reduces training time by up to 27.5% at the same energy consumption.
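The abstract describes finding execution schedules on a time–energy tradeoff frontier, i.e., schedules that are not dominated in both objectives at once. A minimal sketch of that idea is a Pareto-frontier filter over candidate (time, energy) schedule points. This is not Kareus's actual algorithm; the `Schedule` fields, labels, and numbers below are hypothetical placeholders for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Schedule:
    """A candidate execution schedule for one partition (hypothetical model)."""
    kernel_order: str   # illustrative label for a kernel-scheduling choice
    freq_mhz: int       # illustrative GPU frequency setting
    time_s: float       # predicted execution time (seconds)
    energy_j: float     # predicted dynamic + static energy (joules)


def pareto_frontier(candidates):
    """Keep only schedules not dominated in both time and energy.

    One schedule dominates another if it is no worse on both objectives
    and strictly better on at least one. Sorting by time and sweeping
    with a running energy minimum filters dominated points in one pass.
    """
    frontier = []
    for s in sorted(candidates, key=lambda s: (s.time_s, s.energy_j)):
        if not frontier or s.energy_j < frontier[-1].energy_j:
            frontier.append(s)
    return frontier


# Hypothetical candidates: high frequency is fast but burns dynamic
# energy; low frequency saves dynamic energy but stretches static energy
# over a longer runtime.
cands = [
    Schedule("overlap", 1980, 1.00, 300.0),
    Schedule("overlap", 1410, 1.20, 240.0),
    Schedule("serial",  1980, 1.10, 310.0),  # dominated by the first point
    Schedule("serial",  1005, 1.50, 235.0),
]
front = pareto_frontier(cands)
```

On this toy input the dominated `serial`/1980 MHz point is dropped, and the surviving frontier exposes the tradeoff a scheduler (or an operator with a time or energy budget) can pick from.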
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| LLM Training Optimization | Qwen 3 1.7B | Time Reduction | 0.149 | 18 |
| LLM Training Optimization | Llama 3.2 3B | Training Time Reduction (%) | 12.3 | 12 |
| Large Language Model Training Efficiency | Llama 3.2 1.7B | Energy Reduction (Iso-Time) | 28.3 | 11 |
| LLM Training | Llama 3.3 70B Emulation | Time Reduction | 9.3 | 8 |
| Large-scale model training | Llama 3.3 70B Emulation (train) | Energy Reduction (Iso-Time) | 15.3 | 4 |