
Kareus: Joint Reduction of Dynamic and Static Energy in Large Model Training

About

The computing demand of AI is growing at an unprecedented rate, but energy supply is not keeping pace. As a result, energy has become an expensive, contended resource that requires explicit management and optimization. Although recent works have made significant progress in large model training optimization, they focus only on a single aspect of energy consumption: dynamic or static energy. We find that fine-grained kernel scheduling and frequency scaling jointly and interdependently impact both dynamic and static energy consumption. Based on this finding, we design Kareus, a training system that pushes the time--energy tradeoff frontier by optimizing both aspects. Kareus decomposes the intractable joint optimization problem into local, partition-based subproblems. It then uses a multi-pass multi-objective optimization algorithm to find execution schedules that push the time--energy tradeoff frontier. Compared to the state of the art, Kareus reduces training energy by up to 28.3% at the same training time, or reduces training time by up to 27.5% at the same energy consumption.
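To make the time–energy tradeoff concrete, here is a minimal, hypothetical sketch (not Kareus's actual model or code) of why dynamic and static energy pull in opposite directions under frequency scaling: dynamic power grows superlinearly with frequency, while static power is a constant drain, so slowing down saves dynamic energy but inflates static energy. All constants, names, and the toy Pareto-extraction routine below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Plan:
    time_s: float    # execution time of a candidate schedule
    energy_j: float  # total energy of that schedule

def kernel_energy(work_cycles: float, freq_hz: float,
                  dyn_coeff: float, static_power_w: float) -> Plan:
    """Toy model: dynamic power ~ C * V^2 * f with V proportional to f,
    so dynamic energy grows with frequency even as runtime shrinks,
    while static energy grows linearly with runtime."""
    time_s = work_cycles / freq_hz
    dynamic_j = dyn_coeff * freq_hz ** 3 * time_s  # superlinear in f
    static_j = static_power_w * time_s             # constant leakage drain
    return Plan(time_s, dynamic_j + static_j)

def pareto_frontier(plans: list[Plan]) -> list[Plan]:
    """Keep only plans not dominated in both time and energy."""
    frontier: list[Plan] = []
    for p in sorted(plans, key=lambda q: (q.time_s, q.energy_j)):
        if not frontier or p.energy_j < frontier[-1].energy_j:
            frontier.append(p)
    return frontier

# Sweep frequencies for one kernel: low frequency cuts dynamic energy but
# stretches runtime (more static energy), so no single extreme dominates.
plans = [kernel_energy(1e12, f * 1e9, dyn_coeff=1e-25, static_power_w=60.0)
         for f in (0.8, 1.0, 1.2, 1.4, 1.6)]
frontier = pareto_frontier(plans)
```

A real system would face this per kernel across partitions, which is what makes the joint scheduling-plus-frequency problem intractable globally and motivates decomposing it into local subproblems.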

Ruofan Wu, Jae-Won Chung, Mosharaf Chowdhury • 2026

Related benchmarks

Task | Dataset | Result | Rank
LLM Training Optimization | Qwen 3 1.7B | Time Reduction: 0.149 | 18
LLM Training Optimization | Llama 3.2 3B | Training Time Reduction (%): 12.3 | 12
Large Language Model Training Efficiency | Llama 3.2 1.7B | Energy Reduction (Iso-Time): 28.3 | 11
LLM Training | Llama 3.3 70B Emulation | Time Reduction: 9.3 | 8
Large-scale model training | Llama 3.3 70B Emulation (train) | Energy Reduction (Iso-Time): 15.3 | 4
