Training Efficiency on Mixtral-8x22b-G8T8 Fine-grained
Chart (metric: MFU). Top result: MCore w/ Folding, MFU 28.8 (Apr 21, 2025).
Evaluation Results
Method              Configuration                Date      MFU
MCore w/ Folding    GPUs=128, Global batch...    2025.04   28.8
MCore               GPUs=128, Global batch...    2025.04   17.1
FSDP + EP           GPUs=128, Global batch...    2025.04   9
TP+EP+DP            GPUs=128, Global batch...    2025.04   8.7
FSDP                GPUs=128, Global batch...    2025.04   2.2
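
To make the gaps between methods easier to read, the listed MFU values can be turned into ratios against a chosen baseline. The sketch below is illustrative only: the dictionary simply copies the five MFU figures reported above, and the helper name `relative_to` is hypothetical. It assumes all five runs share the same hardware and FLOPs accounting (every row lists GPUs=128, but the rest of each configuration is truncated on this page), so the ratios should be read as rough indicators rather than exact throughput speedups.

```python
# MFU values copied from the results listed above (GPUs=128, 2025.04).
mfu = {
    "MCore w/ Folding": 28.8,
    "MCore": 17.1,
    "FSDP + EP": 9.0,
    "TP+EP+DP": 8.7,
    "FSDP": 2.2,
}


def relative_to(baseline: str, results: dict[str, float]) -> dict[str, float]:
    """Return each method's MFU divided by the baseline method's MFU.

    Hypothetical helper for illustration; not part of any benchmark tooling.
    """
    base = results[baseline]
    return {name: value / base for name, value in results.items()}


# Compare every method against plain MCore.
ratios = relative_to("MCore", mfu)

# Print methods from highest to lowest ratio.
for name, ratio in sorted(ratios.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<18} MFU {mfu[name]:>5.1f}   {ratio:.2f}x vs MCore")
```

Choosing MCore as the baseline isolates the effect of folding; passing `"FSDP"` instead would express every method relative to the lowest listed result.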