
Training Efficiency on Qwen2-57B-A14B Fine-grained

39 MFU

MCore w/ Folding

Updated 4mo ago

Evaluation Results

Method	Date	MFU
MCore w/ Folding	2025.04	39
MCore	2025.04	35.3
FSDP + EP	2025.04	25.4
TP+EP+DP	2025.04	23.1
FSDP	2025.04	9.9
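
To make the gaps between methods easier to read, the sketch below computes how many times higher the best MFU is than each entry. The numbers are copied from the results above; the helper name `relative_to_best` is hypothetical, and the comparison assumes all MFU values are reported on the same scale and under comparable settings, which the page does not state.

```python
# MFU values copied from the evaluation results above.
# Assumption: all values share the same scale and hardware setup.
mfu = {
    "MCore w/ Folding": 39.0,
    "MCore": 35.3,
    "FSDP + EP": 25.4,
    "TP+EP+DP": 23.1,
    "FSDP": 9.9,
}


def relative_to_best(scores):
    """Return best_score / score for every method (hypothetical helper)."""
    best = max(scores.values())
    return {name: best / value for name, value in scores.items()}


ratios = relative_to_best(mfu)
for name, ratio in ratios.items():
    print(f"{name}: best MFU is {ratio:.2f}x this entry")
```

A ratio of 1.00 marks the top entry; larger ratios mean a larger gap to MCore w/ Folding.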