Math Reasoning on Math500 (Score)
[Chart: Score by date. Top result: 45.4, Qwen3-4B + WeMask(SFT), May 8, 2026]
Updated 22d ago
Evaluation Results

Method                         Details                      Date      Score
Qwen3-4B + WeMask(SFT)         Mask Rate=0.3, Trainin...    2026.05   45.4
Qwen3-4B + WeMask(SFT)         Mask Rate=0.1, Trainin...    2026.05   44.47
Qwen3-4B + SFT + WeMask(TF)    Mask Rate=0.3, Trainin...    2026.05   43.7
Qwen3-4B + SFT + WeMask(TF)    Mask Rate=0.1, Trainin...    2026.05   43.6
Qwen3-4B + SFT + WeMask(TF)    Mask Rate=0.5, Trainin...    2026.05   43.6
Qwen3-4B + SFT                 Mask Rate=-, Training...     2026.05   43
Qwen3-4B + WeMask(SFT)         Mask Rate=0.7, Trainin...    2026.05   42.93
Qwen3-4B + SFT + WeMask(TF)    Mask Rate=0.7, Trainin...    2026.05   42.87
Qwen3-4B + WeMask(SFT)         Mask Rate=1.0, Trainin...    2026.05   42.73
Qwen3-4B + WeMask(SFT)         Mask Rate=0.5, Trainin...    2026.05   41.87
Qwen3-4B + SFT + WeMask(TF)    Mask Rate=1.0, Trainin...    2026.05   38.33