Efficiency Test Generation on CodeContests Python
[Chart: ASR (Fast), ASR (Slow), ASR (Random); leading entry STAB, ASR (Fast) 60, May 27, 2026]
Updated 6d ago
Evaluation Results
| Method | Model | Date | ASR (Fast) | ASR (Slow) | ASR (Random) |
|--------|-------|------|------------|------------|--------------|
| STAB | Gemma-4 | 2026.05 | 60 | 79.23 | 79.84 |
| STAB | Gemini-3.1 | 2026.05 | 59.23 | 75.65 | 77.84 |
| STAB | Qwen-3.5 | 2026.05 | 56.14 | 74.2 | 74.56 |
| STAB | GPT-5.4 | 2026.05 | 56.04 | 73.91 | 74.81 |
| Base | GPT-5.4 | 2026.05 | 51.88 | 68.21 | 68.05 |
| Base | Gemma-4 | 2026.05 | 42.32 | 65.8 | 41.77 |
| Base | Gemini-3.1 | 2026.05 | 34.11 | 54.88 | 54.94 |
| Base | Qwen-3.5 | 2026.05 | 26.57 | 57.1 | 61.35 |