Utility on OPI clean
Top result: Gemma-4-E4B-it, Utility Score 90.3
[Chart: Utility Score; date shown: May 27, 2026]
Updated 6d ago
Evaluation Results
Method            Setting    Date      Utility Score
Gemma-4-E4B-it    Baseline   2026.05   90.3
Gemma-4-E4B-it    ACT        2026.05   89.3
Gemma-4-E4B-it    BCT        2026.05   89.1
GPT-OSS-20B       Baseline   2026.05   88.8
GPT-OSS-20B       BCT        2026.05   86
Qwen3-8B          Baseline   2026.05   81.2
Qwen3-8B          ACT        2026.05   80.4
Qwen3-1.7B        Baseline   2026.05   80
Qwen3-1.7B        BCT        2026.05   78.2
Qwen3-8B          BCT        2026.05   78.2
GPT-OSS-20B       ACT        2026.05   72.1
Phi-4-reasoning   Baseline   2026.05   69.6
Qwen3-1.7B        ACT        2026.05   68.3
Phi-4-reasoning   ACT        2026.05   66.8
Phi-4-reasoning   BCT        2026.05   6.7