Instruction Following on Instruction Following (test)
[Chart: NMSE and CCC over time. Highlighted point: MENTAT (Detailed Prompt), NMSE 0.9, Aug 29, 2025.]
Evaluation Results
| Method | LM | Date | NMSE | CCC |
| --- | --- | --- | --- | --- |
| MENTAT (Detailed Prompt) | GPT-OSS-20B | 2025.08 | 0.9 | 0.43 |
| MENTAT (Basic Prompt) | GPT-OSS-20B | 2025.08 | 0.95 | 0.42 |
| MENTAT (Basic Prompt)-Avg | GPT-OSS-20B | 2025.08 | 1.06 | 0.38 |
| GEPA | GPT-OSS-20B | 2025.08 | 1.06 | 0.46 |
| Gradient Descent | NeoBERT | 2025.08 | 1.08 | 0.36 |
| MENTAT (Detailed Prompt)-Avg | GPT-OSS-20B | 2025.08 | 1.09 | 0.39 |
| Detailed Prompt | GPT-OSS-20B | 2025.08 | 1.16 | 0.33 |
| Basic Prompt | GPT-OSS-20B | 2025.08 | 1.18 | 0.32 |
| MENTAT (Detailed Prompt) Prompt | GPT-OSS-20B | 2025.08 | 1.24 | 0.36 |
| MENTAT (Basic Prompt) Prompt | GPT-OSS-20B | 2025.08 | 1.25 | 0.35 |
| RL Fine-Tuning | GPT-OSS-20B | 2025.08 | 1.51 | 0.37 |