Motion Policy Classification on ParlVote+
Top result: Mistral-7B-Instruct-v0.3, Macro F1 0.47 (Aug 5, 2025)
Evaluation Results
Method                     Evaluation Protocol   Date      Macro F1
Mistral-7B-Instruct-v0.3   fi...                 2025.08   0.47
Llama-3.1-8B-Instruct      fi...                 2025.08   0.35
gemma-3-4b-it              fi...                 2025.08   0.26
Llama-3.1-8B-Instruct      3-...                 2025.08   0.08
gemma-3-4b-it              3-...                 2025.08   0.07
Mistral-7B-Instruct-v0.3   3-...                 2025.08   0.05
gpt-4.1-nano               3-...                 2025.08   0.03
gemma-3-4b-it              on...                 2025.08   0.02
Llama-3.1-8B-Instruct      on...                 2025.08   0.02
Mistral-7B-Instruct-v0.3   on...                 2025.08   0.01
gpt-4.1-nano               on...                 2025.08   0
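For readers unfamiliar with the metric, here is a minimal sketch of how Macro F1 is commonly computed: the unweighted mean of per-class F1 scores. It is not the benchmark's own evaluation code. The function name `macro_f1`, the choice to average over every label seen in either list, and the example labels are all assumptions made for illustration.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in y_true or y_pred.

    Assumption: the label set is the union of gold and predicted labels.
    A benchmark may instead fix the label set in advance.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(y_true, y_pred):
        if gold == pred:
            tp[gold] += 1
        else:
            fp[pred] += 1  # predicted label gets a false positive
            fn[gold] += 1  # gold label gets a false negative

    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical labels, not taken from ParlVote+
gold = ["a", "a", "b", "c"]
pred = ["a", "b", "b", "b"]
print(round(macro_f1(gold, pred), 4))
```

Because every class counts equally, a model that rarely or never predicts some classes correctly can score close to 0 on Macro F1 even when it gets the frequent classes right.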