Role-playing agent evaluation on LLM Court 5 legal scenarios 1.0 (test)
[Chart: QS d BRF Score and Retry Rate. Highlighted point: Llama-3.1-8B, QS d BRF Score 92.5, Apr 13, 2026.]
Updated 1mo ago
Evaluation Results
| Method | Quantization | Date | QS d BRF Score | Retry Rate |
|---|---|---|---|---|
| Llama-3.1-8B | 4-bit | 2026.04 | 92.5 | 3 |
| Gemma-2-9B-it | 4-bit | 2026.04 | 89.3 | 0 |
| Qwen-3-8B | 4-bit | 2026.04 | 89.3 | 1 |
| Gemma-3-12B-it | 4-bit | 2026.04 | 82.3 | 0 |
| Hermes-3-8B | 5-bit | 2026.04 | 81 | 3 |
| Phi-4-14B | 4-bit | 2026.04 | 76 | 0 |
| Qwen-3-14B | 4-bit | 2026.04 | 73 | 0 |