Question-Type Diversity Alignment on Legalbench Taxonomy
Top entry: gpt-oss-120b, Jensen Shannon Divergence 0.036
[Chart: Jensen Shannon Divergence by date, Mar 5, 2026]
Evaluation Results
| Method | Simulator Variant | Date | Jensen Shannon Divergence |
|---|---|---|---|
| gpt-oss-120b | AGENT | 2026.03 | 0.036 |
| gpt-oss-120b | SCOT... | 2026.03 | 0.061 |
| gpt4o | SCOT... | 2026.03 | 0.069 |
| gemini-2.5-pro | AGENT | 2026.03 | 0.072 |
| gemini-2.5-pro | SCOT... | 2026.03 | 0.087 |
| Llama-3.3-70B-Instruct | SCOT... | 2026.03 | 0.122 |
| gpt4o | AGENT | 2026.03 | 0.124 |
| Qwen3-32B | SCOT... | 2026.03 | 0.131 |