Question-Type Diversity Alignment on Stetson Taxonomy
[Chart: Jensen Shannon Divergence (JSD). Highlighted result: gemini-2.5-pro, 0.095, Mar 5, 2026.]
Evaluation Results
Method                   Simulator Variant   Date      Jensen Shannon Divergence (JSD)
gemini-2.5-pro           AGENT               2026.03   0.095
gemini-2.5-pro           SCOT...             2026.03   0.109
Llama-3.3-70B-Instruct   SCOT...             2026.03   0.134
gpt4o                    SCOT...             2026.03   0.14
Qwen3-32B                SCOT...             2026.03   0.153
gpt-oss-120b             SCOT...             2026.03   0.171
gpt4o                    AGENT               2026.03   0.179
gpt-oss-120b             AGENT               2026.03   0.189
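
For readers unfamiliar with the metric, the following is a minimal sketch of how a Jensen Shannon Divergence between two question-type distributions could be computed. The page does not say how the benchmark derives its scores, so everything here is an assumption: the function name `jsd`, the base-2 logarithm, and the example counts are hypothetical and are not taken from the benchmark. With base 2 the value lies between 0 (identical distributions) and 1 (no overlap), so lower scores indicate closer alignment.

```python
import math


def jsd(p, q, base=2.0):
    """Jensen Shannon Divergence between two discrete distributions.

    p and q are sequences of non-negative counts or probabilities over the
    same ordered set of categories. They are normalized here, so raw counts
    are accepted. The log base is an assumption; the benchmark page does
    not state which one it uses.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same categories")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each distribution needs positive total mass")

    p = [x / total_p for x in p]
    q = [x / total_q for x in q]
    # Mixture distribution M = (P + Q) / 2
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(a, b):
        # KL(a || b); terms with a_i == 0 contribute nothing.
        # b_i > 0 whenever a_i > 0 because b is the mixture.
        return sum(x * math.log(x / y, base) for x, y in zip(a, b) if x > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Hypothetical counts of question types (four made-up categories),
# one list for simulated questions and one for a reference set.
simulated_counts = [40, 30, 20, 10]
reference_counts = [25, 25, 25, 25]
score = jsd(simulated_counts, reference_counts)
print(f"JSD = {score:.3f}")
```

Note that some libraries report the Jensen Shannon distance, which is the square root of the divergence, so scores computed with a different tool may not be directly comparable to the table.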