Instruction-following clustering on SP500 (C2)
Top score: 69.34 V-measure (Gemini 2.5 Pro)
[Chart: V-measure by date; x-axis tick: Mar 6, 2026]
Updated 23d ago
Evaluation Results
| Method | Model Type | Date | V-measure |
|---|---|---|---|
| Gemini 2.5 Pro | Reasoning M... | 2026.03 | 69.34 |
| o3 | Reasoning M... | 2026.03 | 66.51 |
| QwQ-32B | Reasoning M... | 2026.03 | 63.23 |
| C1-Qwen-7B | Our Model | 2026.03 | 60.28 |
| C1-Qwen-14B | Our Model | 2026.03 | 59.44 |
| GPT-oss-120B | Reasoning M... | 2026.03 | 51.84 |
| Distill-Llama-70B | Reasoning M... | 2026.03 | 42.13 |
| GPT-4.1 | General Model | 2026.03 | 41.27 |
| DeepSeek-R1 | Reasoning M... | 2026.03 | 39.69 |
| Distill-Qwen-32B | Reasoning M... | 2026.03 | 38.74 |
| Qwen2.5-72B-Instruct | General Model | 2026.03 | 33.28 |
| Llama-3.1-70B-Instruct | General Model | 2026.03 | 32.28 |
| GPT-4o | General Model | 2026.03 | 31.15 |
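
For readers unfamiliar with the metric, here is a minimal sketch of how V-measure is usually computed: the harmonic mean of homogeneity and completeness, both derived from conditional entropies. It is not taken from this benchmark's evaluation code. The function names, the `beta` parameter default, and the sample labels are illustrative assumptions, and the scores above are assumed to be this quantity multiplied by 100.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * math.log(c / n) for c in Counter(labels).values())


def conditional_entropy(target, given):
    """Conditional entropy H(target | given) in nats."""
    n = len(target)
    joint = Counter(zip(given, target))
    given_counts = Counter(given)
    return -sum(
        (c / n) * math.log(c / given_counts[g]) for (g, _), c in joint.items()
    )


def v_measure(truth, pred, beta=1.0):
    """V-measure of predicted clusters against ground-truth classes (0 to 1)."""
    h_truth = entropy(truth)
    h_pred = entropy(pred)
    # Homogeneity: each cluster contains members of a single class.
    homogeneity = 1.0 if h_truth == 0 else 1.0 - conditional_entropy(truth, pred) / h_truth
    # Completeness: all members of a class land in the same cluster.
    completeness = 1.0 if h_pred == 0 else 1.0 - conditional_entropy(pred, truth) / h_pred
    if homogeneity + completeness == 0:
        return 0.0
    return (1 + beta) * homogeneity * completeness / (beta * homogeneity + completeness)


# Hypothetical example: four companies, two true groups, two predicted clusters.
truth = ["tech", "tech", "energy", "energy"]
perfect = [1, 1, 0, 0]   # same partition, different cluster ids
merged = [0, 0, 0, 0]    # everything placed in one cluster

score_perfect = v_measure(truth, perfect)
score_merged = v_measure(truth, merged)
```

Cluster ids carry no meaning of their own, so a relabelled but otherwise identical partition still scores 1.0, while collapsing everything into one cluster is fully complete but not homogeneous and scores 0.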