Science Question Answering on ARC Challenging (LLMcritic calls, Reduction %)
[Chart: LLMcritic Calls over time. Plotted point: VecCISC + HAC, 15.65 LLMcritic Calls, May 8, 2026. Available metrics: LLMcritic Calls, Reduction (%).]
Updated 22d ago
Evaluation Results
| Method | Settings | Links | LLMcritic Calls | Reduction (%) |
|---|---|---|---|---|
| VecCISC + HAC | Budget=20, Model=Mistr... | 2026.05 | 15.65 | -21.73 |
| VecCISC + HAC | Budget=20, Model=Llama... | 2026.05 | 14.8 | -26 |
| VecCISC + HAC | Budget=20, Model=Llama... | 2026.05 | 13.31 | -33.43 |
| VecCISC + HAC | Budget=20, Model=Qwen2... | 2026.05 | 10.94 | -45.3 |
| VecCISC + HAC | Budget=20, Model=GPT-4... | 2026.05 | 10.39 | -48.07 |
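
The two metric columns appear to be related through the budget. The page does not say how Reduction (%) is defined, so the sketch below assumes it is the relative change in LLMcritic Calls against the stated Budget=20. The `reduction_percent` helper and the `ROWS` list are illustrative names, and the figures are copied from the rows listed above.

```python
# Assumption (not stated on the page): Reduction (%) is measured
# relative to the budget of 20 critic calls.
BUDGET = 20

# (LLMcritic Calls, reported Reduction %) copied from the results above.
ROWS = [
    (15.65, -21.73),
    (14.8, -26.0),
    (13.31, -33.43),
    (10.94, -45.3),
    (10.39, -48.07),
]


def reduction_percent(calls, budget=BUDGET):
    """Relative change in calls versus the budget, in percent."""
    return (calls - budget) / budget * 100


# Largest gap between the recomputed and the reported reduction.
max_gap = max(abs(reduction_percent(calls) - reported) for calls, reported in ROWS)

for calls, reported in ROWS:
    computed = reduction_percent(calls)
    print(f"{calls:>6} calls: computed {computed:7.2f}%, reported {reported:7.2f}%")
```

Under this assumption every recomputed value lands within a few hundredths of a percentage point of the reported one, which is consistent with the call counts having been rounded to two decimals before display.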