Open-domain Reasoning on ARC-c
Top result: TRAPO, 84.6 Pass@1 (Dec 15, 2025)
Evaluation Results
| Method | Training Paradigm | Date | Pass@1 |
|---|---|---|---|
| TRAPO | Semi... | 2025.12 | 84.6 |
| TRAPO | Semi... | 2025.12 | 83.7 |
| Fully Supervised | Supe... | 2025.12 | 82.3 |
| Fully Supervised | Supe... | 2025.12 | 82.1 |
| TTRL | Unsu... | 2025.12 | 80.5 |
| Sentence-level Entropy | Unsu... | 2025.12 | 79.4 |
| Sentence-level Entropy | Semi... | 2025.12 | 79.4 |
| Fully Supervised | Supe... | 2025.12 | 76.2 |
| Token-level Entropy | Unsu... | 2025.12 | 75.6 |
| Self-certainty | Unsu... | 2025.12 | 72.9 |
| Token-level Entropy | Semi... | 2025.12 | 72.9 |
| TTRL | Semi... | 2025.12 | 72.6 |
| Qwen-Instruct | Orig... | 2025.12 | 70.3 |
| Self-certainty | Semi... | 2025.12 | 64.8 |
| Qwen-Base | Orig... | 2025.12 | 18.2 |