Reasoning with Latent Activations on Rail2Country
[Chart: R2C Mono Accuracy and R2C Meta Accuracy over time. Top result: AR, 93.7 R2C Mono Accuracy, Oct 21, 2025.]
Updated 22d ago
Evaluation Results
Method            Model Backbone   Date      R2C Mono Accuracy   R2C Meta Accuracy
AR                Gemma2 9B        2025.10   93.7                86
DeepSeek-R1-8B    DeepSee...       2025.10   83.7                60
GPT-4o            -                2025.10   82.7                69
AR                Llama3....       2025.10   74.7                62.7
Llama3.1 70B it   Llama3....       2025.10   68.3                33.3
Gemma2 27B it     Gemma2...        2025.10   61                  45
Llama3.1 8B       Llama3....       2025.10   41                  29.7
Gemma2 9B         Gemma2 9B        2025.10   35.3                25.7