Cultural commonsense reasoning on CultureAtlas Mid Resource
[Chart: Precision, Recall, and F1 Score by date (Jan 7, 2026); highlighted value: GPT-4, Precision 94.9]
Updated 4d ago
Evaluation Results
Method    Parameters  Date     Precision  Recall  F1 Score
GPT-4     -           2026.01  94.9       92.1    93.5
CALM      -           2026.01  92.5       90.3    91.2
LLaMA-2   7B          2026.01  83.3       42.9    56.6
Vicuna    7B          2026.01  79.4       57.9    67.0
Vicuna    13B         2026.01  69.4       82.4    75.3
LLaMA-2   13B         2026.01  64.1       75.5    69.3
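The page does not state how the F1 Score values are derived. As a rough sanity check, the sketch below assumes F1 is the plain harmonic mean of the Precision and Recall listed above; that is an assumption, not something the leaderboard confirms. The helper name `f1_score` and the `rows` dictionary are illustrative only. Under this assumption, five of the six rows reproduce the reported F1 to one decimal place, while the CALM row comes out near 91.4 rather than the reported 91.2, which may indicate a different averaging scheme (for example, per-class averaging).

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the leaderboard rows above.
rows = {
    "GPT-4": (94.9, 92.1, 93.5),
    "CALM": (92.5, 90.3, 91.2),
    "LLaMA-2 7B": (83.3, 42.9, 56.6),
    "Vicuna 7B": (79.4, 57.9, 67.0),
    "Vicuna 13B": (69.4, 82.4, 75.3),
    "LLaMA-2 13B": (64.1, 75.5, 69.3),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    # Flag rows where the harmonic-mean assumption misses by more than 0.1.
    note = "" if abs(computed - reported) <= 0.1 else "  <- differs"
    print(f"{name}: computed {computed:.2f}, reported {reported}{note}")
```

The low F1 for LLaMA-2 7B illustrates why the harmonic mean is used: its Precision of 83.3 is offset by a Recall of 42.9, so the combined score sits much closer to the weaker of the two.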