Causal Variable Identification on Arithmetic
[Chart: F1 (X). Top entry: GPT-5, 88.2, May 17, 2025. Metrics: F1 (X), F1 (Z), F1 (M), F1 (Y).]
Updated 1mo ago
Evaluation Results
Method     Setting                    Date     F1 (X)  F1 (Z)  F1 (M)  F1 (Y)
GPT-5      Setting=Decompositiona...  2025.05  88.2    86.5    82.1    85.6
GPT-o4     Setting=Decompositiona...  2025.05  84.9    82.8    82.3    73.4
Llama4-M   Setting=Decompositiona...  2025.05  81.6    79.8    80.3    86.0
Llama4-S   Setting=Decompositiona...  2025.05  80.3    78.6    79.1    85.1
DeepSeek   Setting=Decompositiona...  2025.05  78.3    76.1    65.7    74.9
Gemini2.5  Setting=Decompositiona...  2025.05  76.9    74.5    62.8    73.5
Qwen3      Setting=Decompositiona...  2025.05  75.7    73.8    63.6    71.9