Temporal Reasoning on TimeQA Easy 1.0 (test)

93.7 EM

GPT-4o-mini

[Chart: scores over time, May 23, 2023 to Dec 8, 2025]
Updated 1mo ago

Evaluation Results

Date      EM      Second score (column unlabeled)
2025.12   93.7    96.4
2025.12   91.1    94.5
2025.12   90      94.3
2025.12   89.5    94.2
2025.12   88.8    93.4
2025.12   86.8    92.6
2025.12   86.7    91.9
2025.12   85.1    90.2
2025.12   79.8    87.9
2025.12   78.1    87.5
2025.12   73.4    81.7
2025.12   66.4    69.1
2025.12   63.7    72.3
2025.12   63.1    71.6
2025.12   63      73.8
2023.05   60.5    67.9
2025.12   60.5    67.9
2025.12   51.2    71.6
2023.05   48.2    58.3
2025.12   46.3    54.4
2025.12   45      55.1
2023.05   34.3    41.5
2023.05   24.6    34.2
2025.12   13      13