Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Date

Benchmarks

Task NameDataset NameSOTA ResultTrend
Hybrid ReasoningDate (test)
Accuracy84.5
24
ReasoningDate
Accuracy on Date85.75
24
Symbolic ReasoningDate
Solve Rate90.5
14
Commonsense ReasoningDate (test)
Accuracy73.5
4
Showing 4 of 4 rows