TEMPREASON

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Temporal Reasoning | TempReason-L3 in-domain (test) | EM 0.851 | 20
Temporal Reasoning | TempReason-L2 in-domain (test) | EM 80.8 | 20
Event-Event Temporal Reasoning (L3) | TEMPREASON 1.0 (test) | EM 8,110 | 12
Time-Event Temporal Reasoning (L2) | TEMPREASON 1.0 (test) | EM 8,480 | 12
Time-Time Temporal Reasoning (L1) | TEMPREASON 1.0 (test) | EM 100 | 4