Time-Dialog

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Temporal Reasoning | Time-Dialog (test) | Location Accuracy: 88.9 | 13 |
| Temporal Reasoning | Time-Dialog Out-of-Domain (test) | F1 Score: 40.2 | 6 |