
CoT-Collection

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Reasoning | CoT-Collection Scenario 1 | Accuracy: 70 | 40
Chain-of-Thought Reasoning | CoT-Collection | Composite Score: 73.7 | 20