
CRUX

Benchmarks

Task Name        Dataset Name    SOTA Result        Trend
Code Reasoning   CRUX            Accuracy: 87.37    23
Code             CRUX            Accuracy: 66.4     6