
R-Bench-T

Benchmarks

Task Name        Dataset Name      SOTA Result       Trend
Code Reasoning   R-Bench-T Code    Accuracy 49.91    24
Math Reasoning   R-Bench-T Math    Accuracy 54.48    24