AddSub

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Arithmetic Reasoning | AddSub | Accuracy: 99 | 123 |
| Mathematical Reasoning | ADDSUB | Solve Rate: 93.1 | 25 |
| Arithmetic Reasoning | AddSub (test) | Accuracy: 96.71 | 8 |
| Mathematical Reasoning | AddSub | Reference Accuracy: 87.9 | 4 |
| Online Out-of-Distribution Detection | AddSub Near-shift OOD | Accuracy: 79.16 | 3 |