
RealHitBench

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Structure Comprehending | RealHitBench | Exact Match (EM): 82.71 | 94 |
| Fact Checking | RealHitBench | Exact Match: 70.91 | 94 |
| Numerical Reasoning | RealHitBench | Exact Match (EM): 70.31 | 66 |
| Chart Generation | RealHitBench | ECR: 100 | 60 |
| Data Analysis | RealHitBench | GPT Score: 79.55 | 60 |