
CRAG

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Error Detection | CRAG multi-hop subset (train) | Precision: 92 | 36 |
| Error Detection | CRAG | F1 Score: 91 | 36 |
| Gland Segmentation | CRAG | F1 Score: 87.4 | 19 |
| Multimodal Retrieval-Augmented Generation | CRAG-MM (Overall) | Truthfulness: 20.5 | 18 |
| Question Answering | CRAG | Finance Score: 20.1 | 12 |
| Nuclei instance segmentation | CRAG Dpath (test) | Dice: 0.785 | 8 |
| Gland Segmentation | CRAG (test) | F1 Score: 86.9 | 7 |
| Question Answering | CRAG (test) | P@1: 63.3 | 6 |
| Retrieval-Augmented Generation | CRAG | Finance Accuracy: 16.4 | 5 |