MRCR

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Long Context | MRCR | Score 89.7 | 25 |
| Long Context Understanding | MRCR | Accuracy 75.3 | 15 |
| Multi-needle Retrieval | MRCR | MRCR 2-Needle Success 38 | 9 |
| Multi-round coreference resolution | MRCR | 2-Needle Score 81.6 | 8 |
| Complex retrieval and positional sorting | MRCR 512K~1M | Score 46.88 | 6 |
| Complex retrieval and positional sorting | MRCR 128K~512K | Score 53.98 | 6 |
| Multi-Round Chat Retrieval | MRCR | Accuracy 92.3 | 6 |
| Multi-round Code Repair | MRCR | pass@1 0.77 | 5 |