
BRIGHT

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Information Retrieval | BRIGHT | Mean nDCG@10: 54.8 | 94 |
| Passage Reranking | BRIGHT | NDCG@10 (Avg): 40.3 | 54 |
| Information Retrieval | BRIGHT 1.0 (test) | nDCG@10 (Avg): 37.9 | 35 |
| Long-context retrieval | BRIGHT StackExchange | Biology Score: 62.3 | 29 |
| Downstream retrieval | BRIGHT | Biology nDCG@5: 20 | 24 |
| Information Retrieval | BRIGHT v1 (test) | nDCG@10 (Avg): 49.1 | 22 |
| Reasoning-intensive Retrieval | BRIGHT | BRIGHT Score (Biology): 33.9 | 20 |
| Retrieval | BRIGHT 12 datasets aggregate (test) | NDCG@10: 12.74 | 20 |
| Reasoning-based Retrieval | BRIGHT 1.0 (test) | NDCG@10 (Bio.): 61.77 | 16 |
| First-stage retrieval | BRIGHT (test) | nDCG@10 (Biology): 54.5 | 13 |
| Information Retrieval | BRIGHT static retrieval setting PRO | NDCG@10 (Overall): 33.8 | 13 |
| Retrieval | BRIGHT | nDCG@1 (Econ): 65.8 | 13 |
| Retrieval | BRIGHT v1 (leaderboard) | Average Retrieval Score: 46.8 | 12 |
| Multi-class Classification | BRIGHT 6class (test) | Accuracy: 43.4 | 11 |
| Building Damage Assessment | BRIGHT | F1 (bcd): 91.71 | 10 |
| Information Retrieval | BRIGHT unseen 6 subsets (test) | nDCG@10: 11.79 | 7 |
| Change Detection | BRIGHT DFC25-T2 | mIoU: 43.75 | 6 |
| Image Classification | BRIGHT (test) | Accuracy: 0.6458 | 3 |