
CORe

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Downstream Performance Evaluation | CORE | CORE Score | 19.94 | 53 |
| Domain-Incremental Learning | CORe50 | Avg Accuracy (A) | 99.5 | 49 |
| Few-Shot Class-Incremental Learning | CORe50 | BCR | 99.9 | 39 |
| Cross-modal geo-localization | CORE Intercontinental-level Subset4 1.0 | R@1 | 51.35 | 15 |
| Cross-modal geo-localization | CORE Intercontinental-level Subset3 1.0 | R@1 | 49.81 | 15 |
| Cross-modal geo-localization | CORE Intercontinental-level Subset2 1.0 | R@1 | 64.74 | 15 |
| Cross-modal geo-localization | CORE Intercontinental-level Subset1 1.0 | R@1 | 57.9 | 15 |
| Cross-modal geo-localization | CORE World-level 1.0 (All) | R@1 | 55.84 | 15 |
| Reasoning | CORE-Ext | Accuracy | 15.9 | 10 |
| Reasoning | CORE | Accuracy | 26.83 | 10 |
| Language Modeling | Core-Extended | Score | 17.08 | 8 |
| Language Modeling | Core | Score | 28.44 | 8 |
| Relation Classification | CORE | F1-Mic | 80 | 8 |
| General Language Understanding | CORE | CORE Score | 26.32 | 4 |
| Comprehensive Optimization and Reasoning Evaluation | CORE | CORE Score | 25.14 | 4 |
| Control-Dependency / Trace extraction | CoRe Lite Control-Dependency Trace subtask n=489 | F1 Score | 94.58 | 3 |
| Latent centroid displacement prediction | Core Level 5 approximation | Observed Displacement | 6.39 | 3 |