
Mix

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Multi-hop Question Answering | Mix | F1 Score: 79.69 | 14
Retrieval-Augmented Generation | Mix | Comprehensiveness: 95.9 | 12
Explanatory QA | Mix (test) | EM: 76.5 | 10
Drivable Area Segmentation | MIX Gazebo+GMRPD (test) | Mean IoU: 98.97 | 8
Robustness Prediction | MIX (Dynamic) | Mean Error: 0.0006 | 8
Robustness Prediction | MIX (Static) | Mean Error: 0.0047 | 8
Federated Graph Classification | Mix across-domain setting | Communication Rounds: 3 | 8
Retrieval | Mix | Recall@3: 0.66 | 7
Visual Question-Answering | Mix dataset | Accuracy (Mix): 64.93 | 3