FDA

Benchmarks

Task Name                                  Dataset Name     SOTA Result                  Trend
Long-context language modeling evaluation  FDA (test)       Score 0.8004                 120
Information Extraction                     FDA              Accuracy 84.5                22
Dynamic Multi-objective Optimization       FDA 2            Maximum Hypervolume (MHV) 2  15
In-context retrieval                       FDA              Accuracy 74.5                13
Knowledge-style Retrieval                  FDA 2048 tokens  Accuracy 62                  8
Information Extraction and Retrieval       FDA              Accuracy 2.72                5