
mixed

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Contact Estimation | Mixed (test) | Accuracy 95.3 | 50 |
| Tool Retrieval | Mixed | NDCG@10 0.59 | 44 |
| General Problem Solving | Mixed (AIME, GPQA, HLE, HotpotQA, ALFWorld) | Average Score 57.74 | 24 |
| Grammatical Understanding | MIXED (test) | Task 1 Accuracy 95 | 13 |
| Machine Translation | MIXED #3000 English to Luxembourgish (test) | CometScore 0.26 | 13 |
| Named Entity Recognition | Mixed | F1 Score 45.55 | 12 |
| Spoofing Method Identification | Mixed In Domain | Accuracy 96.44 | 11 |
| Authenticity Classification | Mixed In Domain | Accuracy 96.16 | 11 |
| Speech Anti-Spoofing | Mixed In-Domain | EER 0.0306 | 11 |
| Spoofing Region Localization | Mixed In Domain | Seg-F1 91.33 | 9 |
| Frame-level Deepfake Detection | Mixed dataset | Accuracy 98.3 | 6 |
| 4D Mesh Compression | Mixed | Time (ms) 6.89 | 5 |
| Text Classification | Mixed BERT-base (test) | Accuracy 83.3 | 5 |
| Reflection symmetry detection | mixed SDRW LDRS NYU (test) | F1 Score 71.4 | 2 |
| Handwritten Character Recognition | Mixed (train) | Accuracy 95.82 | 1 |
| Handwritten Character Recognition | Mixed (val) | Accuracy 95.2 | 1 |
| Handwritten Character Recognition | Mixed (test) | Accuracy 95.2 | 1 |