
ED

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Adversarial Attack | ED | GL: 38.82 | 18 |
| Multi-turn role-play | ED | Success Rate (SR): 92.1 | 12 |
| Entity Disambiguation | 6 standard ED | Avg ED F1: 90 | 6 |
| Empathetic Dialogue | ED | Success Rate (SR): 92.1 | 5 |
| Stance Detection | ED Erdoğan Turkish election data (test) | Precision (PRO): 99 | 4 |
| Procedure Coding | ED CPT | Recall: 48.8 | 2 |