MCQ

Benchmarks

Task Name               Dataset Name   SOTA Result             Trend
Distractor Generation   MCQ (test)     P@1 22.39               17
Detection               MCQ            Detection Score 71.6    5
Prevention              MCQ            gpt-5.1 Score 99.3      5
Distractor Generation   MCQ dataset    Relevance 4.45          5