
WQ

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Question Answering | WQ (test) | AUROC 76.6 | 90
Question Answering | WQ | Absolute Execution Time Overhead (s) 0.039 | 90
Question Answering | WQ | PRR 62.8 | 90
Open-Domain Question Answering | WQ (test) | EM 33.71 | 37
Reward Modeling | WQ Arena | Accuracy 65.29 | 22
Model Compression | wq | Performance 45.5 | 13
Model Compression | wq | Accuracy / R2 Score 0.45 | 13
Regression | wq | R^2 0.5 | 13
RF compression | wq | Performance Score 49.6 | 13
Inference Efficiency | WQ | Relative Execution Time Overhead 0.014 | 12
Open-domain retrieval | WQ | Recall@20 73.2 | 9
Question Answering | WQ | Accuracy 45.5 | 8