
OPeRA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Next-action prediction | OPeRA (test) | Action Generation Acc: 52.92 | 31 |
| Lung function regression | OPERA (test) | FVC MAE (Breath): 0.848 | 7 |
| Health condition inference | OPERA Obstructive (Lung) | AUROC: 75.2 | 7 |
| Health condition inference | OPERA Smoker Cough | AUROC: 0.83 | 7 |
| Reasoning and Persona Consistency | OPeRA (test) | Pages per Session: 5.3 | 7 |
| Human-likeness evaluation | OPeRA | Human-likeness Score: 4.21 | 5 |
| Autonomous LLM Agent Verification | OPERA | Mean Td (s): 11.9 | 3 |