| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Next-action prediction | OPeRA (test) | Action Generation Acc: 52.92 | 18 |
| Lung function regression | OPERA (test) | FVC MAE (Breath): 0.848 | 7 |
| Health condition inference | OPERA Obstructive (Lung) | AUROC: 75.2 | 7 |
| Health condition inference | OPERA Smoker Cough | AUROC: 0.83 | 7 |
| Reasoning and Persona Consistency | OPeRA (test) | Pages per Session: 5.3 | 7 |
| Autonomous LLM Agent Verification | OPERA | Mean Td (s): 11.9 | 3 |
| Human-likeness evaluation | OPeRA | Metric: - | 0 |