
Role-Playing on RPGBench Dialogue Shift (Generalization)

Top entry: RFT, Turn Composition -0.956 (Dec 19, 2025)

Evaluation Results

Date      Turn Composition   (unlabeled)
2025.12   -0.956             -0.583
2025.12   -0.565             -0.387
2025.12   -0.415             -0.014
2025.12   -0.362             -0.018
2025.12   -0.292             -0.111
2025.12   -0.241             -0.127
2025.12   -0.218             -0.059
2025.12   -0.175             -0.007
2025.12   -0.145              0.078
2025.12   -0.105              0.116
2025.12   -0.101              0.164
2025.12   -0.098              0.137
2025.12   -0.081              0.132
2025.12   -0.08               0.146
2025.12   -0.066              0.153
2025.12   -0.057              0.149
2025.12   -0.049              0.143
2025.12   -0.044              0.181