
Role-Playing on RPGBench User Shift Generalization

-0.016 RP Score (German)

CoRL

Dec 19, 2025
Updated 4d ago

Evaluation Results

Method  Links
2025.12  -0.016  -0.033  -0.113  -0.036  -0.002
2025.12  -0.017  -0.077  -0.315  -0.162  -0.12
2025.12  -0.019  -0.058  -0.133  -0.083  -0.047
2025.12  -0.036  -0.055  -0.104  -0.076  -0.045
2025.12  -0.04   -0.118  -0.345  -0.201  -0.144
2025.12  -0.047  -0.051  -0.118  -0.096  -0.07
2025.12  -0.048  -0.115  -0.364  -0.2    -0.159
2025.12  -0.049  -0.108  -0.335  -0.198  -0.144
2025.12  -0.065  -0.092  -0.373  -0.206  -0.145
2025.12  -0.085  -0.116  -0.32   -0.087  -0.066
2025.12  -0.113  -0.219  -0.442  -0.243  -0.186
2025.12  -0.125  -0.218  -0.387  -0.159  -0.091
2025.12  -0.13   -0.204  -0.492  -0.296  -0.259
2025.12  -0.206  -0.185  -0.269  -0.156  -0.292
2025.12  -0.221  -0.245  -0.43   -0.295  -0.225
2025.12  -0.312  -0.371  -0.678  -0.421  -0.273
2025.12  -0.371  -0.463  -0.881  -0.489  -0.33
2025.12  -0.491  -0.671  -0.853  -0.628  -0.427