Sequential Decision Making on Tag

Average Reward: -2.3

Perfect Agent

[Chart: Average Reward over time; y-axis ticks -14.156, -11.078, -8, -4.922; x-axis label Nov 15, 2025]
Updated 8d ago

Evaluation Results

Date       Average Reward
2025.11    -2.3
2025.11    -2.3
2025.11    -2.3
2025.11    -2.8
2025.11    -2.8
2025.11    -2.8
2025.11    -3.1
2025.11    -3.3
2025.11    -3.9
2025.11    -4.9
2025.11    -5
2025.11    -5
2025.11    -5
2025.11    -5.1
2025.11    -5.7
2025.11    -7.7
2025.11    -8
2025.11    -8
2025.11    -8
2025.11    -8
2025.11    -8.3
2025.11    -8.9
2025.11    -9.7
2025.11    -10.4
2025.11    -10.7
2025.11    -10.7
2025.11    -10.7
2025.11    -12.7
2025.11    -13.2
2025.11    -13.7