Reinforcement Learning on Ant v5

6,633.8 Average Return

QVPO+DBC(*)

[Chart: Average Return, Dec 4, 2025 to Feb 10, 2026]
Updated 4d ago

Evaluation Results

Date     Average Return
2026.02  6,633.8
2026.02  6,501.4
2026.02  6,373.2
2026.02  6,342.6
2026.02  6,121.8
2026.02  5,306
2025.12  4,477.33
2026.02  4,257
2025.12  4,067.61
2026.02  4,000
2026.02  4,000
2026.02  3,850.4
2025.12  3,761.98
2026.02  3,750
2026.02  3,662
2026.02  3,536
2026.02  3,487.4
2026.02  3,474
2026.02  3,389
2026.02  3,190
2026.02  3,169
2026.02  3,093
2026.02  3,084
2026.02  3,082
2026.02  2,963
2026.02  2,830
2026.02  2,818
2026.02  2,792
2026.02  2,781
2026.02  2,663
2026.02  2,650.3
2025.12  2,619.72
2026.02  2,013.2
2026.02  1,225
2026.02  1,221
2026.02  994
2025.12  960.36
2025.12  959.65
2025.12  958.54
2025.12  957.37
2026.02  957
2025.12  954.13
2025.12  953.84
2025.12  948.86
2025.12  934.15
2025.12  155.64
2025.12  32.47
2025.12  17.34
2025.12  -11.15