Text-based agent interaction on TextWorld Quest (test)
[Chart: Accuracy and Steps Taken. Top entry: Agent-BRACE, Accuracy 88 (May 12, 2026).]
Updated 21d ago
Evaluation Results
| Method | Backbone | Date | Accuracy | Steps Taken |
|---|---|---|---|---|
| Agent-BRACE | Qwen3-4B-Inst... | 2026.05 | 88 | 30.5 |
| PABU | Qwen3-4B-Inst... | 2026.05 | 82.2 | 29.1 |
| Agent-BRACE | Qwen2.5-3B-In... | 2026.05 | 78.5 | 37.3 |
| ReAct (RL) | Qwen3-4B-Inst... | 2026.05 | 75.5 | 18.2 |
| Direct-Action (RL) | Qwen3-4B-Inst... | 2026.05 | 74 | 29.6 |
| PABU | Qwen2.5-3B-In... | 2026.05 | 73 | 37 |
| Base Model | Qwen3-4B-Inst... | 2026.05 | 61.5 | 32.3 |
| MEM1 | Qwen3-4B-Inst... | 2026.05 | 61.5 | 50.2 |
| ReAct | Qwen3-4B-Inst... | 2026.05 | 60.5 | 12.6 |
| Direct-Action (RL) | Qwen2.5-3B-In... | 2026.05 | 56 | 35.8 |
| ReAct (RL) | Qwen2.5-3B-In... | 2026.05 | 46.5 | 34.2 |
| MEM1 | Qwen2.5-3B-In... | 2026.05 | 29.5 | 62.9 |
| ReAct | Qwen2.5-3B-In... | 2026.05 | 23 | 37.6 |
| Base Model | Qwen2.5-3B-In... | 2026.05 | 4 | 96.1 |