Tool-agent-user interaction on TAC (TheAgentCompany)
[Chart: Pass@3 over time. Top result: 29, LLM Agent (3-model subset, forced-injection), May 8, 2026]
Updated 23d ago
Evaluation Results
Method                                        Setting                    Date     Pass@3
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  29
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  25
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  23
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  21
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  21
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  21
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  20
LLM Agent (3-model subset, forced-injection)  Dimension=Input, Injec...  2026.05  19
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  12
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  12
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  11
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  10
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  8
LLM Agent (3-model subset, forced-injection)  Dimension=Goal, Inject...  2026.05  8
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0
LLM Agent (3-model subset, forced-injection)  Dimension=Constraint,...   2026.05  0