Tool Selection Quality on Representative guardrail dataset

95 F1 Score

ChainPoll

Updated 5mo ago

Evaluation Results

Method	Links	F1 Score
ChainPoll 2026.02		95
Luna-2 2026.02		94
Single token 2026.02		57