Prefix-risk ranking on tau^2-Bench (held-out)
[Chart: AUPRC over time. Best result: PrefixGuard (Transformer), 71 AUPRC (May 7, 2026).]
Updated 26d ago
Evaluation Results
| Method | Configuration (truncated in source) | Date | AUPRC |
|---|---|---|---|
| PrefixGuard (Transformer) | Input view=StepView, H... | 2026.05 | 71 |
| PrefixGuard (GRU) | Input view=StepView, H... | 2026.05 | 69.6 |
| PrefixGuard (FSM) | Input view=StepView, H... | 2026.05 | 61.4 |
| Transformer (Raw-text control) | Input view=Raw text, H... | 2026.05 | 59.7 |
| GRU (Raw-text control) | Input view=Raw text, H... | 2026.05 | 55.4 |
| FSM (Raw-text control) | Input view=Raw text, H... | 2026.05 | 46.6 |
| DeepSeek-V4-Pro | Input view=Prompt, Hea... | 2026.05 | 39.6 |
| PrefixGuard (DFA) | Input view=StepView, H... | 2026.05 | 31.6 |
| GPT-5.4-mini | Input view=Prompt, Hea... | 2026.05 | 30.2 |
| PPM LSTM | Input view=StepView ac... | 2026.05 | 23.1 |
| DFA (Raw-text control) | Input view=Raw text, H... | 2026.05 | 22.2 |