Actionable Suggestion Extraction on Manual evaluation set 1.0 (test)
[Chart: BERTScore. Highlighted result: Hybrid pipeline, 92 (Jan 27, 2026). Available metrics: BERTScore, BLEURT, Exact F1, Fuzzy F1.]
Updated 1mo ago
Evaluation Results
Method           Method details             Links    BERTScore  BLEURT  Exact F1  Fuzzy F1
Hybrid pipeline  Model Type=Supervised-...  2026.01  92         89      32        68
Prompt-only LLM  Backbone=Gemma-3           2026.01  87         84      56        70
T5-base (span)   Backbone=T5-base, Mode...  2026.01  78         76      72        73
Rule-based       Model Type=Rule-based...   2026.01  46         44      42        45