
Summ

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Prompt Injection Detection | Summ | Detection Rate (TPR/FPR): 100 | 8 |
| Multimodal Summarization | Summ | BLEU-4: 5.69 | 5 |
| Summarization | Summ Qwen2-7B-Instruct v1 (test) | Acceptance Length (τ): 1.5 | 4 |
| Prompt Localization | Summ | RL Score: 0.962 | 3 |
| Summarization | Summ. | Throughput (tokens/s): 182.7 | 3 |