Scenario-based Filter Generation Benchmark

18.98 ROUGE-1

llama 3.2 3B

Updated 1mo ago

Evaluation Results

Method	Links	ROUGE-1
llama 3.2 3B 2025.11		18.98	7.6	13.98	88.38	61.03	3.49
phi-4-mini 3.8B 2025.11		18.67	7.87	13.54	88.97	66.63	4.07
gpt-4o-mini 2025.11		17.87	7.46	12.73	89.5	73.53	3.28
Gemma3 4B 2025.11		10.19	3.67	7.15	87.63	61.53	1.33