Irrelevant Classification on JD e-commerce search dataset
[Chart: Precision, Recall, and F1 Score by method. Highlighted point: LLM-Base, Precision 84.58, May 29, 2026.]
Evaluation Results
| Method | Variant | Date | Precision | Recall | F1 Score |
|---|---|---|---|---|---|
| LLM-Base | | 2026.05 | 84.58 | 29.7 | 43.97 |
| Graph-GRPO | learnable_coefficients... | 2026.05 | 80.39 | 89.57 | 84.73 |
| Graph-GRPO | | 2026.05 | 80.11 | 89.95 | 84.75 |
| LLM-GRPO | base_model=LLM-SFT | 2026.05 | 79.76 | 89.9 | 84.53 |
| LLM-SFT | | 2026.05 | 77.95 | 90.94 | 83.94 |
| LLM-SFT | random_masking=false | 2026.05 | 76.61 | 90.71 | 83.06 |
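
The page does not state how the F1 Score column is computed. The reported values are consistent with the standard definition, the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below assumes that definition and recomputes F1 for each row; the function name `f1_score` and the `rows` list are illustrative, not part of the benchmark.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (inputs and output in percent)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (method, variant, precision, recall, reported F1) copied from the results above.
rows = [
    ("LLM-Base", "", 84.58, 29.7, 43.97),
    ("Graph-GRPO", "learnable_coefficients...", 80.39, 89.57, 84.73),
    ("Graph-GRPO", "", 80.11, 89.95, 84.75),
    ("LLM-GRPO", "base_model=LLM-SFT", 79.76, 89.9, 84.53),
    ("LLM-SFT", "", 77.95, 90.94, 83.94),
    ("LLM-SFT", "random_masking=false", 76.61, 90.71, 83.06),
]

for method, variant, p, r, reported in rows:
    recomputed = f1_score(p, r)
    # Small gaps are expected because the displayed P and R are already rounded.
    print(f"{method:<11} {variant:<26} reported={reported:.2f} recomputed={recomputed:.2f}")
```

The recomputed values agree with the reported ones to within about 0.01, which is what rounding of the displayed precision and recall would produce. The LLM-Base row shows why F1 is the more informative column here: it has the highest precision (84.58) but a recall of only 29.7, so its F1 is far below every other method.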