Three-Level Classification on JD e-commerce search dataset
Top result: Graph-GRPO, 81.26 Macro F1 (May 29, 2026)
Metrics: Macro F1, Weighted F1, Accuracy
Updated 2d ago
Evaluation Results
Method      Variant                    Date     Macro F1  Weighted F1  Accuracy
Graph-GRPO  -                          2026.05  81.26     83.84        84.44
Graph-GRPO  learnable_coefficients...  2026.05  81.22     83.8         84.41
LLM-GRPO    base_model=LLM-SFT         2026.05  80.91     83.57        84.21
LLM-SFT     -                          2026.05  80.33     83.13        83.84
LLM-SFT     random_masking=false       2026.05  79.55     82.47        83.19
LLM-Base    -                          2026.05  56.39     61.34        61.54
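
For readers comparing the three metrics reported above, the following is a minimal sketch of how Macro F1, Weighted F1, and Accuracy differ on a three-class problem. It is not the benchmark's evaluation code: the label names (`relevant`, `partial`, `irrelevant`), the toy predictions, and all function names are assumptions made for illustration only, and the definitions are the standard ones (per-class F1 averaged equally for Macro F1, averaged by class support for Weighted F1).

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: (f1, support)} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted label gains a false positive
            fn[t] += 1  # true label gains a false negative
    support = Counter(y_true)
    scores = {}
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        f1 = 2 * tp[label] / denom if denom else 0.0
        scores[label] = (f1, support[label])
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(f1 for f1, _ in scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by the number of true examples per class."""
    scores = per_class_f1(y_true, y_pred)
    total = sum(s for _, s in scores.values())
    return sum(f1 * s for f1, s in scores.values()) / total


def accuracy(y_true, y_pred):
    """Fraction of examples whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical, imbalanced toy data (not taken from the JD dataset).
y_true = ["relevant"] * 6 + ["partial"] * 3 + ["irrelevant"] * 1
y_pred = (
    ["relevant"] * 5 + ["partial"] * 1   # one relevant item mislabeled
    + ["partial"] * 2 + ["relevant"] * 1  # one partial item mislabeled
    + ["partial"] * 1                     # the rare class is missed entirely
)

print(f"Macro F1:    {macro_f1(y_true, y_pred):.4f}")
print(f"Weighted F1: {weighted_f1(y_true, y_pred):.4f}")
print(f"Accuracy:    {accuracy(y_true, y_pred):.4f}")
```

On this toy data the rare class is missed entirely, which pulls Macro F1 well below Weighted F1 and Accuracy. Every row above shows the same ordering (Macro F1 lowest, Accuracy highest), which would be consistent with weaker performance on less frequent classes, though the page itself does not state the class distribution.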