LLM Alignment on Taobao Live proprietary fine-grained preference dataset
[Figure: Win Score chart. Top entry: PD (ours), Win Score 1.53, Aug 11, 2025.]
Evaluation Results
| Method | Budget (λ) | Date | Win Score |
|---|---|---|---|
| PD (ours) | 50% | 2025.08 | 1.53 |
| PD (ours) | 40% | 2025.08 | 1.45 |
| PD (ours) | 30% | 2025.08 | 1.43 |
| RAND | 40% | 2025.08 | 1.37 |
| PD (mid) | 50% | 2025.08 | 1.37 |
| ALL | 100% | 2025.08 | 1.33 |
| PD (mid) | 30% | 2025.08 | 1.33 |
| RAND | 50% | 2025.08 | 1.32 |
| PD (mid) | 40% | 2025.08 | 1.32 |
| RAND | 30% | 2025.08 | 1.18 |
| PD (high) | 50% | 2025.08 | 0.93 |
| PD (high) | 40% | 2025.08 | 0.57 |
| PD (high) | 30% | 2025.08 | 0.48 |
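
A minimal Python sketch for reading the results above: it copies the listed rows, picks each method's best-scoring budget, and computes each entry's Win Score difference from the ALL (λ=100%) row. The function names `best_per_method` and `gain_over_baseline` are hypothetical, and treating ALL as the reference point is an assumption made for illustration, not something the page states.

```python
# Rows copied from the results listed above: (method, budget in %, win score).
RESULTS = [
    ("PD (ours)", 50, 1.53),
    ("PD (ours)", 40, 1.45),
    ("PD (ours)", 30, 1.43),
    ("RAND", 40, 1.37),
    ("PD (mid)", 50, 1.37),
    ("ALL", 100, 1.33),
    ("PD (mid)", 30, 1.33),
    ("RAND", 50, 1.32),
    ("PD (mid)", 40, 1.32),
    ("RAND", 30, 1.18),
    ("PD (high)", 50, 0.93),
    ("PD (high)", 40, 0.57),
    ("PD (high)", 30, 0.48),
]


def best_per_method(rows):
    """Return {method: (budget, score)} for each method's highest Win Score."""
    best = {}
    for method, budget, score in rows:
        if method not in best or score > best[method][1]:
            best[method] = (budget, score)
    return best


def gain_over_baseline(rows, baseline="ALL"):
    """Return {(method, budget): score - baseline score}, rounded to 2 places.

    Assumption: the baseline method appears exactly once in `rows`.
    """
    base = next(score for method, _, score in rows if method == baseline)
    return {
        (method, budget): round(score - base, 2)
        for method, budget, score in rows
        if method != baseline
    }


if __name__ == "__main__":
    for method, (budget, score) in best_per_method(RESULTS).items():
        print(f"{method}: best Win Score {score} at budget {budget}%")
    for (method, budget), gain in gain_over_baseline(RESULTS).items():
        print(f"{method} @ {budget}%: {gain:+.2f} vs ALL")
```

Keeping the rows as plain tuples avoids any dependency beyond the standard library; the differences are rounded to two decimals because the listed scores carry only two.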