Human Preference Evaluation on User Study (Group 2)
[Chart: Category CT Score. Highlighted point: UAV-GPT, 24 (Dec 9, 2025).]
Updated 1mo ago
Evaluation Results
| Method | Date | Category CT Score | Category CI Score | Category ST Score | Category SI Score | Survey Q3 Response Count | Survey Q4 Response Count | Survey Q5 Response Count |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| UAV-GPT | 2025.12 | 24 | 23 | 27 | 21 | 120 | 130 | 140 |
| CaP | 2025.12 | 16 | 23 | 10 | 16 | 20 | 40 | 30 |
| PromptCraft | 2025.12 | 7 | 5 | 12 | 16 | 60 | 30 | 30 |