Semantic Representation Evaluation on UpWork Narrative Situations Tech. Development (test)
[Chart: Preference Rate. Operator: 96 (Nov 10, 2025). Updated 1mo ago.]
Evaluation Results

Method     Comparison                   Date      Preference Rate
Operator   Compared Baseline=GLEN       2025.11   96
Operator   Compared Baseline=BERT...    2025.11   96
Operator   Compared Baseline=FST        2025.11   86
FST        Compared Against=Operator    2025.11   14
GLEN       Compared Against=Operator    2025.11   4
BERT-SRL   Compared Against=Operator    2025.11   4