Metaphor on VUA
[Chart: AUROC. Highlighted result: Full Rep., 98.8, Apr 20, 2026]
Updated 1mo ago
Evaluation Results

Method                              Details                     Date      AUROC
Full Rep.                           Model=Gemma2-9B, Repre...   2026.04   98.8
Full Rep.                           Model=GPT-OSS-20B, Rep...   2026.04   98.8
Full Representation Classifier      Backbone=Llama-3.1-8B,...   2026.04   97.6
Full Rep.                           Model=Qwen3-8B, Repres...   2026.04   97
Subspace                            Model=Gemma2-9B, Repre...   2026.04   96.7
one-directional concreteness axis   Backbone=Llama-3.1-8B,...   2026.04   95.7
Subspace                            Model=GPT-OSS-20B, Rep...   2026.04   94.1
Subspace                            Model=Qwen3-8B, Repres...   2026.04   93.1