Input-bound consistency on Subject and Property Perturbations dataset
[Chart: Subject Agreement (%) and Property Agreement (%) by date. Labeled point: ENtity-Aware Finetuning, Subject Agreement 38, Jan 22, 2025.]
Evaluation Results
| Method | Configuration | Date | Subject Agreement (%) | Property Agreement (%) |
|---|---|---|---|---|
| ENtity-Aware Finetuning | Model=GPT-2, Represent... | 2025.01 | 38 | 22 |
| ENtity-Aware Finetuning | Model=OLMo 1B, Represe... | 2025.01 | 33 | 9 |
| ENtity-Aware Finetuning | Model=GPT-2, Represent... | 2025.01 | 31 | 28 |
| ENtity-Aware Finetuning | Model=GPT-2, Represent... | 2025.01 | 29 | 33 |
| Pre-Trained Model | Model=GPT-2 | 2025.01 | 28 | 24 |
| ENtity-Aware Finetuning | Model=OLMo 1B, Represe... | 2025.01 | 22 | 0 |
| ENtity-Aware Finetuning | Model=GPT-2, Represent... | 2025.01 | 18 | 26 |
| Pre-Trained Model | Model=OLMo 1B | 2025.01 | 13 | 44 |
| Vanilla Finetuning | Model=GPT-2 | 2025.01 | 12 | 30 |
| Vanilla Finetuning | Model=OLMo 1B | 2025.01 | 11 | 33 |
| ENtity-Aware Finetuning | Model=OLMo 1B, Represe... | 2025.01 | 8 | 6 |
| ENtity-Aware Finetuning | Model=OLMo 1B, Represe... | 2025.01 | 6 | 35 |
| ENtity-Aware Finetuning | Model=OLMo 1B, Represe... | 2025.01 | 6 | 43 |
| ENtity-Aware Finetuning | Model=GPT-2, Represent... | 2025.01 | 0 | 50 |