Inference correction review (discard) on SocialIQA
[Chart: MHA by date. Top result: GPT, 100 MHA. Date shown: Apr 18, 2026.]
Updated 1mo ago
Evaluation Results
Method    Settings                 Date      MHA
GPT       pipeline=IIE pipeline    2026.04   100
Mistral   pipeline=IIE pipeline    2026.04   96.7
GPT       pipeline=IIE pipeline    2026.04   95.2
Mistral   pipeline=IIE pipeline    2026.04   83.3
Mistral   pipeline=IIE pipeline    2026.04   75
GPT       pipeline=IIE pipeline    2026.04   50