
HerO at AVeriTeC: The Herd of Open Large Language Models for Verifying Real-World Claims

About

To tackle the AVeriTeC shared task hosted by the FEVER-24 workshop, we introduce a system that employs only publicly available large language models (LLMs) for each step of automated fact-checking, dubbed the Herd of Open LLMs for verifying real-world claims (HerO). For evidence retrieval, a language model enhances the query by generating hypothetical fact-checking documents. For question generation and veracity prediction, we prompt pretrained and fine-tuned LLMs with prompts that include retrieved in-context samples. HerO achieved 2nd place on the leaderboard with an AVeriTeC score of 0.57, suggesting the potential of open LLMs for verifying real-world claims. For future research, we make our code publicly available at https://github.com/ssu-humane/HerO.
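The query-enhancement step described above can be illustrated with a minimal sketch. It is not the authors' implementation (that lives in the linked repository). The function names, the bag-of-words cosine scorer, and `stub_generator` (a stand-in for an LLM call) are all assumptions chosen to keep the example self-contained.

```python
import math
import re
from collections import Counter
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    # Lowercase and keep alphanumeric runs only.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two term-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_query(claim: str, generate: Callable[[str], List[str]]) -> str:
    # Append hypothetical fact-checking documents to the original claim.
    return " ".join([claim, *generate(claim)])


def retrieve(query: str, corpus: List[str], k: int = 2) -> List[str]:
    # Rank corpus documents by similarity to the (expanded) query.
    q = Counter(tokenize(query))
    ranked = sorted(
        corpus,
        key=lambda doc: cosine(q, Counter(tokenize(doc))),
        reverse=True,
    )
    return ranked[:k]


def stub_generator(claim: str) -> List[str]:
    # Hypothetical placeholder: a real system would prompt an LLM here.
    return ["Official unemployment statistics published by the labor bureau show the rate."]
```

A call such as `retrieve(expand_query(claim, stub_generator), corpus)` then searches with the claim plus the generated text, so documents that share vocabulary with a plausible fact-checking article can rank higher than they would for the bare claim.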

Yejun Yoon, Jaeyoon Jung, Seunghyun Yoon, Kunwoo Park • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Claim Verification | AVeriTeC Retrieved (H) (dev) | Accuracy: 70.2 | 28 |
| Claim Verification | AVeriTeC Retrieved (I) (dev) | Accuracy: 67.8 | 28 |
| Claim Verification | AVeriTeC Golden (dev) | Accuracy: 80.4 | 28 |
| Fact Checking | FEVER | Balanced Accuracy: 67.5 | 12 |
| Fact Checking | AVeriTeC (test) | Hu-METEOR (Q only): 0.48 | 9 |
| Justification Quality Evaluation | AVeriTeC Retrieved (H) 50 correctly verified claims | MOS: 2.6 | 6 |
