
BERT for Evidence Retrieval and Claim Verification

About

Motivated by the promising performance of pre-trained language models, we investigate BERT in an evidence retrieval and claim verification pipeline for the FEVER fact extraction and verification challenge. To this end, we propose to use two BERT models: one for retrieving potential evidence sentences supporting or refuting claims, and another for verifying claims based on the predicted evidence sets. To train the BERT retrieval system, we use pointwise and pairwise loss functions, and examine the effect of hard negative mining. A second BERT model is trained to classify the samples as supported, refuted, or not enough information. Our system achieves a new state-of-the-art recall of 87.1 for retrieving the top five sentences from the FEVER documents consisting of 50K Wikipedia pages, and ranks second on the official leaderboard with a FEVER score of 69.7.
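The pointwise and pairwise training objectives mentioned above can be illustrated with a minimal sketch. This is not the authors' code: the function names are illustrative, and the scores stand in for the relevance logits a BERT retrieval model would produce for (claim, sentence) pairs.

```python
import math

def pointwise_loss(scores, labels):
    """Pointwise objective: binary cross-entropy on each (claim, sentence)
    pair, treating evidence sentences as positives (label 1) and all other
    sentences as negatives (label 0)."""
    total = 0.0
    for s, y in zip(scores, labels):
        p = 1.0 / (1.0 + math.exp(-s))  # sigmoid over the model's score
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(scores)

def pairwise_hinge_loss(pos_scores, neg_scores, margin=1.0):
    """Pairwise objective: every evidence sentence should outscore every
    non-evidence sentence for the same claim by at least `margin`."""
    losses = [max(0.0, margin - p + n) for p in pos_scores for n in neg_scores]
    return sum(losses) / len(losses)

def hard_negatives(neg_scores, k=2):
    """Hard negative mining: keep only the k highest-scoring negatives,
    i.e. the non-evidence sentences the current model finds most confusing."""
    return sorted(neg_scores, reverse=True)[:k]
```

For example, with a positive scored 2.0 and a negative scored 0.5, the pairwise hinge loss is zero because the margin of 1.0 is already satisfied; shrinking that gap makes the loss positive, pushing the model to separate evidence from non-evidence sentences.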

Amir Soleimani, Christof Monz, Marcel Worring · 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Fact Verification | FEVER 1.0 (dev) | Label Accuracy | 74.59 | 23 |
| Fact Checking | FEVEROUS (test) | Macro F1 | 51.67 | 20 |
| Fact Checking | HOVER 2-hop (test) | Macro F1 | 50.68 | 16 |
| Fact Checking | HOVER 3-hop (test) | Macro F1 | 49.86 | 16 |
| Fact Checking | HOVER 4-hop (test) | Macro F1 | 48.57 | 16 |
| Fact Verification | FEVER 1.0 (test) | Label Accuracy | 71.86 | 14 |
| Fact Checking | HOVER | Macro F1 (2-hop) | 50.68 | 12 |
| Fact Checking | FEVEROUS-S | Macro F1 | 51.67 | 12 |
