
Question Answering on 2Wiki (In-Distribution)

Top accuracy: 77 (GPT-4.1)

Updated 3mo ago

Evaluation Results

Method	Date	Accuracy	(unlabeled)	(unlabeled)
GPT-4.1	2025.07	77	81	18
Deliberative Searcher-DeepSeek-70B	2025.07	65	69	5
Deliberative Searcher-72B	2025.07	64	73	4
Claude Sonnet 4	2025.07	61	89	8
Deliberative Searcher-7B	2025.07	55	59	3
Deliberative Searcher-7B	2025.07	52	54	6
GPT-4o	2025.07	51	86	10
R1-Searcher-7B	2025.07	48	59	40
DeepSeek-R1-Distill-70B	2025.07	46	55	44
InternVL3-78B	2025.07	43	62	36
Qwen2.5-VL-72B	2025.07	41	73	23
Search-R1-7B	2025.07	36	51	43
Qwen2.5-VL-7B	2025.07	33	51	48
ReSearch-7B	2025.07	33	35	65