
Factual Knowledge Evaluation on DyKnow
130 time-sensitive facts, Wikidata-derived

Correctness: 80 (GPT-4), Jan 22, 2025
Updated 1mo ago

Evaluation Results

Date      Correctness   (unlabeled)   (unlabeled)
2025.01   80            13            7
2025.01   76            14            10
2025.01   62            29            9
2025.01   57            36            7
2025.01   57            35            8
2025.01   53            39            8
2025.01   52            33            15
2025.01   52            32            16
2025.01   51            42            7
2025.01   51            37            12
2025.01   48            42            10
2025.01   44            41            15
2025.01   42            47            12
2025.01   42            47            11
2025.01   42            42            16
2025.01   41            46            13
2025.01   37            40            23
2025.01   35            49            16
2025.01   35            36            29
2025.01   35            47            18
2025.01   26            42            32
2025.01   18            39            43
2025.01   12            28            61
2025.01   11            21            68