Question Answering on ChronoTQA Tier 1 (147 Phenopackets-grounded onset questions)
[Chart: Accuracy by date. Top entry: DeepSeek V3, 86.4 accuracy, May 21, 2026. Metrics shown: Accuracy, ChronoMedKG-NR.]
Updated 12d ago
Evaluation Results

| Method | Retrieval Condition | Date | Accuracy | ChronoMedKG-NR |
|---|---|---|---|---|
| DeepSeek V3 | NR | 2026.05 | 86.4 | - |
| DeepSeek V3 | Ch... | 2026.05 | 85 | 1.4 |
| DeepSeek V3 | HPOA | 2026.05 | 83.7 | - |
| Pooled (n=441) | Ch... | 2026.05 | 82.1 | 2.3 |
| Claude 3 Haiku | Ch... | 2026.05 | 81.6 | 5.4 |
| DeepSeek V3 | Pr... | 2026.05 | 81.6 | - |
| GPT-4o-mini | Pr... | 2026.05 | 81.6 | - |
| GPT-4o-mini | HPOA | 2026.05 | 81 | - |
| Pooled (n=441) | HPOA | 2026.05 | 80.1 | - |
| Pooled (n=441) | NR | 2026.05 | 79.8 | - |
| GPT-4o-mini | Ch... | 2026.05 | 79.6 | 2.7 |
| Pooled (n=441) | Pr... | 2026.05 | 77.3 | - |
| GPT-4o-mini | NR | 2026.05 | 76.9 | - |
| Claude 3 Haiku | NR | 2026.05 | 76.2 | - |
| Claude 3 Haiku | HPOA | 2026.05 | 75.5 | - |
| Claude 3 Haiku | Pr... | 2026.05 | 68.7 | - |
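The pooled rows and the ChronoMedKG-NR column look like they are derived from the per-model accuracies. The sketch below checks that reading under two assumptions that the page does not state: that each "Pooled (n=441)" accuracy is the mean over the three models (each answering the same 147 questions), and that ChronoMedKG-NR is the accuracy under the "Ch..." condition minus the accuracy under NR. The function names `mean_accuracy` and `delta_vs_nr` are hypothetical, and the truncated condition labels are kept exactly as they appear.

```python
# Accuracy (%) per model and retrieval condition, copied from the results above.
# "Ch..." and "Pr..." are truncated on the page and are kept as-is.
ACCURACY = {
    "DeepSeek V3": {"NR": 86.4, "Ch...": 85.0, "HPOA": 83.7, "Pr...": 81.6},
    "GPT-4o-mini": {"NR": 76.9, "Ch...": 79.6, "HPOA": 81.0, "Pr...": 81.6},
    "Claude 3 Haiku": {"NR": 76.2, "Ch...": 81.6, "HPOA": 75.5, "Pr...": 68.7},
}

# Reported "Pooled (n=441)" accuracies, also copied from the results above.
POOLED = {"NR": 79.8, "Ch...": 82.1, "HPOA": 80.1, "Pr...": 77.3}


def mean_accuracy(condition):
    """Mean accuracy across models for one condition (assumed pooling rule)."""
    values = [scores[condition] for scores in ACCURACY.values()]
    return round(sum(values) / len(values), 1)


def delta_vs_nr(scores, condition="Ch..."):
    """Signed accuracy difference versus NR (assumed meaning of ChronoMedKG-NR)."""
    return round(scores[condition] - scores["NR"], 1)


if __name__ == "__main__":
    for condition, reported in POOLED.items():
        print(condition, mean_accuracy(condition), reported)
    for model, scores in ACCURACY.items():
        print(model, delta_vs_nr(scores))
    print("Pooled", delta_vs_nr(POOLED))
```

Under these assumptions the computed means match all four pooled accuracies, and the differences match the reported 2.7, 5.4, and 2.3. For DeepSeek V3 the signed difference is negative, while the page lists 1.4 without a sign, so the column may show a magnitude or the sign may have been lost in extraction.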