ProntoQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Logical Reasoning | ProntoQA (test) | Accuracy: 99.72 | 36 |
| Veracity Inference | PRONTOQA (1,000 examples) | Mean Hamming Similarity: 96.4 | 20 |
| Deductive Reasoning | ProntoQA | Pass@1: 0.964 | 18 |
| Explanation Refinement | PrOntoQA | Initial Score: 0.98 | 15 |
| Reasoning | ProntoQA | Acc: 95 | 14 |
| Deductive logical reasoning | ProntoQA (test) | Error Rate: 2.8 | 12 |
| Reasoning | PrOntoQA | PrOntoQA Score: 97.88 | 10 |
| Logical Reasoning | PrOntoQA | Calibrated Accuracy: 63.8 | 8 |
| Reasoning accuracy | PRONTOQA 5-hop | Accuracy: 81 | 6 |
| Reasoning accuracy | PRONTOQA 4-hop | Accuracy: 85 | 6 |
| Reasoning accuracy | PRONTOQA 3-hop | Accuracy: 87 | 6 |
| Veracity Inference | PRONTOQA 5-hop (test) | Hamming Similarity: 0.955 | 4 |
| Veracity Inference | PRONTOQA 4-hop (test) | Hamming Similarity: 96.7 | 4 |
| Veracity Inference | PRONTOQA 3-hop (test) | Hamming Similarity: 95.6 | 4 |
| Logical Reasoning | PrOntoQA | Accuracy: 100 | 3 |
| Logical Reasoning | ProntoQA Enhanced | OA: 99.8 | 1 |
| Deductive logical reasoning | ProntoQA OOD 500 records (test) | ExcRate: - | 0 |
