Conditional-independence (belief sufficiency) testing on Heart Expert-Constructed Bayesian Network
[Chart: CMI by method. Highlighted entry: GPT-Min, CMI 0.1454 (Feb 6, 2026).]
Updated 4d ago
Evaluation Results
| Method | CMI | CMI 95% CI Lower Bound | % Log-Loss Improvement (A~p vs A~(p,θ)) | % Log-Loss Improvement CI Lower Bound (A~p vs A~(p,θ)) | % Log-Loss Improvement (A~(p,x) vs A~(p,x,θ)) | % Log-Loss Improvement CI Lower Bound (A~(p,x) vs A~(p,x,θ)) |
|---|---|---|---|---|---|---|
| GPT-Min (Model=GPT-Min, 2026.02) | 0.1454 | - | 16.37 | - | 13.05 | - |
| GPT-High (Model=GPT-High, 2026.02) | 0.0753 | - | 4.23 | - | 3.82 | - |
| Llama (Model=Llama, 2026.02) | 0.0718 | - | 2.98 | - | 1.02 | - |
| DeepSeek (Model=DeepSeek, 2026.02) | 0.0675 | - | 4.52 | - | 2.03 | - |
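
The page does not define its metrics, so the following is only a sketch of one common reading, not the benchmark's actual procedure. It assumes that "% Log-Loss Improvement (A~p vs A~(p,θ))" is the relative drop in mean log-loss when θ is added as a predictor of A, and that a plug-in CMI estimate is the absolute gap between the two mean log-losses (in nats). The function names and the example probabilities are hypothetical.

```python
import math

def mean_log_loss(probs_of_true_label):
    """Mean negative log-likelihood (nats) assigned to the observed labels."""
    return -sum(math.log(p) for p in probs_of_true_label) / len(probs_of_true_label)

def pct_log_loss_improvement(base_probs, full_probs):
    """Assumed definition: relative log-loss reduction of the full model, in percent."""
    base = mean_log_loss(base_probs)
    full = mean_log_loss(full_probs)
    return 100.0 * (base - full) / base

def cmi_plugin_estimate(base_probs, full_probs):
    """Assumed definition: log-loss(A~p) minus log-loss(A~(p,theta)), in nats."""
    return mean_log_loss(base_probs) - mean_log_loss(full_probs)

# Hypothetical held-out probabilities each model assigns to the true answer A.
base_probs = [0.5, 0.5, 0.5, 0.5]   # model using p only
full_probs = [0.8, 0.6, 0.7, 0.9]   # model using p and theta

improvement = pct_log_loss_improvement(base_probs, full_probs)
cmi = cmi_plugin_estimate(base_probs, full_probs)
print(f"log-loss improvement: {improvement:.2f}%")
print(f"plug-in CMI estimate: {cmi:.4f} nats")
```

Under this reading, a value near zero for either quantity would suggest that θ adds little predictive information about A beyond p. A confidence-interval lower bound, such as the ones the table columns name, would need a resampling step (for example a bootstrap over examples) that is not shown here.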