Fidelity Evaluation on Prompt: "What is the meaning of life?"
[Chart: WMSE by date. Highlighted point: Claude-AI, WMSE 0.0265, Feb 1, 2026. Selectable metrics: WMSE, R2_w, WMAE, Mean L1 Error, Mean L2 Error, R2_w_hat.]
Updated 4d ago
Evaluation Results
| Method | WMSE | R2_w | WMAE | Mean L1 Error | Mean L2 Error | R2_w_hat |
|---|---|---|---|---|---|---|
| Claude-AI (Surrogate Model=Weight..., 2026.02) | 0.0265 | 0.6967 | 0.1251 | 0.1981 | 0.0708 | 0.624 |
| LLaMA (Surrogate Model=Weight..., 2026.02) | 0.0368 | 0.7068 | 0.1617 | 0.2387 | 0.0805 | 0.6364 |
| OpenAI-GPT (Surrogate Model=Weight..., 2026.02) | 0.0388 | 0.7104 | 0.1731 | 0.2035 | 0.0609 | 0.6409 |