
Semantic Consistency on Consistency evaluation suite N=720 (test)

Semantic Consistency Score: 0.583

Claude-3.5-Haiku

[Chart: Semantic Consistency Score by date; Nov 27, 2025]
Updated 4d ago

Evaluation Results

Date      Score
2025.11   0.583
2025.11   0.565
2025.11   0.547
2025.11   0.537
2025.11   0.535
2025.11   0.535
2025.11   0.508
2025.11   0.498
2025.11   0.481
2025.11   0.476
2025.11   0.464
2025.11   0.464
2025.11   0.463
2025.11   0.462
2025.11   0.456
2025.11   0.454
2025.11   0.45
2025.11   0.443
2025.11   0.432
2025.11   0.427
2025.11   0.424
2025.11   0.423
2025.11   0.416
2025.11   0.414
2025.11   0.406
2025.11   0.404
2025.11   0.402
2025.11   0.39
2025.11   0.387
2025.11   0.38
2025.11   0.377
2025.11   0.354
2025.11   0.354
2025.11   0.348
2025.11   0.342
2025.11   0.339
2025.11   0.338
2025.11   0.337
2025.11   0.336
2025.11   0.326
2025.11   0.319
2025.11   0.314
2025.11   0.3
2025.11   0.292
2025.11   0.283
2025.11   0.283
2025.11   0.279
2025.11   0.264
2025.11   0.263
2025.11   0.263
2025.11   0.257
2025.11   0.246
2025.11   0.222
2025.11   0.215