Text Similarity on Insurance tasks HQ subset N = 1334

Mean Score: 89.9

DeepSeek-R1 + Fine-tune

[Chart: score axis ticks 72.948, 77.349, 81.75, 86.151; date axis Feb 18, 2026]
Updated 4d ago

Evaluation Results

Method   Links
2026.02  89.9  0.132  94.6  100   86.3
2026.02  81.7  0.146  86.7  97.7  73.3
2026.02  80.6  0.149  85.6  97.4  70.5
2026.02  77.6  0.159  81.9  97    62.4
2026.02  76.7  0.152  80.9  97.7  60.5
2026.02  73.6  0.155  77.3  100   50.7