
Binary Inconsistency Detection on LLM

70.27 Accuracy

ChatGPT-SpanMoE

[Chart: accuracy, axis ticks 38.6228, 46.8389, 55.055, 63.2711; date axis: Jun 5, 2024]

Evaluation Results

Date      Accuracy
2024.06   70.27
2024.06   67.96
2024.06   64.84
2024.06   63.89
2024.06   61.61
2024.06   60.34
2024.06   49.7
2024.06   49.47
          46.92
2024.06   39.84