Readability and difficulty-controlled text generation on OSE
[Chart: Accuracy over time. Best result: SWAI, Accuracy 0.8453 (Jan 16, 2026).]
Metrics reported: Accuracy, F1 Score, Precision, Recall, Confidence Score, Cohen's Kappa (κ), Matthews Corr. Coeff. (MCC)
Evaluation Results
| Method | Model | Date | Accuracy | F1 Score | Precision | Recall | Confidence Score | Cohen's Kappa (κ) | Matthews Corr. Coeff. (MCC) |
|---|---|---|---|---|---|---|---|---|---|
| SWAI | Llama3.1 8B | 2026.01 | 0.8453 | 0.845 | 0.8453 | 0.8453 | 0.847 | 0.768 | 0.768 |
| SWAI | Llama3.2 1B | 2026.01 | 0.7111 | 0.712 | 0.7127 | 0.7111 | 0.805 | 0.567 | 0.567 |
| Baseline | Llama3.1 8B | 2026.01 | 0.3711 | 0.363 | 0.362 | 0.3648 | 0.846 | 0.052 | 0.052 |
| Baseline | Llama3.2 1B | 2026.01 | 0.2887 | 0.289 | 0.2936 | 0.2899 | 0.826 | -0.063 | -0.064 |
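
For readers unfamiliar with the agreement metrics reported above, the following is a minimal sketch of how accuracy, Cohen's kappa, and the multiclass Matthews correlation coefficient can be computed from true and predicted labels. It is not the benchmark's own evaluation code: the function name `agreement_metrics` and the label values (`"easy"`, `"medium"`, `"hard"`) are hypothetical, chosen only for illustration.

```python
from collections import Counter
from math import sqrt


def agreement_metrics(y_true, y_pred):
    """Return accuracy, Cohen's kappa, and multiclass MCC for two label lists.

    Hypothetical helper for illustration; not the leaderboard's evaluation code.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    s = len(y_true)                                      # total samples
    c = sum(t == p for t, p in zip(y_true, y_pred))      # correct predictions
    t_counts = Counter(y_true)                           # true count per class
    p_counts = Counter(y_pred)                           # predicted count per class
    labels = set(t_counts) | set(p_counts)

    # Sum over classes of (true count * predicted count).
    cross = sum(t_counts[k] * p_counts[k] for k in labels)
    t_sq = sum(v * v for v in t_counts.values())
    p_sq = sum(v * v for v in p_counts.values())

    accuracy = c / s

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = cross / (s * s)
    kappa = (accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    # Multiclass MCC computed from the same counts.
    denom = sqrt((s * s - p_sq) * (s * s - t_sq))
    mcc = (c * s - cross) / denom if denom else 0.0

    return {"accuracy": accuracy, "kappa": kappa, "mcc": mcc}


# Hypothetical labels, for illustration only.
y_true = ["easy", "easy", "medium", "medium", "hard", "hard"]
y_pred = ["easy", "medium", "medium", "medium", "hard", "easy"]
print(agreement_metrics(y_true, y_pred))
```

When the predicted class distribution matches the true class distribution, the two formulas reduce to the same expression, which may explain why the κ and MCC columns are nearly identical here. A κ or MCC near zero indicates agreement close to chance level, and a negative value indicates agreement below chance.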