Overall Performance Aggregated Across All Benchmarks
Best Average Score: 40.3 (DiReCT)
[Chart: Average Score by date, series DiReCT; latest data point May 29, 2026]
Evaluation Results
| Method | Backbone | Date | Average Score |
|---|---|---|---|
| DiReCT | Llama-1.1B | 2026.05 | 40.3 |
| InfoBatch | Llama-1.1B | 2026.05 | 40.1 |
| GradNorm (IS) | Llama-1.1B | 2026.05 | 38.9 |
| Loss-based | Llama-1.1B | 2026.05 | 38.3 |
| Perplexity-based | Llama-1.1B | 2026.05 | 38.2 |
| Uniform Sampling | Llama-1.1B | 2026.05 | 37.9 |
| DiReCT | GPT-2-Medium... | 2026.05 | 32.3 |
| InfoBatch | GPT-2-Medium... | 2026.05 | 32.1 |
| GradNorm (IS) | GPT-2-Medium... | 2026.05 | 31.3 |
| Perplexity-based | GPT-2-Medium... | 2026.05 | 31 |
| Loss-based | GPT-2-Medium... | 2026.05 | 30.4 |
| Uniform Sampling | GPT-2-Medium... | 2026.05 | 30.2 |