Natural Language Inference on CrossFit NLI (test)
[Chart: accuracy over time. Top entry: ABMLL, 83.6 accuracy, Aug 19, 2025. Metrics available: Accuracy, ECE.]
Updated 16d ago
Evaluation Results

| Method       | Model  | Date    | Accuracy | ECE  |
|--------------|--------|---------|----------|------|
| ABMLL        | QWEN2  | 2025.08 | 83.6     | 20.8 |
| Reptile      | LLAMA3 | 2025.08 | 83.3     | 24.2 |
| Reptile      | QWEN2  | 2025.08 | 82.6     | 23.1 |
| ABMLL        | LLAMA3 | 2025.08 | 82.2     | 23.7 |
| Regular LoRA | QWEN2  | 2025.08 | 79.7     | 26.9 |
| Struct. LoRA | QWEN2  | 2025.08 | 79.1     | 27   |
| Regular LoRA | LLAMA3 | 2025.08 | 78.5     | 31   |
| Struct. LoRA | LLAMA3 | 2025.08 | 75.5     | 30.2 |
| Pretrained   | QWEN2  | 2025.08 | 69.3     | 29.8 |
| Pretrained   | LLAMA3 | 2025.08 | 57.6     | 41.9 |