Reasoning Quality Evaluation on AgriChain 1.0 (test)
[Chart: Faithfulness score by method. AgriChain-VL3B: 4.6 (Apr 9, 2026). Selectable metrics: Faithfulness, Informativeness, Factual Accuracy, Coherence, Domain Relevance, Commonsense Score, Average Score.]
Updated 8d ago
Evaluation Results
| Method | Date | Faithfulness | Informativeness | Factual Accuracy | Coherence | Domain Relevance | Commonsense Score | Average Score |
|---|---|---|---|---|---|---|---|---|
| AgriChain-VL3B | 2026.04 | 4.6 | 4.6 | 4.7 | 4.5 | 4.8 | 4.6 | 4.63 |
| Gemini 2.5 Pro | 2026.04 | 4.5 | 4.4 | 4.6 | 4.5 | 4.7 | 4.6 | 4.55 |
| GPT-4o Mini | 2026.04 | 4.5 | 4.3 | 4.6 | 4.4 | 4.6 | 4.7 | 4.52 |
| Gemini 1.5 Flash | 2026.04 | 4.3 | 4.2 | 4.4 | 4.3 | 4.5 | 4.5 | 4.37 |
| Qwen-2.5-VL (variant=base) | 2026.04 | 3.5 | 3.2 | 3.4 | 3.3 | 3.0 | 4.0 | 3.40 |
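
The page does not say how Average Score is derived. The listed values are consistent with an unweighted mean of the six metric scores, rounded to two decimals. The sketch below checks that assumption against the scores listed above. The names `SCORES`, `PUBLISHED_AVERAGE`, and `average_score` are hypothetical and used only for illustration.

```python
from statistics import mean

# Per-metric scores copied from the results above, in this order:
# Faithfulness, Informativeness, Factual Accuracy, Coherence,
# Domain Relevance, Commonsense Score.
SCORES = {
    "AgriChain-VL3B": [4.6, 4.6, 4.7, 4.5, 4.8, 4.6],
    "Gemini 2.5 Pro": [4.5, 4.4, 4.6, 4.5, 4.7, 4.6],
    "GPT-4o Mini": [4.5, 4.3, 4.6, 4.4, 4.6, 4.7],
    "Gemini 1.5 Flash": [4.3, 4.2, 4.4, 4.3, 4.5, 4.5],
    "Qwen-2.5-VL": [3.5, 3.2, 3.4, 3.3, 3.0, 4.0],
}

# Average Score column as published on the page.
PUBLISHED_AVERAGE = {
    "AgriChain-VL3B": 4.63,
    "Gemini 2.5 Pro": 4.55,
    "GPT-4o Mini": 4.52,
    "Gemini 1.5 Flash": 4.37,
    "Qwen-2.5-VL": 3.4,
}


def average_score(scores):
    """Unweighted mean of the metric scores, rounded to two decimals.

    Assumption: equal weighting. The page does not document the formula.
    """
    return round(mean(scores), 2)


if __name__ == "__main__":
    for method, scores in SCORES.items():
        computed = average_score(scores)
        matches = abs(computed - PUBLISHED_AVERAGE[method]) < 1e-9
        print(f"{method}: computed={computed:.2f} "
              f"published={PUBLISHED_AVERAGE[method]:.2f} match={matches}")
```

Under this assumption, all five published averages are reproduced. A weighted scheme that happens to give the same rounded values cannot be ruled out from this table alone.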