Insight Generation on Internal non-scientific document collections Revenue & Finance Reports
[Chart: Set-level Score (Gemini-2.5-Flash) over time. Top entry: INSIGHTGEN, 4.65, Apr 21, 2026. Metrics available: Set-level Score (Gemini-2.5-Flash), Set-level Score (Claude-4-Sonnet).]
Updated 1mo ago
Evaluation Results
| Method | Base Model | Date | Set-level Score (Gemini-2.5-Flash) | Set-level Score (Claude-4-Sonnet) |
|---|---|---|---|---|
| INSIGHTGEN | Claude-3.5-... | 2026.04 | 4.65 | 3.35 |
| INSIGHTGEN | GPT-4o | 2026.04 | 4.45 | 3.25 |
| FAISS+CoT | Claude-3.5-... | 2026.04 | 4.25 | 2.95 |
| FAISS | GPT-4o | 2026.04 | 3.35 | 2.55 |
| GPT+CoT | GPT-4o | 2026.04 | 3.25 | 2.45 |
| FAISS+CoT | GPT-4o | 2026.04 | 3.25 | 2.75 |
| GPT+CoT | Claude-3.5-... | 2026.04 | 3 | 2.05 |
| Direct GPT | GPT-4o | 2026.04 | 2.7 | 2.15 |
| FAISS | Claude-3.5-... | 2026.04 | 2.65 | 2.15 |
| Direct GPT | Claude-3.5-... | 2026.04 | 1.8 | 1.45 |