Insight Generation on Internal non-scientific document collections Finance - Investment 3 (LLM Judged)
[Chart: Set-level Score (Gemini-2.5-Flash) over time; top score 4.73 (INSIGHTGEN, Apr 21, 2026). Alternate metric: Set-level Score (Claude-4-Sonnet).]
Updated 1mo ago
Evaluation Results
| Method | Base Model | Date | Set-level Score (Gemini-2.5-Flash) | Set-level Score (Claude-4-Sonnet) |
|---|---|---|---|---|
| INSIGHTGEN | Claude-3.5-... | 2026.04 | 4.73 | 3.65 |
| GPT+CoT | Claude-3.5-... | 2026.04 | 4.2 | 2.82 |
| FAISS+CoT | Claude-3.5-... | 2026.04 | 3.87 | 2.59 |
| FAISS+CoT | GPT-4o | 2026.04 | 3.8 | 2.53 |
| GPT+CoT | GPT-4o | 2026.04 | 3.73 | 2.67 |
| INSIGHTGEN | GPT-4o | 2026.04 | 3.73 | 2.67 |
| FAISS | GPT-4o | 2026.04 | 3.07 | 2.2 |
| Direct GPT | GPT-4o | 2026.04 | 2.73 | 2.07 |
| Direct GPT | Claude-3.5-... | 2026.04 | 2.4 | 1.65 |
| FAISS | Claude-3.5-... | 2026.04 | 2.13 | 1.82 |