Insight Generation on Internal non-scientific document collections (Legal & Regulatory Compliance)
[Chart: Set-level Score (Gemini-2.5-Flash); top entry INSIGHTGEN at 4.35, Apr 21, 2026. Selectable metrics: Set-level Score (Gemini-2.5-Flash), Set-level Score (Claude-4-Sonnet).]
Evaluation Results
| Method | Base Model | Date | Set-level Score (Gemini-2.5-Flash) | Set-level Score (Claude-4-Sonnet) |
|---|---|---|---|---|
| INSIGHTGEN | Claude-3.5-... | 2026.04 | 4.35 | 2.95 |
| INSIGHTGEN | GPT-4o | 2026.04 | 4.25 | 3 |
| GPT+CoT | Claude-3.5-... | 2026.04 | 4.05 | 2.5 |
| FAISS | GPT-4o | 2026.04 | 3.8 | 2.9 |
| FAISS+CoT | Claude-3.5-... | 2026.04 | 3.6 | 2.5 |
| Direct GPT | GPT-4o | 2026.04 | 3.4 | 2.5 |
| FAISS+CoT | GPT-4o | 2026.04 | 3.15 | 2.05 |
| GPT+CoT | GPT-4o | 2026.04 | 2.8 | 2.1 |
| FAISS | Claude-3.5-... | 2026.04 | 2.1 | 1.95 |
| Direct GPT | Claude-3.5-... | 2026.04 | 1.95 | 1.25 |