Long-context evaluation (Financial) on Loong Fin
[Chart: Fin Judge Score over time. Top entry: RR+Judge(y), 58.8 (Jan 26, 2026).]
Evaluation Results
Method        Setting                      Date      Fin Judge Score
RR+Judge(y)   Context Type=Full Cont...    2026.01   58.8
RAO(y)        Context Type=Full Cont...    2026.01   55.3
RID+Q(y)      Context Type=Full Cont...    2026.01   51.5
RID(y)        Context Type=Full Cont...    2026.01   51.2
Base          Context Type=Full Cont...    2026.01   49.9
RAG           Context Length=32k           2026.01   47.8
RID+C(y)      Context Type=Full Cont...    2026.01   47.5
RID(y)        Context Type=KV-Cache...     2026.01   43
RAO(y)        Context Type=KV-Cache...     2026.01   41.7
RR+Judge(y)   Context Type=KV-Cache...     2026.01   41.5
RID+C(y)      Context Type=KV-Cache...     2026.01   39.8
RID+Q(y)      Context Type=KV-Cache...     2026.01   39.7
Base          Context Type=KV-Cache...     2026.01   39.5