Free-language reasoning on RoboFAC (Real-world)
[Chart: ROUGE-L (TI) over time. Top result: 33.8, KITE+Qwen2.5-7B+QLoRA, Apr 8, 2026. Selectable metrics: ROUGE-L and SBERT Cosine, each for TI, FE, HL, and LL.]
Updated 1mo ago
Evaluation Results
| Method | Paper / Tags | Date | ROUGE-L (TI) | ROUGE-L (FE) | ROUGE-L (HL) | ROUGE-L (LL) | SBERT Cosine (TI) | SBERT Cosine (FE) | SBERT Cosine (HL) | SBERT Cosine (LL) |
|---|---|---|---|---|---|---|---|---|---|---|
| KITE+Qwen2.5-7B+QLoRA | Evidence Representatio... | 2026.04 | 33.8 | 36.5 | 22.9 | 31.3 | 0.724 | 0.86 | 0.798 | 0.815 |
| RoboFAC-7B | Finetuned=true | 2026.04 | 33.7 | 36.1 | 22.8 | 30.5 | 0.722 | 0.856 | 0.798 | 0.813 |
| KITE + Qwen2.5-VL-7B | Evidence Representatio... | 2026.04 | 30 | 25.2 | 22.3 | 23.2 | 0.696 | 0.832 | 0.791 | 0.804 |
| Qwen2.5-VL-7B | | 2026.04 | 26.4 | 23.3 | 21.9 | 19.7 | 0.689 | 0.786 | 0.792 | 0.785 |