Context-aware Instruction Following on Academic Paper Abstracts (test)
[Chart: Overlap (w) by date; GPT-4-turbo, 8.64, Mar 5, 2024. Selectable metrics: Overlap (w), Perplexity, Overlap (w/o).]
Updated 4d ago
Evaluation Results

| Method | Method (settings) | Links | Overlap (w) | Perplexity | Overlap (w/o) |
| --- | --- | --- | --- | --- | --- |
| GPT-4-turbo | Mode=zero-shot LLM wit... | 2024.03 | 8.64 | 8.47 | 8.68 |
| Qwen-Chat(v1.5) + Qwen-Chat(v1.5) (Logit-based CoGenesis) | Mode=mixed-scale colla... | 2024.03 | 8.04 | 7.7 | 8.34 |