Knowledge Reasoning on GPQA Diamond (Score)
Best score: 37.3 (ExpertWeaver)
[Chart: Score by date; latest point Feb 17, 2026]
Updated 3mo ago
Evaluation Results
Method        Details                     Date      Score
ExpertWeaver  Base Model=DeepSeek-R1...   2026.02   37.3
FLAP          Base Model=DeepSeek-R1...   2026.02   36.3
LLM-Pruner    Base Model=DeepSeek-R1...   2026.02   33.3