Medical Reasoning on Medical Reasoning
Top result: FedAvg-GRPO, 59.7 pass@1 (Apr 14, 2026)
Updated 3d ago
Evaluation Results
Method          Settings                   Date     pass@1
FedAvg-GRPO     Local steps (τ)=10, Mo...  2026.04  59.7
FedAvg-PubSwap  Local steps (τ)=40, Mo...  2026.04  59.5
FedAvg-GRPO     Local steps (τ)=40, Mo...  2026.04  58.9
FedAvg-PubSwap  Local steps (τ)=90, Mo...  2026.04  58.5
FedAvg-PubSwap  Local steps (τ)=10, Mo...  2026.04  58.3
FedAvg-PubSwap  Local steps (τ)=120, M...  2026.04  58.1
FedAvg-GRPO     Local steps (τ)=90, Mo...  2026.04  57.9
FedAvg-GRPO     Local steps (τ)=120, M...  2026.04  57.7
Base model      Local steps (τ)=10, Mo...  2026.04  49.2
Base model      Local steps (τ)=40, Mo...  2026.04  49.2
Base model      Local steps (τ)=90, Mo...  2026.04  49.2
Base model      Local steps (τ)=120, M...  2026.04  49.2
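
The gap between each method and the base model can be read off the results with a little arithmetic. The sketch below copies the method names, local-step counts (τ), and pass@1 scores from the results above, then reports each method's best setting and its absolute gain over the base model's 49.2. The names `ROWS`, `best_per_method`, and `gain_over_base` are illustrative and not part of the benchmark; the truncated settings are left out because they are not recoverable from this page.

```python
# Rows copied from the evaluation results: (method, local steps tau, pass@1).
ROWS = [
    ("FedAvg-GRPO", 10, 59.7),
    ("FedAvg-PubSwap", 40, 59.5),
    ("FedAvg-GRPO", 40, 58.9),
    ("FedAvg-PubSwap", 90, 58.5),
    ("FedAvg-PubSwap", 10, 58.3),
    ("FedAvg-PubSwap", 120, 58.1),
    ("FedAvg-GRPO", 90, 57.9),
    ("FedAvg-GRPO", 120, 57.7),
]

# Base model pass@1; the same value is listed for every tau.
BASE_PASS_AT_1 = 49.2


def best_per_method(rows):
    """Return {method: (tau, score)} holding each method's highest pass@1."""
    best = {}
    for method, tau, score in rows:
        if method not in best or score > best[method][1]:
            best[method] = (tau, score)
    return best


def gain_over_base(score, base=BASE_PASS_AT_1):
    """Absolute gain in pass@1 points over the base model, to one decimal."""
    return round(score - base, 1)


if __name__ == "__main__":
    for method, (tau, score) in best_per_method(ROWS).items():
        gain = gain_over_base(score)
        print(f"{method}: tau={tau}, pass@1={score}, gain={gain:+.1f}")
```

Gains here are absolute differences in pass@1 points, not relative percentages.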