Comprehensiveness on microsoft
[Chart: Wins, Losses, and Ties by date. Highlighted point: M2hC, 48 wins, Mar 5, 2026.]
Updated 1mo ago
Evaluation Results
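| Method | Configuration             | Date    | Wins | Losses | Ties |
|--------|---------------------------|---------|------|--------|------|
| M2hC   | LLM Evaluator=GPT-5-mi... | 2026.03 | 48   | 42     | 10   |
| M2hC   | LLM Evaluator=GPT-5-mi... | 2026.03 | 47   | 48     | 5    |
| MRC    | LLM Evaluator=GPT-5-mi... | 2026.03 | 45   | 40     | 15   |
| MRC    | LLM Evaluator=GPT-5-mi... | 2026.03 | 45   | 40     | 15   |
| RkH    | LLM Evaluator=GPT-5-mi... | 2026.03 | 42   | 48     | 10   |
| RkH    | LLM Evaluator=GPT-5-mi... | 2026.03 | 38   | 43     | 19   |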