Response Similarity on ShareGPT, NUMINA, and ArenaHard Average (test)

[Chart: 5RSE; label: GPT4o-0806; axis ticks 3.57312 to 4.68444; date: Jan 22, 2025]
Updated 3mo ago

Evaluation Results

Method  Links
2025.01  5      0.995  0.995
2025.01  4.24   0.269  0.841
2025.01  4.174  0.252  0.838
2025.01  4.141  0.25   0.838
         4.102  0.22   0.837
2025.01  4.045  0.233  0.836
2025.01  4.045  0.277  0.839
         3.898  0.244  0.837
         3.88   0.164  0.787
         3.74   0.189  0.823
2025.01  3.72   0.216  0.823
2025.01  3.628  0.213  0.828