LLM Arbitration on PAVE Dimension 2: Temporal Setting v1 (test)

94.81 CR (KU)

Llama3-8B

21.573240.586659.678.6134 May 31, 2026
Updated 1d ago

Evaluation Results

Method Links
2026.05
94.8135.749.4637.798.8561.5838.3762.31002.67
2026.05
90.972.179.98979.3979.3816.363.8520.6110011.47
5446.561.52486.2268.259.49.2313.7810044.04
46.5823.710.87250.9168.2533.52.1549.0910042.78
2026.05
28.6513.890.40248.4824.3512.550.32251.5210073.57
2026.05
27.89.140.38532.8840.527.180.68167.1210063.68
2026.05
24.3923.130.32394.8527.451.410.3785.1510075.45