Causality Discovery OOD Evaluation

Success Rate: 100

GPT-4.1-2025-04-14

Updated 5mo ago

Evaluation Results

Method	Date	Success Rate	(unlabeled)
GPT-4.1-2025-04-14	2025.09	100	35.9
Mistral-Small-3.1-24B-Ins	2025.09	100	25.8
Qwen2.5-Instruct-7B	2025.09	100	26.3
Llama-3.1-70B-Instruct	2025.09	99.9	28.9
TIMEOMNI-1	2025.09	99.8	64
GPT-4.1-Nano	2025.09	98.4	28
Mistral-7B-v0.3	2025.09	82.6	26.9
Time-MQA	2025.09	52.2	4
Time-R1	2025.09	48.9	31.4
Time-MQA	2025.09	37.2	31.2
Time-MQA	2025.09	32	30.5
ChatTS	2025.09	26.7	18.6
Llama-3.1-8B-Instruct	2025.09	1.9	-