
Counterfactual Generation on AI-READI (Class 1)

Validity: 98

Llama*

Updated 3mo ago

Evaluation Results

Method	Links	Validity	(unlabeled)	(unlabeled)	(unlabeled)
Llama* 2026.01		98	20	1.9	99
GPT-4 2026.01		92	182	4	96
BioMistral* 2026.01		91	100	2.1	95
GPT-4 2026.01		89	150	3.8	82
CFNOW 2026.01		84	25	3	99
Llama 2026.01		68	130	3.8	78
DiCE 2026.01		58	41	2.4	99
NICE 2026.01		53	4	1.31	35
BioMistral 2026.01		47	150	4.1	70