
Why your model parameter confidences might be too optimistic -- unbiased estimation of the inverse covariance matrix

About

Aims. The maximum-likelihood method is the standard approach to obtain model fits to observational data and the corresponding confidence regions. We investigate possible sources of bias in the log-likelihood function and its subsequent analysis, focusing on estimators of the inverse covariance matrix. Furthermore, we study under which circumstances the estimated covariance matrix is invertible.

Methods. We perform Monte-Carlo simulations to investigate the behaviour of estimators for the inverse covariance matrix, depending on the number of independent data sets and the number of variables of the data vectors.

Results. We find that the inverse of the maximum-likelihood estimator of the covariance is biased, the amount of bias depending on the ratio of the number of bins (data vector variables), P, to the number of data sets, N. This bias inevitably leads to an -- in extreme cases catastrophic -- underestimation of the size of confidence regions. We report on a method to remove this bias for the idealised case of Gaussian noise and statistically independent data vectors. Moreover, we demonstrate that marginalisation over parameters introduces a bias into the marginalised log-likelihood function. Measures of the sizes of confidence regions suffer from the same problem. Furthermore, we give an analytic proof for the fact that the estimated covariance matrix is singular if P>N.
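The two Results claims are easy to check numerically. Below is a minimal Monte-Carlo sketch, assuming Gaussian noise, statistically independent data vectors, and an identity true covariance (so the true inverse is also the identity). The debiasing factor (N-P-2)/(N-1) used here is the standard Gaussian correction associated with this paper, but the abstract itself does not state it, so treat it as an assumption of this sketch.

```python
import numpy as np

# Monte-Carlo check of the bias in the plug-in inverse covariance.
# Assumptions: Gaussian noise, independent data vectors, true C = I.
rng = np.random.default_rng(0)
P, N, trials = 5, 20, 2000          # P bins per data vector, N data sets

avg_inv = np.zeros((P, P))
for _ in range(trials):
    x = rng.standard_normal((N, P))  # N independent Gaussian data vectors
    c = np.cov(x, rowvar=False)      # sample covariance (unbiased for C)
    avg_inv += np.linalg.inv(c)      # naive plug-in inverse
avg_inv /= trials

naive = avg_inv[0, 0]                # would be 1 if unbiased; it overshoots
corrected = (N - P - 2) / (N - 1) * naive   # assumed Gaussian debias factor
print(f"naive: {naive:.3f}, corrected: {corrected:.3f}")

# Singularity for P > N: with more bins than data sets, the sample
# covariance has rank at most N - 1 and cannot be inverted.
P2, N2 = 15, 10
c2 = np.cov(rng.standard_normal((N2, P2)), rowvar=False)
print("rank:", np.linalg.matrix_rank(c2))   # at most N2 - 1 = 9
```

The naive average sits well above 1 (the bias grows as P approaches N), while the corrected value lands near the true inverse; the second block shows why no inverse exists at all once P exceeds N.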

J. Hartlap, P. Simon, P. Schneider · 2006

Related benchmarks

Task               Dataset                                          Result (mAP)   Rank
Causal Discovery   Synthetic (n=100, |E|=400, sample size=1000)     34.1           36
Causal Discovery   Synthetic (n=1000, |E|=2000, sample size=1000)   46.7           32
Causal Discovery   SERGIO-GRN (n=100, |E|=400, sample size=20000)    5.1            6
Causal Discovery   SERGIO-GRN (n=200, |E|=400, sample size=20000)    1.2            6
