Online Estimation of the Inverse Covariance Matrix


This post is part 5 of a series of five articles:

  1. Online Maximum Likelihood Estimation of (multivariate) Gaussian Distributions
  2. Online Estimation of Weighted Sample Mean and Covariance Matrix
  3. The Covariance of weighted Means
  4. Memory of the exponentially decaying Estimator for Mean and Covariance Matrix
  5. Online Estimation of the Inverse Covariance Matrix

\(
\def\myX{\mathbf{x}} \def\myM{\mathbf{M}} \def\myY{\mathbf{y}} \def\mySigma{\mathbf{\Sigma}} \def\myT{\mathsf{T}} \)

In practice, it is often necessary to estimate the inverse of the covariance matrix of a Gaussian distribution, for example, when only a Mahalanobis distance between points has to be computed. In an online setting it can be quite expensive to recompute the inverse of the covariance matrix for each newly arriving point, since the complexity of inverting a \(d \times d\) matrix is \(\mathcal{O}(d^3)\). As we will see in the following, it is not necessary to estimate the covariance matrix itself if only its inverse is needed; furthermore, the inverse can be adapted incrementally in an efficient manner. The results are shown for the general case of a weighted exponentially decaying estimator.
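To illustrate why the inverse alone suffices: the Mahalanobis distance never uses \(\mySigma\) directly, only \(\mySigma^{-1}\). The following short sketch (in Python/NumPy for easy standalone checking; the covariance matrix, point, and mean are made-up illustrative values) demonstrates this:

```python
import numpy as np

# Illustrative values only: a covariance matrix, a point, and a mean estimate
Sigma = np.array([[ 1.0, -1.0,  0.0],
                  [-1.0,  3.0, -1.0],
                  [ 0.0, -1.0,  3.0]])
Sigma_inv = np.linalg.inv(Sigma)  # the quantity we want to maintain online

x = np.array([1.0, 10.0, 20.0])
mu = np.array([1.2, 9.5, 20.5])

# The Mahalanobis distance only requires the inverse of the covariance matrix
delta = x - mu
d_mahal = np.sqrt(delta @ Sigma_inv @ delta)
```

If \(\mySigma^{-1}\) is kept up to date incrementally, each distance evaluation costs only \(\mathcal{O}(d^2)\).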

Remember that the weighted exponentially decaying estimator incrementally adapts the covariance matrix using the following relations:

\begin{align}
s_n &= \lambda s_{n-1} + 1 \\
\boldsymbol{\delta}_n &= \myX_n - \hat{\boldsymbol{\mu}}_{n-1} \\
\hat{\boldsymbol{\mu}}_n &= \hat{\boldsymbol{\mu}}_{n-1} + \frac{\boldsymbol{\delta}_n}{s_n} \\
\myM_n &= \lambda \myM_{n-1} + \boldsymbol{\delta}_n \big(\myX_n - \hat{\boldsymbol{\mu}}_n\big)^\myT \label{eq:M} \\
\hat{\mySigma}_n &= \frac{\myM_n}{s_n}. \label{eq:Sigma}
\end{align}

Now, in order to compute the inverse, we simply apply the Sherman–Morrison formula [1] – a special case of the Woodbury matrix identity [2] – to incrementally update \(\myM_n^{-1}\). The formula is given by

\begin{align} (A+uv^\myT)^{-1} = A^{-1} - \frac{A^{-1}uv^\myT A^{-1}}{1+v^\myT A^{-1}u}. \end{align}
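The identity is easy to verify numerically. The following sketch (Python/NumPy, with an arbitrary well-conditioned random matrix chosen purely for checking) compares the rank-one update of the inverse against a direct inversion:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4
A = np.eye(d) + 0.1 * rng.standard_normal((d, d))  # some well-conditioned matrix
u = rng.standard_normal(d)
v = rng.standard_normal(d)

A_inv = np.linalg.inv(A)
# Sherman-Morrison: rank-one update of the inverse in O(d^2)
sm_inv = A_inv - np.outer(A_inv @ u, v @ A_inv) / (1.0 + v @ A_inv @ u)

# Direct inversion of the rank-one-updated matrix in O(d^3)
direct_inv = np.linalg.inv(A + np.outer(u, v))
```

Both `sm_inv` and `direct_inv` agree up to floating-point precision; the Sherman–Morrison route avoids the cubic-cost inversion entirely.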

If we look at Eq. \eqref{eq:M}, we can identify:

\begin{align}
A = \lambda \myM_{n-1}, \qquad u = \boldsymbol{\delta}_n, \qquad v = \myX_n - \hat{\boldsymbol{\mu}}_n.
\end{align}

This then leads to:

\begin{align}
\myM_n^{-1} = \frac{1}{\lambda}\myM_{n-1}^{-1} - \frac{\frac{1}{\lambda^2}\, \myM_{n-1}^{-1} \boldsymbol{\delta}_n \big(\myX_n - \hat{\boldsymbol{\mu}}_n\big)^\myT \myM_{n-1}^{-1}}{1+\frac{1}{\lambda} \big(\myX_n - \hat{\boldsymbol{\mu}}_n\big)^\myT \myM_{n-1}^{-1} \boldsymbol{\delta}_n}.
\end{align}
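A single step of this incremental inverse update can be sanity-checked numerically. In the following Python/NumPy sketch, arbitrary random vectors stand in for \(\boldsymbol{\delta}_n\) and \(\myX_n - \hat{\boldsymbol{\mu}}_n\):

```python
import numpy as np

rng = np.random.default_rng(1)
d = 3
lam = 0.999
M_prev = np.eye(d) + 0.1 * rng.standard_normal((d, d))  # arbitrary previous M
M_prev_inv = np.linalg.inv(M_prev)
delta = rng.standard_normal(d)  # stands in for x_n - mu_{n-1}
w = rng.standard_normal(d)      # stands in for x_n - mu_n

# Incremental Sherman-Morrison update: inverse of M_n = lam * M_prev + delta w^T
num = (1.0 / lam**2) * np.outer(M_prev_inv @ delta, w @ M_prev_inv)
den = 1.0 + (1.0 / lam) * (w @ M_prev_inv @ delta)
M_inv = M_prev_inv / lam - num / den

# Reference: direct inversion of the updated matrix
direct = np.linalg.inv(lam * M_prev + np.outer(delta, w))
```

`M_inv` and `direct` agree up to floating-point precision, confirming the update step.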

With Eq. \eqref{eq:Sigma}, we can finally compute the inverse of the covariance matrix with

\begin{align}
\hat{\mySigma}_n^{-1} = s_n \, \myM_n^{-1}.
\end{align}

Example Code

In the following, some R code is listed which illustrates the procedure to incrementally estimate the inverse of the covariance matrix for a set of points collected in the matrix \(X\).

library(MASS)
N = 100000
d <- 3 # dimension
covMat <- matrix(c(1,-1,0,-1,3,-1,0,-1,3), nrow=3)
X <- mvrnorm(n=N,Sigma = covMat,  mu = c(1,10,20), tol= 1e-13)

lambda <- 0.999
lambdaSum <- 0.0
muhat <- matrix(rep(0, d))  # initial estimate of the mean
M <- .01 * diag(d)          # initial (unnormalized) scatter matrix
Mminus <- solve(M)          # inverse of the initial M


for (i in 1:nrow(X)) {
  x <- matrix(X[i, ])
  lambdaSum <- lambda * lambdaSum + 1  # divisor s_n for the covariance matrix
  delta <- x - muhat                   # x_n - muhat_{n-1}
  muhat <- muhat + delta / lambdaSum   # updated mean estimate
  M <- lambda * M + delta %*% t(x - muhat)
  Sigma <- M / lambdaSum               # current covariance estimate

  # Sherman-Morrison update of the inverse of M
  Mnum <- 1.0 / lambda^2 * Mminus %*% delta %*% t(x - muhat) %*% Mminus
  Mden <- 1.0 + 1.0 / lambda * t(x - muhat) %*% Mminus %*% delta
  Mminus <- Mminus / lambda - Mnum / c(Mden)
}

print("Error between estimates (should be very small): ")
print(ginv(Sigma) - Mminus * lambdaSum)

print("Error between inverse of real and estimated covariance matrix (can be larger because of lambda): ")
print(ginv(covMat) - Mminus * lambdaSum)

If we run this script (no random seed is set, so the exact numbers vary from run to run), we get output similar to the following:

[1] "Error between estimates (should be very small): "
             [,1]         [,2]         [,3]
[1,] 6.883383e-15 4.107825e-15 1.443290e-15
[2,] 2.664535e-15 3.108624e-15 7.494005e-16
[3,] 9.714451e-16 8.326673e-16 3.330669e-16
[1] "Error between inverse of real and estimated covariance matrix (can be larger because of lambda): "
             [,1]         [,2]         [,3]
[1,] -0.044986937 -0.024795319 -0.006808579
[2,] -0.024795319 -0.031976196 -0.001583689
[3,] -0.006808579 -0.001583689  0.008351127

References

  1. J. Sherman and W. J. Morrison, “Adjustment of an inverse matrix corresponding to a change in one element of a given matrix,” The Annals of Mathematical Statistics, vol. 21, no. 1, pp. 124–127, 1950.
  2. M. A. Woodbury, “Inverting modified matrices,” Princeton University, Princeton, NJ, Memorandum Report 42, Statistical Research Group, 1950.

Markus Thill

