Randomized trace estimates for indefinite matrices with an application to determinants
- Daniel Kressner (EPF Lausanne)
Abstract
The need for estimating the determinant of a large-scale symmetric positive definite (SPD) matrix A features prominently in several machine learning tasks, most notably in maximum likelihood estimation for Gaussian process regression. Because log(det(A)) = trace(log(A)), estimating the log-determinant is equivalent to estimating the trace of the matrix logarithm. This simple relation allows for the application of Hutchinson's randomized trace estimator, which approximates the trace of a symmetric matrix B by computing the average of x'*B*x over many samples of a random vector x. When estimating determinants, two major obstacles arise: (1) The matrix B = log(A) can be indefinite even when A is SPD. This complicates the analysis; nearly all existing tail bounds apply only to SPD matrices. (2) The exact evaluation of x'*log(A)*x requires the computation of the matrix logarithm, which is expensive. Recent work by Ubaru, Chen, and Saad considers the use of Rademacher random vectors and addresses (1) by shifting the matrix and (2) by using Lanczos quadrature. In this talk, we will explain why the tail bounds obtained from shifting are overly pessimistic. A new bound is presented, which reduces the required number of samples by up to a factor of n, where n is the size of the matrix. We will also extend these results to Gaussian random vectors, which requires careful consideration in the Lanczos quadrature because such vectors are unbounded.
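As a rough illustration (not part of the talk itself), the following minimal Python sketch shows Hutchinson's estimator applied to the log-determinant via the identity log(det(A)) = trace(log(A)). The function name hutchinson_logdet is a hypothetical choice, and the explicit call to scipy.linalg.logm is for illustration only: forming log(A) is exactly the expensive step (obstacle (2) above) that Lanczos quadrature avoids in practice.

```python
import numpy as np
from scipy.linalg import logm

def hutchinson_logdet(A, num_samples=100, rng=None):
    """Estimate log(det(A)) for SPD A via Hutchinson's trace estimator.

    Averages x' * log(A) * x over Rademacher vectors x. Illustrative
    only: log(A) is formed explicitly here, whereas in practice each
    quadratic form would be approximated by Lanczos quadrature.
    """
    rng = np.random.default_rng(rng)
    n = A.shape[0]
    B = logm(A)  # B = log(A) may be indefinite even though A is SPD
    estimates = []
    for _ in range(num_samples):
        x = rng.choice([-1.0, 1.0], size=n)  # Rademacher sample
        estimates.append(x @ B @ x)
    return np.mean(estimates)

# Usage on a random SPD test matrix
rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)  # shift to ensure A is SPD
est = hutchinson_logdet(A, num_samples=200, rng=1)
exact = np.linalg.slogdet(A)[1]
print(f"estimate: {est:.4f}, exact: {exact:.4f}")
```

Replacing the Rademacher samples with rng.standard_normal(n) gives the Gaussian variant discussed in the talk; the analysis there must account for the fact that Gaussian vectors, unlike Rademacher vectors, are unbounded.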
This talk is based on joint work with Alice Cortinovis.