
Implement variance reduction in SLQ logdet backward pass. #1836

Merged
gpleiss merged 7 commits into master from piv_chol_func3 on Dec 21, 2021

Conversation

gpleiss
Member

@gpleiss gpleiss commented Nov 22, 2021

Based on "Reducing the Variance of Gaussian Process Hyperparameter Optimization with Preconditioning" by Wenger et al., 2021.

When using iterative methods (i.e. CG/SLQ) to compute the log determinant, the forward pass currently computes:
logdet(K) \approx logdet(P) + SLQ(P^{-1/2} K P^{-1/2}),
where P is a preconditioner, and SLQ is a stochastic estimate of the log determinant. If the preconditioner is a good approximation of K, then this forward pass can be seen as a form of variance reduction.
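
For intuition, here is a minimal numerical sketch (not GPyTorch's implementation) of the identity behind this forward pass, using a simple diagonal stand-in for the preconditioner and exact logdets in place of the SLQ estimate:

```python
import torch

torch.manual_seed(0)
n = 50

# Small dense stand-ins: a PSD "kernel" matrix K and a diagonal preconditioner P.
A = torch.randn(n, n, dtype=torch.float64)
K = A @ A.T + n * torch.eye(n, dtype=torch.float64)
P_diag = torch.diagonal(K)

# Identity behind the forward pass:
#   logdet(K) = logdet(P) + logdet(P^{-1/2} K P^{-1/2})
# In the actual code the second term is estimated with SLQ; here we compute it exactly.
P_inv_sqrt = torch.diag(P_diag.rsqrt())
whitened = P_inv_sqrt @ K @ P_inv_sqrt

lhs = torch.logdet(K)
rhs = P_diag.log().sum() + torch.logdet(whitened)
print(torch.allclose(lhs, rhs))  # True -- the split is exact; SLQ only adds estimator noise
```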

In this PR, we apply this same variance reduction strategy to the backward pass. We compute the backward pass as:
d logdet(K)/dtheta \approx d logdet(P)/dtheta + d SLQ/dtheta
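
As a sanity check on this split (again with exact logdets and a diagonal stand-in for P rather than SLQ and the pivoted-Cholesky preconditioner), autograd gives the same gradient for both forms:

```python
import torch

torch.manual_seed(0)
n = 30
A = torch.randn(n, n, dtype=torch.float64)
base = A @ A.T + n * torch.eye(n, dtype=torch.float64)

def logdet_direct(theta):
    K = theta * base + torch.eye(n, dtype=torch.float64)
    return torch.logdet(K)

def logdet_split(theta):
    # logdet(P) + logdet(P^{-1/2} K P^{-1/2}); autograd differentiates both terms,
    # mirroring d logdet(K)/dtheta = d logdet(P)/dtheta + d SLQ/dtheta.
    K = theta * base + torch.eye(n, dtype=torch.float64)
    d = torch.diagonal(K)
    P_inv_sqrt = torch.diag(d.rsqrt())
    return d.log().sum() + torch.logdet(P_inv_sqrt @ K @ P_inv_sqrt)

theta1 = torch.tensor(0.7, dtype=torch.float64, requires_grad=True)
theta2 = torch.tensor(0.7, dtype=torch.float64, requires_grad=True)
(g1,) = torch.autograd.grad(logdet_direct(theta1), theta1)
(g2,) = torch.autograd.grad(logdet_split(theta2), theta2)
print(torch.allclose(g1, g2))  # True -- the variance reduction does not bias the gradient
```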

TODOs:

  • Implement pivoted Cholesky as a torch.autograd.Function, so that we can compute backward passes through it (see the sketch after this list).
  • Redo the inv_quad_logdet function to apply variance reduction in both the forward and backward passes.
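
For the first TODO, the relevant mechanics are those of torch.autograd.Function. Below is a hedged sketch of that pattern on a simpler operation, a plain Cholesky-based log determinant with the closed-form gradient K^{-1}, rather than the pivoted Cholesky factorization this PR actually wraps:

```python
import torch

class CholeskyLogDet(torch.autograd.Function):
    # Illustrative only: forward computes logdet(K) from a Cholesky factor,
    # backward uses d logdet(K) = tr(K^{-1} dK), i.e. the gradient is K^{-1}
    # for symmetric K. The PR does the analogous thing for pivoted Cholesky.

    @staticmethod
    def forward(ctx, K):
        L = torch.linalg.cholesky(K)
        ctx.save_for_backward(L)
        return 2.0 * L.diagonal().log().sum()

    @staticmethod
    def backward(ctx, grad_output):
        (L,) = ctx.saved_tensors
        return grad_output * torch.cholesky_inverse(L)

# Quick check against the known gradient of logdet.
K0 = torch.randn(5, 5, dtype=torch.float64)
K0 = K0 @ K0.T + 5 * torch.eye(5, dtype=torch.float64)
K = K0.clone().requires_grad_(True)
CholeskyLogDet.apply(K).backward()
print(torch.allclose(K.grad, torch.inverse(K0)))  # True
```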

@gpleiss
Member Author

gpleiss commented Dec 14, 2021

@jacobrgardner @JonathanWenger ready for review

@gpleiss gpleiss changed the title [WIP] Implement variance reduction in SLQ logdet backward pass. Implement variance reduction in SLQ logdet backward pass. Dec 14, 2021
@gpleiss
Member Author

gpleiss commented Dec 14, 2021

(actually ready for review now. I just fixed broken tests.)

@jacobrgardner
Member

@gpleiss something seems off about how this is computing the preconditioner log determinant now. We're still computing it efficiently using the QR decomposition in the init_cache methods on AddedDiagLT, and then it looks like we discard that efficient computation and call logdet on precond_lt, which could be as bad as calling Cholesky on the full n x n preconditioner matrix, right? PsdSumLazyTensor doesn't override logdet?

Entirely possible I just missed the relevant code here...

@jacobrgardner
Member

Ideally, this would be the logdet value we'd return in the forward pass:

self._precond_logdet_cache = logdet.view(*batch_shape) if len(batch_shape) else logdet.squeeze()
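
To make the caching idea concrete, here is a hypothetical sketch (illustrative names, not GPyTorch's actual classes; the matrix determinant lemma stands in for the QR route mentioned above) of computing the preconditioner's logdet cheaply once and then returning the cached scalar instead of calling logdet on the full n x n preconditioner:

```python
import torch

class LowRankPlusDiagPreconditioner:
    # Hypothetical P = L L^T + diag(d), with L of shape (n, k) and k << n.
    def __init__(self, L, d):
        self.L, self.d = L, d
        k = L.shape[-1]
        # Matrix determinant lemma: logdet(L L^T + D) = logdet(I_k + L^T D^{-1} L) + logdet(D).
        # Costs O(n k^2) instead of the O(n^3) Cholesky of the dense n x n preconditioner.
        inner = torch.eye(k, dtype=L.dtype) + (L / d.unsqueeze(-1)).T @ L
        self._precond_logdet_cache = torch.logdet(inner) + d.log().sum()

    def logdet(self):
        return self._precond_logdet_cache  # reuse the cached scalar in the forward pass

n, k = 500, 10
L = torch.randn(n, k, dtype=torch.float64)
d = torch.rand(n, dtype=torch.float64) + 1.0
precond = LowRankPlusDiagPreconditioner(L, d)
dense = L @ L.T + torch.diag(d)
print(torch.allclose(precond.logdet(), torch.logdet(dense)))  # True
```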

Member

@jacobrgardner jacobrgardner left a comment


Looks solid to me now 👍

@gpleiss
Member Author

gpleiss commented Dec 21, 2021

I just profiled this PR on the KeOps example notebook, just to double check. It is just as fast as what is on master.

@gpleiss gpleiss merged commit 0907c95 into master Dec 21, 2021
@gpleiss gpleiss deleted the piv_chol_func3 branch December 21, 2021 00:53