
Need help with understanding gradient computation implementation in backward call in _inv_quad_logdet.py #2104

Answered by gpleiss
srinathdama asked this question in Q&A

However, the current implementation of _quad_form_derivative() for non_lazy_tensor computes only the single term ab^T, rather than the sum of the two terms given in the above equation. Please let me know whether I am doing something wrong in my derivation. An explanation of how the gradients w.r.t. the covariance matrix (i.e., matrix_arg_grads) are computed would also be helpful.
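
For reference, the two-term expression referred to above is presumably the standard gradient of the GP log marginal likelihood with respect to the covariance matrix $K$ (writing $\alpha = K^{-1} y$):

$$\frac{\partial \mathcal{L}}{\partial K} = \frac{1}{2}\left(\alpha \alpha^{\top} - K^{-1}\right) = \frac{1}{2}\left(K^{-1} y\, y^{\top} K^{-1} - K^{-1}\right)$$

By contrast, the derivative of a single quadratic form $a^{\top} K b$ with respect to $K$ is just the one outer product $a b^{\top}$.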

This is correct. The _quad_form_derivative method (and the whole LazyTensor abstraction) is designed to be as abstract as possible. This is because we want to be able to backpropagate through functions other than the log marginal likelihood of the GP (for example, we want to backpropagate through predictions made with lazy t…
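
As a minimal illustration of why a single outer product is all an abstract backward pass needs to supply, here is a plain PyTorch sketch (not GPyTorch internals; the variable names are made up for the example). For f(K) = a^T K b, autograd's gradient with respect to K is exactly a b^T; any additional terms in a specific objective such as the log marginal likelihood arise from the chain rule combining several such quad-form and solve contributions.

```python
# Minimal sketch (plain PyTorch, not GPyTorch internals): for a quadratic form
# f(K) = a^T K b, the gradient w.r.t. K is the single outer product a b^T.
# The two-term marginal-likelihood expression only appears once the chain rule
# combines several such quad-form/solve contributions.
import torch

n = 4
K = (torch.eye(n) + 0.1).requires_grad_(True)  # toy "covariance" matrix
a = torch.randn(n, 1)
b = torch.randn(n, 1)

quad_form = (a.transpose(-1, -2) @ K @ b).squeeze()  # scalar a^T K b
quad_form.backward()

# Autograd's gradient matches the analytic outer product a b^T
assert torch.allclose(K.grad, a @ b.transpose(-1, -2))
```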

Answer selected by srinathdama