gpytorch.lazy

LazyTensor

class gpytorch.lazy.LazyTensor(*args, **kwargs)[source]

Base class for LazyTensors in GPyTorch.

In GPyTorch, nearly all covariance matrices for Gaussian processes are handled internally as some variety of LazyTensor. A LazyTensor is an object that represents a tensor object, similar to torch.tensor, but typically differs in two ways:

  1. A tensor represented by a LazyTensor can typically be represented more efficiently than storing a full matrix. For example, a LazyTensor representing \(K=XX^{\top}\) where \(K\) is \(n \times n\) but \(X\) is \(n \times d\) might store \(X\) instead of \(K\) directly.
  2. A LazyTensor typically defines a matmul routine that computes \(KM\) more efficiently than a dense multiply would. Using the above example, computing \(KM=X(X^{\top}M)\) requires only \(O(nd)\) time for a vector \(M\), rather than the \(O(n^2)\) time required if we stored \(K\) directly (see the sketch below).
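
As a concrete sketch of point 2, in plain torch with no GPyTorch machinery (the variable names here are illustrative, not part of the API):

>>> import torch
>>> n, d, k = 500, 10, 5
>>> X = torch.randn(n, d)
>>> M = torch.randn(n, k)
>>> # Naive: materialize K = X X^T (n x n), then multiply
>>> naive = (X @ X.t()) @ M
>>> # Lazy: KM = X (X^T M) -- the n x n matrix is never formed
>>> lazy = X @ (X.t() @ M)
>>> torch.allclose(naive, lazy, rtol=1e-4, atol=1e-3)
True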

In order to define a new LazyTensor class that can be used as a covariance matrix in GPyTorch, a user must define at a minimum the following methods (in each example, \(K\) denotes the matrix that the LazyTensor represents):

  • _get_indices(), which returns a Tensor where the entries are determined by LongTensors of indices.
  • _matmul(), which performs a matrix multiplication \(KM\)
  • _quad_form_derivative(), which computes a quadratic form with the derivative, \(\mathbf{v}^{\top}\frac{dK}{dR}\mathbf{v}\), where \(R\) denotes the actual tensors used to represent \(K\). In the linear kernel example, \(K=XX^{\top}\), this would be \(\frac{dK}{dX}\). If \(K\) is a Toeplitz matrix (see gpytorch.lazy.ToeplitzLazyTensor) represented by its first column \(\mathbf{c}\), this would return \(\mathbf{v}^{\top}\frac{dK}{d\mathbf{c}}\mathbf{v}\).
  • _size(), which returns a torch.Size containing the dimensions of \(K\).
  • _transpose_nonbatch(), which returns a transposed version of the LazyTensor

In addition to these, a LazyTensor may need to override the default _transpose_nonbatch() and _get_indices() behavior in special cases. See the documentation for these methods for details.
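
Putting these together, below is a minimal sketch of such a class for the linear kernel example \(K=XX^{\top}\). This is illustrative only: the exact signatures of these private methods vary between GPyTorch versions, so treat it as a guide to the structure rather than as copy-paste code.

import torch
from gpytorch.lazy import LazyTensor

class OuterProductLazyTensor(LazyTensor):
    # Represents K = X X^T without ever forming the n x n matrix.
    def __init__(self, X):
        super(OuterProductLazyTensor, self).__init__(X)
        self.X = X

    def _matmul(self, rhs):
        # K M = X (X^T M): O(ndk) rather than O(n^2 k)
        return self.X @ (self.X.transpose(-1, -2) @ rhs)

    def _size(self):
        return torch.Size((self.X.size(-2), self.X.size(-2)))

    def _transpose_nonbatch(self):
        return self  # K = X X^T is symmetric

    def _get_indices(self, left_indices, right_indices):
        # K[i, j] = <X[i], X[j]>
        return (self.X[left_indices] * self.X[right_indices]).sum(-1)

    def _quad_form_derivative(self, left_vecs, right_vecs):
        # d(v_l^T (X X^T) v_r)/dX = v_l v_r^T X + v_r v_l^T X
        deriv = (left_vecs @ right_vecs.transpose(-1, -2) @ self.X
                 + right_vecs @ left_vecs.transpose(-1, -2) @ self.X)
        return (deriv,)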

Note

The base LazyTensor class provides default implementations of many other operations in order to mimic the behavior of a standard tensor as closely as possible. For example, we provide default implementations of __getitem__(), __add__(), etc that either make use of other lazy tensors or exploit the functions that must be defined above.

While these implementations are provided for convenience, it is advisable in many cases to override them for the sake of efficiency.

Note

LazyTensors are designed by default to optionally represent batches of matrices. Thus, the size of a LazyTensor may be (for example) \(b \times n \times n\). Many of the methods are designed to efficiently operate on these batches if present.

add_diag(diag)[source]

Adds a scalar to each element of the diagonal of the matrix this LazyTensor represents.

Args:
  • diag (Scalar Tensor)
add_jitter(jitter_val=0.001)[source]

Adds jitter (i.e., a small diagonal component) to the matrix this LazyTensor represents. A subclass could implement this as a no-op, but doing so may lead to numerical instabilities, so it should only be done at the user's own risk.
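
A typical usage pattern (covar and rhs here are assumed names, not part of this API): stabilize a nearly singular covariance before a solve.

>>> covar = covar.add_jitter(1e-4)   # represents K + 1e-4 * I
>>> solve = covar.inv_matmul(rhs)    # better conditioned than before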

batch_dim

Returns the dimension of the shape over which the tensor is batched.

batch_shape

Returns the shape over which the tensor is batched.

clone()[source]

Clones the LazyTensor (creates clones of all underlying tensors)

cpu()[source]
Returns:
LazyTensor: a new LazyTensor identical to self, but on the CPU.
cuda(device_id=None)[source]

This method operates identically to torch.nn.Module.cuda().

Args:
device_id (str, optional):
Device ID of GPU to use.
Returns:
LazyTensor:
a new LazyTensor identical to self, but on the GPU.
detach()[source]

Removes the LazyTensor from the current computation graph. (In practice, this function removes all Tensors that make up the LazyTensor from the computation graph.)

detach_()[source]

An in-place version of detach.

diag()[source]

As torch.diag(), returns the diagonal of the matrix \(K\) this LazyTensor represents as a vector.

Returns:
torch.tensor: The diagonal of \(K\). If \(K\) is \(n \times n\), this will be a length n vector. If this LazyTensor represents a batch (e.g., is \(b \times n \times n\)), this will be a \(b \times n\) matrix of diagonals, one for each matrix in the batch.
dim()[source]

Alias of ndimension()

evaluate()[source]

Explicitly evaluates the matrix this LazyTensor represents. This function should return a Tensor storing an exact representation of this LazyTensor.

evaluate_kernel()[source]

Returns a new LazyTensor representing the same matrix as this one, but with all lazily evaluated kernels actually evaluated.

inv_matmul(tensor)[source]

Computes a linear solve (w.r.t self = \(K\)) with several right hand sides \(M\).

Args:
  • tensor (torch.tensor, n x k) - Matrix \(M\) of right hand sides
Returns:
  • torch.tensor - \(K^{-1}M\)
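
Example (a hedged sketch): the lazy solve should agree, up to iterative-solver tolerance, with a dense solve. Here K is an assumed positive-definite LazyTensor; small sizes only, since this materializes the matrix.

>>> rhs = torch.randn(K.size(-1), 3)
>>> lazy_solve = K.inv_matmul(rhs)
>>> dense_solve = K.evaluate().inverse() @ rhs
>>> # lazy_solve and dense_solve should match to solver tolerance
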
inv_quad(tensor)[source]

Computes an inverse quadratic form (w.r.t. self = \(K\)) with several right hand sides, i.e. \(\mathrm{tr}\left(M^{\top} K^{-1} M\right)\).

NOTE: Don't override this function! Instead, override inv_quad_log_det()

Args:
  • tensor (torch.tensor, n x k) - Vector (or matrix) \(M\) for the inverse quadratic form
Returns:
  • tensor - \(\mathrm{tr}\left(M^{\top} K^{-1} M\right)\)
inv_quad_log_det(inv_quad_rhs=None, log_det=False, reduce_inv_quad=True)[source]

Computes an inverse quadratic form (w.r.t. self = \(K\)) with several right hand sides, i.e. \(\mathrm{tr}\left(M^{\top} K^{-1} M\right)\). In addition, optionally computes an (approximate) log determinant of the matrix.

Args:
  • inv_quad_rhs (torch.tensor, n x k) - Vector (or matrix) \(M\) for the inverse quadratic form
Returns:
  • scalar - \(\mathrm{tr}\left(M^{\top} K^{-1} M\right)\)
  • scalar - log determinant
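
These two quantities are exactly what a zero-mean Gaussian log-likelihood needs. A hedged sketch, assuming covar is an n x n covariance LazyTensor and y a length-n observation tensor (both names are assumptions, not part of this API):

>>> import math
>>> inv_quad, log_det = covar.inv_quad_log_det(inv_quad_rhs=y.unsqueeze(-1), log_det=True)
>>> n = y.size(-1)
>>> # log N(y | 0, K) = -0.5 * (y^T K^{-1} y + log|K| + n log(2 pi))
>>> log_prob = -0.5 * (inv_quad + log_det + n * math.log(2 * math.pi))
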
log_det()[source]

Computes an (approximate) log determinant of the matrix.

NOTE: Don't override this function! Instead, override inv_quad_log_det()

Returns:
  • scalar: log determinant
matmul(other)[source]

Multiplies self by a matrix

Args:
other (torch.tensor): Matrix or vector to multiply with. Can be either a torch.tensor
or a gpytorch.lazy.LazyTensor.
Returns:
torch.tensor: Tensor or LazyTensor containing the result of the matrix multiplication \(KM\), where \(K\) is the (batched) matrix that this gpytorch.lazy.LazyTensor represents, and \(M\) is the (batched) matrix input to this method.
matrix_shape

Returns the shape of the matrix being represented (without batching).

mul(other)[source]

Multiplies the matrix by a constant, or elementwise multiplies it by another matrix.

Args:
other (torch.tensor or LazyTensor): constant or matrix to elementwise multiply by.
Returns:
gpytorch.lazy.LazyTensor: Another lazy tensor representing the result of the multiplication. If other was a constant (or batch of constants), this will likely be a gpytorch.lazy.ConstantMulLazyTensor. If other was another matrix, this will likely be a gpytorch.lazy.MulLazyTensor.
mul_batch(mul_batch_size=None)[source]

For a b x n x m LazyTensor, compute the product over the batch dimension.

The mul_batch_size controls whether or not the batch dimension is grouped when multiplying.
  • mul_batch_size=None (default): The entire batch dimension is multiplied. Returns an n x m LazyTensor.
  • mul_batch_size=k: Creates b/k groups and multiplies over the k entries of each group.
    (The LazyTensor is reshaped as a b/k x k x n x m LazyTensor and the k dimension is multiplied over.) Returns a b/k x n x m LazyTensor.
Args:
mul_batch_size (int or None):
Controls the number of groups that are multiplied over (default: None).
Returns:
LazyTensor
Example:
>>> lazy_tensor = gpytorch.lazy.NonLazyTensor(torch.tensor([
...     [[2, 4], [1, 2]],
...     [[1, 1], [0, -1]],
...     [[2, 1], [1, 0]],
...     [[3, 2], [2, -1]],
... ]))
>>> lazy_tensor.mul_batch().evaluate()
>>> # Returns: torch.Tensor([[12, 8], [0, 0]])
>>> lazy_tensor.mul_batch(mul_batch_size=2).evaluate()
>>> # Returns: torch.Tensor([[[2, 4], [0, -2]], [[6, 2], [2, 0]]])
ndimension()[source]

Returns the number of dimensions

numel()[source]

Returns the number of elements

repeat(*sizes)[source]

Repeats this tensor along the specified dimensions.

Currently, this only works to create repeated batches of a 2D LazyTensor. I.e. all calls should be lazy_tensor.repeat(<size>, 1, 1).

Example:
>>> lazy_tensor = gpytorch.lazy.ToeplitzLazyTensor(torch.tensor([4., 1., 0.5]))
>>> lazy_tensor.repeat(2, 1, 1).evaluate()
tensor([[[4.0000, 1.0000, 0.5000],
         [1.0000, 4.0000, 1.0000],
         [0.5000, 1.0000, 4.0000]],
        [[4.0000, 1.0000, 0.5000],
         [1.0000, 4.0000, 1.0000],
         [0.5000, 1.0000, 4.0000]]])
representation()[source]

Returns the Tensors that are used to define the LazyTensor

representation_tree()[source]

Returns a gpytorch.lazy.LazyTensorRepresentationTree tree object that recursively encodes the representation of this lazy tensor. In particular, if the definition of this lazy tensor depends on other lazy tensors, the tree is an object that can be used to reconstruct the full structure of this lazy tensor, including all subobjects. This is used internally.

requires_grad_(val)[source]

Sets requires_grad=val on all the Tensors that make up the LazyTensor. This is an in-place operation.

root_decomposition()[source]

Returns a (usually low-rank) root decomposition lazy tensor of a PSD matrix. This can be used for sampling from a Gaussian distribution, or for obtaining a low-rank version of a matrix.
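
One common use: if \(RR^{\top} \approx K\), then \(R\mathbf{z}\) with \(\mathbf{z} \sim \mathcal{N}(0, I)\) is an approximate draw from \(\mathcal{N}(0, K)\). A sketch, assuming the method returns the root factor \(R\) itself (the exact return type differs across GPyTorch versions, so check your release):

>>> root = covar.root_decomposition()   # assumed: R with R R^T ~= covar
>>> z = torch.randn(root.size(-1))
>>> sample = root.matmul(z)             # approximate draw from N(0, covar)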

root_decomposition_size()[source]

This is the inner size of the root decomposition. It is primarily used to determine whether it would be cheaper to compute a different root.

root_inv_decomposition(initial_vectors=None, test_vectors=None)[source]

Returns a (usually low-rank) root decomposition lazy tensor of the inverse of a PSD matrix. This can be used for sampling from a Gaussian distribution, or for obtaining a low-rank version of a matrix inverse.

size(val=None)[source]

Returns the size of the resulting Tensor that the lazy tensor represents

sum_batch(sum_batch_size=None)[source]

Sum the b x n x m LazyTensor over the batch dimension.

The sum_batch_size controls whether or not the batch dimension is grouped when summing.
  • sum_batch_size=None (default): The entire batch dimension is summed. Returns an n x m LazyTensor.
  • sum_batch_size=k: Creates b/k groups and sums the k entries of each group.
    (The LazyTensor is reshaped as a b/k x k x n x m LazyTensor and the k dimension is summed over.) Returns a b/k x n x m LazyTensor.
Args:
sum_batch_size (int or None):
Controls the number of groups that are summed over (default: None).
Returns:
LazyTensor
Example:
>>> lazy_tensor = gpytorch.lazy.NonLazyTensor(torch.tensor([
...     [[2, 4], [1, 2]],
...     [[1, 1], [0, -1]],
...     [[2, 1], [1, 0]],
...     [[3, 2], [2, -1]],
... ]))
>>> lazy_tensor.sum_batch().evaluate()
>>> # Returns: torch.Tensor([[8, 8], [4, 0]])
>>> lazy_tensor.sum_batch(sum_batch_size=2).evaluate()
>>> # Returns: torch.Tensor([[[3, 5], [1, 1]], [[5, 3], [3, -1]]])
t()[source]

Alias of transpose() for 2D LazyTensor. (Transposes the two dimensions.)

to(device_id)[source]

A device-agnostic method of moving the lazy_tensor to the specified device.

Args:
device_id (torch.device): Which device to use (GPU or CPU).
Returns:
LazyTensor: New LazyTensor identical to self on specified device
transpose(dim1, dim2)[source]

Transpose the dimensions dim1 and dim2 of the LazyTensor.

Example:
>>> lazy_tensor = gpytorch.lazy.NonLazyTensor(torch.randn(3, 5))
>>> lazy_tensor.transpose(0, 1)
zero_mean_mvn_samples(num_samples)[source]

Assumes that self is a covariance matrix, or a batch of covariance matrices, and returns samples from a zero-mean MVN with self as the covariance matrix.

self should be symmetric, either (batch_size x num_dim x num_dim) or (num_dim x num_dim).

Args:
num_samples (int):
Number of samples to draw.
Returns:
torch.tensor:
Samples from MVN (num_samples x batch_size x num_dim) or (num_samples x num_dim)
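
Conceptually this matches the standard dense recipe, sketched below in plain torch (covar and num_samples are assumed names; the dense version is exactly what the lazy one avoids for large n):

>>> K = covar.evaluate()                    # n x n symmetric PSD matrix
>>> L = torch.cholesky(K)                   # lower triangular, L L^T = K
>>> samples = torch.randn(num_samples, K.size(-1)) @ L.t()   # rows ~ N(0, K)
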
class gpytorch.lazy.BlockLazyTensor(base_lazy_tensor, num_blocks=None)[source]

An abstract LazyTensor class for block tensors. Subclasses determine how the different blocks are laid out (e.g. block diagonal, sum over blocks, etc.)

BlockLazyTensors represent the groups of blocks as a batched Tensor. For example, a k x n x n tensor represents k n x n blocks.

For a batched block tensor, the batch dimension is used to represent the actual (“true”) batches as well as the different blocks. For example, k b x n x n blocks would be represented as a bk x n x n Tensor, where the “outer” batch dimension represents the true batch dimension (i.e. - the Tensor could be viewed as a b x k x n x n Tensor without re-ordering).

For batch mode, the num_blocks attribute specifies the number of blocks (to differentiate them from true batches). This attribute should be None for non-batched Tensors.

Args:
  • base_lazy_tensor (LazyTensor):
    A k x n x n LazyTensor, or a bk x n x n LazyTensor, representing k blocks.
  • num_blocks (int or None):
    Set this to k for bk x n x n batched LazyTensors, or None for k x n x n unbatched LazyTensors.
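
The bk x n x n versus b x k x n x n correspondence is just a contiguous view, as this plain-torch sketch shows:

>>> b, k, n = 2, 3, 4
>>> blocks = torch.randn(b * k, n, n)    # the bk x n x n layout, as stored
>>> viewed = blocks.view(b, k, n, n)     # the b x k x n x n view: no copy, no re-ordering
>>> torch.equal(viewed[1, 2], blocks[1 * k + 2])
True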

Kernel LazyTensors

class gpytorch.lazy.LazyEvaluatedKernelTensor(kernel, x1, x2, batch_dims=None, squeeze_row=False, squeeze_col=False, **params)[source]
diag()[source]

Getting the diagonal of a kernel can be handled more efficiently by transposing the batch and data dimension before calling the kernel. Implementing it this way allows us to compute predictions more efficiently in cases where only the variances are required.

evaluate_kernel()[source]

NB: This is a meta LazyTensor, in the sense that evaluate can return a LazyTensor if the kernel being evaluated does so.

Structured LazyTensors

BlockDiagLazyTensor

class gpytorch.lazy.BlockDiagLazyTensor(base_lazy_tensor, num_blocks=None)[source]

Represents a lazy tensor that is the block diagonal of square matrices. For example, a k x n x n tensor represents k n x n blocks. Therefore, all the block diagonal components must be the same lazy tensor type and size.

For a BlockDiagLazyTensor in batch mode, the batch dimension is used to represent the actual (“true”) batches as well as the different blocks. For example, k b x n x n blocks would be represented as a bk x n x n Tensor, where the “outer” batch dimension represents the true batch dimension (i.e. - the Tensor could be viewed as a b x k x n x n Tensor without re-ordering).

For batch mode, the num_blocks attribute specifies the number of blocks (to differentiate them from true batches). This attribute should be None for non-batched Tensors.

Args:
base_lazy_tensor (LazyTensor):
A k x n x n LazyTensor, or a bk x n x n LazyTensor, representing k blocks.
num_blocks (int or None):
Set this to k for bk x n x n batched LazyTensors, or None for k x n x n unbatched LazyTensors.
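
For intuition, evaluating should match assembling the dense block diagonal by hand; a hedged sanity check based on the documented semantics:

>>> blocks = torch.randn(3, 2, 2)        # k = 3 blocks, each 2 x 2
>>> lazy = gpytorch.lazy.BlockDiagLazyTensor(gpytorch.lazy.NonLazyTensor(blocks))
>>> dense = torch.zeros(6, 6)
>>> for i in range(3):
...     dense[2 * i:2 * i + 2, 2 * i:2 * i + 2] = blocks[i]
>>> # lazy.evaluate() should equal dense (a 6 x 6 block-diagonal matrix)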

CholLazyTensor

class gpytorch.lazy.CholLazyTensor(chol)[source]

DiagLazyTensor

class gpytorch.lazy.DiagLazyTensor(diag)[source]

MatmulLazyTensor

class gpytorch.lazy.MatmulLazyTensor(left_lazy_tensor, right_lazy_tensor)[source]

RootLazyTensor

class gpytorch.lazy.RootLazyTensor(root)[source]

NonLazyTensor

class gpytorch.lazy.NonLazyTensor(tsr)[source]

ToeplitzLazyTensor

class gpytorch.lazy.ToeplitzLazyTensor(column)[source]
diag()[source]

Gets the diagonal of the Toeplitz matrix wrapped by this object.
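
Since a Toeplitz matrix is constant along its diagonals, the main diagonal is simply the first entry of the defining column repeated n times:

>>> column = torch.tensor([4., 1., 0.5])
>>> gpytorch.lazy.ToeplitzLazyTensor(column).diag()
>>> # Returns: tensor([4., 4., 4.]) -- column[0] repeated along the diagonal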

ZeroLazyTensor

class gpytorch.lazy.ZeroLazyTensor(*sizes, dtype=None, device=None)[source]

Special LazyTensor representing zero.

Composition/Decoration LazyTensors

AddedDiagLazyTensor

class gpytorch.lazy.AddedDiagLazyTensor(*lazy_tensors)[source]

A SumLazyTensor, but of only two lazy tensors, the second of which must be a DiagLazyTensor.

ConstantMulLazyTensor

class gpytorch.lazy.ConstantMulLazyTensor(base_lazy_tensor, constant)[source]

A LazyTensor that multiplies a base LazyTensor by a scalar constant:

constant_mul_lazy_tensor = constant * base_lazy_tensor

Note

To element-wise multiply two lazy tensors, see gpytorch.lazy.MulLazyTensor

Args:
  • base_lazy_tensor (LazyTensor, n x m or b x n x m): The base lazy tensor
  • constant (Tensor): The constant

If base_lazy_tensor represents a matrix (non-batch), then constant must be a 0D tensor, or a 1D tensor with one element.

If base_lazy_tensor represents a batch of matrices (b x m x n), then constant can be either:
  • A 0D tensor - the same constant is applied to all matrices in the batch
  • A 1D tensor with one element - the same constant is applied to all matrices
  • A 1D tensor with b elements - a different constant is applied to each matrix

Example:

>>> base_base_lazy_tensor = gpytorch.lazy.ToeplitzLazyTensor(torch.tensor([1., 2., 3.]))
>>> constant = torch.tensor(1.2)
>>> new_base_lazy_tensor = gpytorch.lazy.ConstantMulLazyTensor(base_base_lazy_tensor, constant)
>>> new_base_lazy_tensor.evaluate()
>>> # Returns:
>>> # [[ 1.2, 2.4, 3.6 ]
>>> #  [ 2.4, 1.2, 2.4 ]
>>> #  [ 3.6, 2.4, 1.2 ]]
>>>
>>> base_base_lazy_tensor = gpytorch.lazy.ToeplitzLazyTensor(torch.tensor([[1., 2., 3.], [2., 3., 4.]]))
>>> constant = torch.tensor([1.2, 0.5])
>>> new_base_lazy_tensor = gpytorch.lazy.ConstantMulLazyTensor(base_base_lazy_tensor, constant)
>>> new_base_lazy_tensor.evaluate()
>>> # Returns:
>>> # [[[ 1.2, 2.4, 3.6 ]
>>> #   [ 2.4, 1.2, 2.4 ]
>>> #   [ 3.6, 2.4, 1.2 ]]
>>> #  [[ 1, 1.5, 2 ]
>>> #   [ 1.5, 1, 1.5 ]
>>> #   [ 2, 1.5, 1 ]]]

InterpolatedLazyTensor

class gpytorch.lazy.InterpolatedLazyTensor(base_lazy_tensor, left_interp_indices=None, left_interp_values=None, right_interp_indices=None, right_interp_values=None)[source]

KroneckerProductLazyTensor

class gpytorch.lazy.KroneckerProductLazyTensor(*lazy_tensors)[source]

MulLazyTensor

class gpytorch.lazy.MulLazyTensor(*lazy_tensors)[source]
representation()[source]

Returns the Tensors that are used to define the LazyTensor

PsdSumLazyTensor

class gpytorch.lazy.PsdSumLazyTensor(*lazy_tensors)[source]

A SumLazyTensor, but where every component of the sum is positive semi-definite

SumBatchLazyTensor

class gpytorch.lazy.SumBatchLazyTensor(base_lazy_tensor, num_blocks=None)[source]

Represents a lazy tensor that is actually the sum of several lazy tensor blocks. For example, a k x n x n tensor represents k n x n blocks. Therefore, all the blocks must be the same lazy tensor type and size.

For a SumBatchLazyTensor in batch mode, the batch dimension is used to represent the actual (“true”) batches as well as the different blocks. For example, k b x n x n blocks would be represented as a bk x n x n Tensor, where the “outer” batch dimension represents the true batch dimension (i.e. - the Tensor could be viewed as a b x k x n x n Tensor without re-ordering).

For batch mode, the num_blocks attribute specifies the number of blocks (to differentiate them from true batches). This attribute should be None for non-batched Tensors.

Args:
base_lazy_tensor (LazyTensor):
A k x n x n LazyTensor, or a bk x n x n LazyTensor, representing k blocks.
num_blocks (int or None):
Set this to k for bk x n x n batched LazyTensors, or None for k x n x n unbatched LazyTensors.
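
As a sanity check of the documented semantics (a hedged sketch, not verbatim library output):

>>> blocks = torch.randn(4, 3, 3)        # k = 4 blocks, each 3 x 3
>>> lazy = gpytorch.lazy.SumBatchLazyTensor(gpytorch.lazy.NonLazyTensor(blocks))
>>> # lazy.evaluate() should equal blocks.sum(0), the elementwise sum of the k blocks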