Description
My data has shape [B, N, D], where B is the batch size, N is the sequence length of each sample, and D is the number of feature channels.
Before feeding the data into the approximate Gaussian process, I flatten the first two dimensions to get [BN, D]. The output of the Gaussian process then has shape [BN, T], where T is the number of tasks. Is there a problem with this approach? I have two concerns:
- After flattening, the covariance matrix is computed between all BN points in the mini-batch. In fact, my samples are unrelated to each other, so there is no need to compute covariances between points from different samples.
- To avoid this, the only option I see is a for loop over the batch dimension that applies the same Gaussian process to each sample, one at a time. Is this reasonable?
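
To make the two options concrete, here is a minimal sketch in plain Python. It is only an illustration, not my actual model: the RBF kernel, the toy data, and the helper names `rbf`, `gram`, and `diag_block` are all assumptions. It builds the full [BN, BN] covariance from the flattened input and the per-sample [N, N] covariances from the loop, then checks that the per-sample matrices are exactly the diagonal blocks of the full one.

```python
import math

def rbf(x, y, lengthscale=1.0):
    """Squared-exponential kernel between two feature vectors (assumed kernel)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-0.5 * sq_dist / lengthscale ** 2)

def gram(points):
    """Covariance (Gram) matrix over a list of feature vectors."""
    return [[rbf(p, q) for q in points] for p in points]

def diag_block(matrix, b, n):
    """Extract the b-th n-by-n block on the diagonal of a square matrix."""
    return [row[b * n:(b + 1) * n] for row in matrix[b * n:(b + 1) * n]]

B, N, D = 2, 3, 2

# Toy data with shape [B, N, D] (hypothetical values).
data = [[[0.5 * (b + n + d) for d in range(D)] for n in range(N)]
        for b in range(B)]

# Option 1: flatten to [B*N, D]; the covariance is [B*N, B*N] and
# includes cross-sample entries.
flat = [row for sample in data for row in sample]
full_cov = gram(flat)

# Option 2: loop over the batch dimension; one [N, N] covariance per sample.
per_sample_cov = [gram(sample) for sample in data]

# The per-sample matrices coincide with the diagonal blocks of the full one.
blocks_match = all(diag_block(full_cov, b, N) == per_sample_cov[b]
                   for b in range(B))

full_entries = (B * N) ** 2   # kernel entries computed after flattening
block_entries = B * N * N     # kernel entries computed by the loop
```

In this sketch the flattened version evaluates B times as many kernel entries as the loop, and the extra ones are the cross-sample covariances I do not need. If the Gaussian process library accepts a leading batch dimension, passing [B, N, D] directly might produce the same B blocks without a Python loop, but I am not sure whether that applies to approximate models.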