Awesome work!
I am working with time-series data for a regression task (feature extraction / representation learning), where
$$X \in \mathbb{R}^{S \times L}, \quad Y \in \mathbb{R}^{S}$$
and we model $f_\theta(X)=Y$.
I wonder whether this is similar to the TSLANet classification variant, and whether it is feasible to quickly adapt the model architecture from classification to this regression task by simply replacing the final output projection layer.
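
To make the question concrete, here is a minimal PyTorch sketch of the kind of swap I have in mind. It is not TSLANet's actual code: `ToyClassifier`, its `backbone`, and the attribute name `head` are hypothetical stand-ins, and I am assuming that $S$ is the number of samples and $L$ the sequence length.

```python
import torch
import torch.nn as nn


class ToyClassifier(nn.Module):
    """Hypothetical stand-in for a classifier: a backbone plus a linear head."""

    def __init__(self, seq_len: int, emb_dim: int, num_classes: int):
        super().__init__()
        # Placeholder feature extractor; the real backbone would go here.
        self.backbone = nn.Sequential(nn.Linear(seq_len, emb_dim), nn.GELU())
        # Final projection to class logits (attribute name is an assumption).
        self.head = nn.Linear(emb_dim, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.head(self.backbone(x))


S, L = 8, 64  # assumed: S samples, each a series of length L
model = ToyClassifier(seq_len=L, emb_dim=32, num_classes=5)

# Swap the classification head for a single-output regression head.
model.head = nn.Linear(model.head.in_features, 1)

X = torch.randn(S, L)
Y = torch.randn(S)

pred = model(X).squeeze(-1)              # shape (S,), matching Y
loss = nn.functional.mse_loss(pred, Y)   # MSE instead of cross-entropy
loss.backward()
```

In this sketch the only changes are the output dimension of the last layer and the loss function; whether the real model needs anything beyond that is exactly what I am unsure about.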