Kernel-based probabilistic SAX (pSAX)

MATLAB implementation of the pSAX time-series symbolic representation. Tested with MATLAB versions R2018b and R2019a.

Table of Contents

  1. Introduction
  2. Files Description
  3. Datasets
  4. Installation and Execution Instructions

Introduction

The pSAX (Kernel-based Probabilistic SAX) method [1], [2] extends the well-known SAX (Symbolic Aggregate Approximation) method [3] for time-series dimensionality reduction. Its main contribution is a SAX-based representation that adapts directly to the underlying probability distribution of the time-series data, which yields a more accurate symbolic approximation. Accuracy has been measured against conventional SAX using the Tightness of Lower Bound metric, which is significant for database performance, and the Mean Squared Error.
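For orientation, conventional SAX and the Tightness of Lower Bound (TLB) can be sketched in a few lines. This is an illustrative Python sketch using only the standard library, not code from this repository (which is MATLAB). All function names (znorm, paa, gaussian_breakpoints, symbolize, mindist) and the toy sine/cosine series are assumptions made for the example. It follows the SAX definitions in [3]: z-normalize, reduce with PAA, discretize with breakpoints that split a standard Gaussian into equiprobable regions, and compute TLB as MINDIST divided by the true Euclidean distance.

```python
import bisect
import math
from statistics import NormalDist, fmean, pstdev

def znorm(x):
    """Z-normalize a series to zero mean and unit variance."""
    mu, sigma = fmean(x), pstdev(x)
    return [(v - mu) / sigma for v in x]

def paa(x, w):
    """Piecewise Aggregate Approximation: mean of w equal-length segments.
    Assumes len(x) is divisible by w."""
    seg = len(x) // w
    return [fmean(x[i * seg:(i + 1) * seg]) for i in range(w)]

def gaussian_breakpoints(a):
    """a - 1 breakpoints that cut N(0, 1) into a equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(k / a) for k in range(1, a)]

def symbolize(values, breakpoints):
    """Map each value to the index (0 .. a-1) of the region it falls in."""
    return [bisect.bisect_left(breakpoints, v) for v in values]

def mindist(s, t, breakpoints, n):
    """SAX MINDIST between two symbol sequences of original length n.
    Adjacent or equal symbols contribute zero distance."""
    total = 0.0
    for r, c in zip(s, t):
        if abs(r - c) > 1:
            gap = breakpoints[max(r, c) - 1] - breakpoints[min(r, c)]
            total += gap * gap
    return math.sqrt(n / len(s)) * math.sqrt(total)

# Toy example (hypothetical data): one period of a sine and a cosine.
n, w, a = 16, 4, 4
q = znorm([math.sin(2 * math.pi * i / n) for i in range(n)])
c = znorm([math.cos(2 * math.pi * i / n) for i in range(n)])
beta = gaussian_breakpoints(a)
sax_q = symbolize(paa(q, w), beta)
sax_c = symbolize(paa(c, w), beta)
# TLB = lower-bounding distance / true distance; closer to 1 is tighter.
tlb = mindist(sax_q, sax_c, beta, n) / math.dist(q, c)
print(sax_q, sax_c, tlb)
```

The Gaussian breakpoints are the part that a distribution-adaptive representation would replace: when the data are not Gaussian, fixed equiprobable Gaussian regions fit the values less well.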

Files Description

This project consists of the following components:

  • pSAX, pSAX_overlap: Main functions; use whichever suits your application. pSAX.m transforms the dataset with non-overlapping windows, whereas pSAX_overlap.m transforms every possible subsequence (including overlapping ones) separately.
  • tsPAA: (c) 2003, E. Keogh, J. Lin, S. Lonardi, P. Patel, L. Wei. Time-series to PAA approximation. Original file with minor modifications.
  • timeseries2symbol: (c) 2003, E. Keogh, J. Lin, S. Lonardi, P. Patel, L. Wei. Computes SAX representation of the data. Original file with minor modifications.
  • mvksdensity, statskcompute, statskernelinfo: (c) 2015-2016 The MathWorks, Inc. These are MATLAB source files that are called from the built-in function 'ksdensity'. We modified them to i) allow estimating the density at an arbitrarily large number of points (previously limited to 100) and ii) fix the optimal smoothness parameter estimation for the Epanechnikov kernel, which was previously set for the Gaussian kernel only. See https://www.mathworks.com/help/stats/ksdensity.html for more info.
  • lloydmax: Lloyd-Max quantizer. Quantizes according to a probability density function.
  • k-means++: The k-means++ algorithm for initializing k-means. Taken from the k-means file by Laurent S. (https://www.mathworks.com/matlabcentral/fileexchange/28804-k-means), version 1.7.0.0.
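To illustrate how a kernel density estimate and a Lloyd-Max quantizer can fit together, here is a minimal Python sketch using only the standard library. It is not a port of the MATLAB files listed above. The function names (epanechnikov_kde, lloyd_max), the hand-picked bandwidth h, the uniform initialization of the levels, and the synthetic bimodal data are all assumptions made for the example. The density is estimated on a grid with the Epanechnikov kernel, then the quantizer alternates between placing boundaries at midpoints of adjacent levels and moving each level to the density-weighted centroid of its cell.

```python
import bisect
import random

def epanechnikov_kde(samples, grid, h):
    """Kernel density estimate on `grid` with the Epanechnikov kernel
    K(u) = 0.75 * (1 - u^2) for |u| <= 1, and bandwidth h."""
    n = len(samples)
    dens = []
    for g in grid:
        s = 0.0
        for x in samples:
            u = (g - x) / h
            if abs(u) <= 1.0:
                s += 0.75 * (1.0 - u * u)
        dens.append(s / (n * h))
    return dens

def midpoints(reps):
    """Decision boundaries halfway between adjacent representation levels."""
    return [(reps[k] + reps[k + 1]) / 2 for k in range(len(reps) - 1)]

def lloyd_max(grid, dens, levels, iters=200, tol=1e-9):
    """Lloyd-Max iteration on a density sampled at the points of `grid`."""
    lo, hi = grid[0], grid[-1]
    # Assumption: start from levels spread uniformly over the grid range.
    reps = [lo + (k + 0.5) * (hi - lo) / levels for k in range(levels)]
    for _ in range(iters):
        bounds = midpoints(reps)
        num = [0.0] * levels
        den = [0.0] * levels
        for g, p in zip(grid, dens):
            k = bisect.bisect_left(bounds, g)  # cell that contains g
            num[k] += g * p
            den[k] += p
        # Centroid of each cell; an empty cell keeps its previous level.
        new = [num[k] / den[k] if den[k] > 0 else reps[k]
               for k in range(levels)]
        shift = max(abs(x - y) for x, y in zip(new, reps))
        reps = new
        if shift < tol:
            break
    return midpoints(reps), reps

# Synthetic bimodal data (hypothetical), clearly non-Gaussian overall.
rng = random.Random(0)
samples = ([rng.gauss(-2.0, 0.5) for _ in range(300)]
           + [rng.gauss(2.0, 0.5) for _ in range(300)])
h = 0.5  # hand-picked bandwidth, not an optimal estimate
lo, hi = min(samples) - h, max(samples) + h
step = (hi - lo) / 399
grid = [lo + i * step for i in range(400)]
dens = epanechnikov_kde(samples, grid, h)
bounds, reps = lloyd_max(grid, dens, levels=4)
print(bounds, reps)
```

The resulting boundaries could play the role that the Gaussian breakpoints play in conventional SAX, with the difference that they follow the estimated density of the data.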

Datasets

A large collection of datasets is available at https://www.cs.ucr.edu/~eamonn/iSAX/iSAX.html

Installation and Execution Instructions

  1. Download the project's source files.
  2. Extract them, unmodified, into a single folder.
  3. Call either pSAX.m or pSAX_overlap.m with the appropriate inputs.

References

[1] K. Bountrogiannis, G. Tzagkarakis and P. Tsakalides, "Data-driven Kernel-based Probabilistic SAX for Time Series Dimensionality Reduction," 2020 28th European Signal Processing Conference (EUSIPCO), Amsterdam, 2021, pp. 2343-2347, doi: 10.23919/Eusipco47968.2020.9287311.

[2] K. Bountrogiannis, G. Tzagkarakis and P. Tsakalides, "Distribution Agnostic Symbolic Representations for Time Series Dimensionality Reduction and Online Anomaly Detection," in IEEE Transactions on Knowledge and Data Engineering, doi: 10.1109/TKDE.2022.3174630.

[3] J. Lin et al., "Experiencing SAX: A novel symbolic representation of time series," Data Min. Knowl. Disc., vol. 15, no. 2, pp. 107-144, 2007, doi: 10.1007/s10618-007-0064-z.

License

This code is released under GPL v.3.0. If you use this code in academic work, please cite at least one of the publications [1], [2] listed under References.
