Date: September 2024
Author: Francesco Rosnati, HPC Engineering Master's student at Politecnico di Milano
Course: Numerical Analysis for Machine Learning at Politecnico di Milano -- Professor Edie Miglio
This project explores the application of Compressed Sensing (CS) techniques to electrocardiogram (ECG) signal processing, specifically leveraging adaptive dictionary learning methods and the Kronecker technique for sparse signal recovery. In particular the goal is to explore soultions that would fit on portable remote Ecg machines which have low computational power and limited storage capacity.
Refer to the Report for a thorough review of the present study and it's results.
The goal of this project is to apply compressed sensing to ECG data and investigate the performance of adaptive dictionary learning (MOD and K-SVD) compared to fixed dictionaries (DCT), with and without the Kronecker technique. We aim to determine whether adaptive methods can outperform fixed dictionaries in terms of signal reconstruction quality. In particular it will be tested trying to simulate what would be possible to achieve with a remote portable Ecg machine, such apparatus limits the computational power in measurement phase and has linited torage capabilites. (Recovery is not limited because it doesn't have to happen on-chip, it can be done on more powerful machines in a second moment)
The present project is mainly ispired by the work "A compressed-sensing-based compressor for ECG." by Izadi V, Shahri PK, Ahani H. [1]
Compressed Sensing is a method that exploits the sparsity of signals to enable recovery from fewer measurements than what classical Nyquist sampling theory requires. In essence, CS leverages the fact that many real-world signals (
Given a sparse signal
Compressed sensing aims to recover
-
$x \in \mathbb{R}^n$ real signal coming from sensors -
$y \in \mathbb{R}^m$ compressed measurement -
$\Psi \in \mathbb{R}^{n \times n}$ is the dictionary (same as explained in previous section) -
$\Phi \in \mathbb{R}^{m \times n}$ with$m \ll n$ is the measurement matrix. -
$s \in \mathbb{R}^n$ is the sparse representation of$x$ in$\Psi$
Such system of equations is under-determined since there are infinitely many consistent solution
where
The optimization is non-convex, and in general, the solution can only be found with a brute-force search that is combinatorial in
In the present project different measurement matrices are explored: deterministic DBBD (deterministic binary block diagonal) and randomic gaussian or binary (binary both normalized and not). The idea is to test which dictionaries work best on a given the given measurement matrices.
The success of CS hinges on the sparsity of the signal. Sparsity refers to how efficiently a signal can be represented using only a few non-zero coefficients in a dictionary. A dictionary is a set of basis functions used to represent the signal in a sparse form.
There are two types of dictionaries:
- Fixed Dictionaries: Predefined, like the Discrete Cosine Transform (DCT), used as a baseline in this project.
- Adaptive Dictionary Learning: Methods that adapt the dictionary to the data, aiming for better representation. [2] Two key adaptive methods explored here are:
These adaptive methods aim to outperform fixed dictionaries by tailoring the dictionary to the specific signal.
The Smoothed L0 (SL0) algorithm is employed to solve the sparse recovery problem. SL0 approximates the L0 norm, which counts the number of non-zero coefficients, with a smooth function. It minimizes this approximation while maintaining accuracy in signal recovery. [6]
The Kronecker technique is used in this project to exploit the sparsity structure of the signal. It involves constructing a Kronecker product of smaller measurement matrices, which leads to computational efficiency and better recovery performance in certain scenarios. The project evaluates the performance of the Kronecker technique in conjunction with fixed and adaptive dictionaries. [5]
Refer to the Official Documentation Website for more in depth description.
The CompSensePack package is designed to handle the core tasks of compressed sensing and dictionary learning. It provides modular tools and high-level abstractions for applying compressive sensing techniques to various signals, with a focus on electrocardiogram (ECG) data in this project. The package is divided into two major components: the main package, which handles signal processing, sparse recovery, and general utilities, and the dictionaries subpackage, which is focused on dictionary learning techniques.
The main package offers a range of modules to perform the following tasks:
- SL0 Algorithm
- Measurement Matrix Generation
- Utilities
- Evaluation and Plotting
- Compressed Sensing Class
The Dictionaries Subpackage is a specialized part of the library (it's contained in the CompSensePack), dedicated to the generation and learning of dictionaries. Dictionaries are the set of basis functions in which the signal is sparsely represented, and they are central to the performance of compressed sensing. The subpackage offers both fixed dictionaries and adaptive dictionary learning algorithms, enabling flexibility in how the signal is represented and reconstructed.
- KSVD Dictionary Learning
- MOD Dictionary Learning
- OMP Algorithm
- DCT Dictionary
- Dictionary Utilities
The Scripts directory contains Python scripts to run the experiments and test the package functionality:
Main experiments runned
study_100m_signal.py
: Applies the compressed sensing techniques (DCT, MOD, K-SVD) to a specific ECG signal from the MIT-BIH Arrhythmia Database (Record 100), comparing the performance of the different methods.visualize_wfdb_signals.py
: Allows users to visualize the compressed sensing results on a signal from the MIT-BIH database using different dictionaries and measurement matrices.
Debug scripts to test single modules
testDictionaries.py
: Tests dictionary learning methods like DCT, MOD, and K-SVD.testEval.py
: Tests evaluation methods for comparing reconstructed signals.testMeasurementMatrix.py
: Tests different measurement matrices used in compressed sensing.testSL0.py
: Verifies the implementation and performance of the SL0 algorithm.testUtils.py
: Tests utility functions within the package.
- Instruction to run in local environment
- The notebook used during construction of the program is still available here.
if [[ "$VIRTUAL_ENV" != "" ]]; then
# Deactivate any active virtual environment if one exists
deactivate
# Remove any previous instance of .venvCsp
rm -r .venvCsp
fi
# Create and activate a new virtual environment, ensure 'python' refers to python3
python -m venv .venvCsp
source .venvCsp/bin/activate
# Deactivate any active virtual environment if one exists (Git Bash syntax)
if [[ "$VIRTUAL_ENV" != "" ]]; then
deactivate
rm -r .venvCsp
fi
# Create and activate a new virtual environment, ensure 'python' is python3
python -m venv .venvCsp
source .venvCsp/Scripts/activate
pip install .
Run from the root directory of the project!
- Test the various measurement matrices, dictionaries, Kronecker technique, and so on.
- Change the parameters at the bottom of the script in the
__main__
, feel free to experiment. It works with a single record, which is contained in data directory.
python Scripts/ecgStudy/study_100m_signal.py
- This script will process the signals and plot reconstructed version over the original. It's possible to download any record of the MIT-BIH Arrhythmia Database available through the wfdb library.
- Feel free to change parameters.
python Scripts/ecgStudy/visualize_wfdb_signals.py
If you'd like to contribute to this project, feel free to open issues and submit pull requests. Any improvements or additional features are welcome.
-
Izadi V, Shahri PK, Ahani H. "A compressed-sensing-based compressor for ECG." Biomed Eng Lett. 2020 Feb 6;10(2):299-307. doi: 10.1007/s13534-020-00148-7. PMID: 32431956; PMCID: PMC7235110.
-
Olshausen, B., Field, D. "Emergence of simple-cell receptive field properties by learning a sparse code for natural images." Nature 381, 607–609 (1996). doi: 10.1038/381607a0.
-
Engan, K., Aase, S., Husoy, J. "Method of Optimal Directions for frame design." ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings. 5. 2443 - 2446 vol.5. 1999. doi: 10.1109/ICASSP.1999.760624.
-
M. Aharon, M. Elad and A. Bruckstein. "K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation." IEEE Transactions on Signal Processing, vol. 54, no. 11, pp. 4311-4322, Nov. 2006. doi: 10.1109/TSP.2006.881199.
-
Zanddizari H, Rajan S, Zarrabi H. "Increasing the quality of reconstructed signal in compressive sensing utilizing Kronecker technique." Biomed Eng Lett. 2018 Jan 31;8(2):239-247. doi: 10.1007/s13534-018-0057-4. PMID: 30603207; PMCID: PMC6208527.
-
H. Mohimani, M. Babaie-Zadeh and C. Jutten. "A Fast Approach for Overcomplete Sparse Decomposition Based on Smoothed ℓ0 Norm." IEEE Transactions on Signal Processing, vol. 57, no. 1, pp. 289-301, Jan. 2009. doi: 10.1109/TSP.2008.2007606.
-
Brunton, S. L., Kutz, J. N. "Data-Driven Science and Engineering: Machine Learning, Dynamical Systems, and Control." Cambridge University Press, 2019. ISBN: 9781108422093. Available at Cambridge University Press.
- The possibility of "mantaining one or more of the initial atoms" in the Dictionary in the KSVD implementation is not working properly, it's is not a particularly relevant feature in Ecg Signal processing
- The record "100m.mat" is weird... It's the one which I performed the main study of best methods, but after the study I realized that the SNR values and the whole record do not match with the actual record from
wfdb
: in the future it would be advisable to repeat the study with an actual record fromwfdb