The following is a thesis project conducted within the research lines of the ULTRa laboratory at the University of Trento. Grounded in an ultrasound-based setting, its aim is to enable accurate classification of pulmonary fibrotic tissue size through the analysis of high-frequency probing signals subject to heavy scattering.
This project was born in the context of (quantitative) ultrasound lung spectroscopy applied to the diagnosis of pulmonary fibrosis. For detecting and classifying fibrotic accumulations, lung ultrasound spectroscopy (LUS) is a cheap and accessible candidate, especially when compared with techniques like high-resolution computed tomography (HRCT), which is ill-suited to rapid preliminary examinations. Unfortunately, LUS adoption is not free of obstacles: when high-frequency probing signals are transmitted, fibrotic tissues act as acoustic traps, and the reflected signals undergo heavy scattering. A direct consequence of this scattering is that classifying fibrosis size straight from the collected signals is far from straightforward, making LUS alone a less effective tool for detecting pathological conditions.
The aim of this project is to complement LUS with modern data-driven computational techniques, in order to restore fibrosis classification capabilities even in the presence of the aforementioned scattering effects. Since the purpose of ultrasound spectroscopy is to deliver insightful findings with cheap and accessible instrumentation, we favor techniques that are equally inexpensive and flexible. The field of hyperdimensional computing/vector symbolic architectures (HDC/VSA) provides many such techniques, and it is progressively gaining traction in several applications, including biomedical ones. By harnessing HDC/VSA modelling, we hope to overcome the limitations of LUS and establish it as a compelling technique for the practical detection of pulmonary pathologies.
Below is the list of macro-experiments, in the form of Jupyter notebooks, documenting the evolution of the project in its entirety.
- ETL pipeline: extract-transform-load procedures starting from Matlab simulation data (a rough sketch is given after this list)
- Knowledge base embedding: construction of queryable knowledge base embeddings via HDC/VSA
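
As a rough illustration of the ETL step only, the sketch below loads a Matlab simulation file and stores the refined signals as a NetCDF dataset via xarray. File names, variable keys and preprocessing choices are hypothetical, since the actual simulation layout and refinement steps live in the notebooks.

```python
# Minimal ETL sketch: .mat simulation output -> refined NetCDF dataset.
# File names, the "rf_signals" key and the preprocessing are illustrative only.
import numpy as np
import scipy.io
import xarray as xr

# Extract: load the raw Matlab simulation output.
raw = scipy.io.loadmat("simulations/fibrosis_run_01.mat")
signals = np.asarray(raw["rf_signals"], dtype=np.float64)  # shape: (n_signals, n_samples)

# Transform: trim the low-signal tail and normalize each signal to unit energy.
signals = signals[:, :4096]
energy = np.sqrt((signals ** 2).sum(axis=1, keepdims=True))
signals = signals / np.where(energy > 0, energy, 1.0)

# Load: wrap the result in an xarray Dataset and write it out as NetCDF.
ds = xr.Dataset(
    {"signal": (("index", "sample"), signals)},
    coords={"index": np.arange(signals.shape[0]),
            "sample": np.arange(signals.shape[1])},
)
ds.to_netcdf("datasets/post_etl/fibrosis_run_01.nc")
```
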
Below is the collection of simulation datasets, refined dataframes and HDC/VSA embeddings produced during the experiments. Aside from the data generated by Matlab simulations, which is in the .mat file format, all other datasets are in the .nc (NetCDF, version 5) file format.
- Matlab simulations: data generated by Matlab high-fidelity simulations
- Post-ETL datasets: datasets refined in various ways (Gaussian squashing, low-signal trimming, interpolative augmentation, normalization, ...)
- Signal embeddings: datasets of signal-encoded embeddings, in various HDC/VSA formats (mainly BSC, FHRR and HRR); a minimal encoding sketch is given below
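
To give a concrete flavor of the HDC/VSA side, the following minimal NumPy sketch encodes quantized signal samples into FHRR-style hypervectors by binding position and level atoms, bundles the encodings per class into a queryable memory, and classifies a new signal by similarity. Dimensions, class labels and the random placeholder signals are illustrative assumptions, not the project's actual encoders or data.

```python
# Minimal FHRR-style HDC/VSA sketch (illustrative only, not the project's pipeline).
import numpy as np

rng = np.random.default_rng(0)
DIM = 4096        # hypervector dimensionality
LEVELS = 16       # quantization levels for signal amplitudes
N_SAMPLES = 256   # samples kept per signal

def random_hv(n):
    # FHRR atoms: unit-modulus complex phasors with random phases.
    return np.exp(1j * rng.uniform(0, 2 * np.pi, size=(n, DIM)))

pos_hv = random_hv(N_SAMPLES)   # one hypervector per sample position
lvl_hv = random_hv(LEVELS)      # one hypervector per amplitude level

def encode(signal):
    # Quantize amplitudes, bind each level to its position, bundle by summing.
    s = np.asarray(signal)[:N_SAMPLES]
    q = np.clip(((s - s.min()) / (np.ptp(s) + 1e-12) * (LEVELS - 1)).astype(int),
                0, LEVELS - 1)
    return (pos_hv * lvl_hv[q]).sum(axis=0)

def similarity(a, b):
    # Cosine-like similarity on the real part of the inner product.
    return np.real(np.vdot(a, b)) / (np.linalg.norm(a) * np.linalg.norm(b))

# Queryable memory: bundle the encodings of each class's training signals.
# Random noise stands in for real (labeled) scattering signals here.
train = {"small_fibrosis": [rng.standard_normal(N_SAMPLES) for _ in range(10)],
         "large_fibrosis": [rng.standard_normal(N_SAMPLES) for _ in range(10)]}
memory = {label: sum(encode(sig) for sig in sigs) for label, sigs in train.items()}

# Classify an unseen signal by the nearest class prototype.
query = rng.standard_normal(N_SAMPLES)
prediction = max(memory, key=lambda label: similarity(encode(query), memory[label]))
print(prediction)
```

A BSC variant would use random binary vectors with XOR binding and majority-vote bundling, and an HRR variant real-valued vectors with circular convolution; the overall encode-bundle-query structure stays the same.
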
The project is released under the MIT license; see here.
