S-Edge: Efficient and interpretable raw audio classification with diagonal state space models

This repository provides the code for the paper

Efficient and interpretable raw audio classification with diagonal state space models
Matthias Bittner*, Daniel Schnöll*, Matthias Wess, Axel Jantsch. Mach Learn 114, 175 (2025), presented at ECMLPKDD'25

* Equal contribution

DOI-Paper Link

C++ implementation for S-Edge

Figure 1: The discrete recurrent forward path of our proposed S-Edge compared to the Original S5.

The S-Edge layer builds on the prior S5 paper. While S5 is implemented in JAX (s5-jax), we work in PyTorch and started from the PyTorch port of S5 (s5-pytorch), from which we used and modified only the base implementation of the SSM class.
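
As background for the figure above, the following is a minimal NumPy sketch of a generic S5-style diagonal state space recurrence with zero-order-hold (ZOH) discretization. It is not the S-Edge layer itself and does not reproduce the modifications shown in Figure 1; the function names, dimensions, and random parameters are assumptions chosen for illustration only.

```python
import numpy as np


def discretize_zoh(lam, B, delta):
    """ZOH discretization of a diagonal continuous-time SSM.

    lam:   (P,) complex eigenvalues (the diagonal state matrix)
    B:     (P, H) complex input matrix
    delta: (P,) positive step sizes
    """
    lam_bar = np.exp(lam * delta)                  # discrete eigenvalues
    B_bar = ((lam_bar - 1.0) / lam)[:, None] * B   # discrete input matrix
    return lam_bar, B_bar


def ssm_recurrence(lam_bar, B_bar, C, D, u):
    """Run x_k = lam_bar * x_{k-1} + B_bar @ u_k and y_k = Re(C @ x_k) + D * u_k."""
    x = np.zeros(lam_bar.shape[0], dtype=np.complex128)
    outputs = []
    for u_k in u:                                  # u has shape (L, H)
        x = lam_bar * x + B_bar @ u_k              # elementwise state update
        outputs.append((C @ x).real + D * u_k)
    return np.stack(outputs)


# Illustrative sizes and random parameters (assumptions, not values from the paper).
rng = np.random.default_rng(0)
P, H, L = 4, 2, 16
lam = -rng.uniform(0.1, 1.0, P) + 1j * rng.uniform(0.0, np.pi, P)  # Re(lam) < 0
B = rng.standard_normal((P, H)) + 1j * rng.standard_normal((P, H))
C = rng.standard_normal((H, P)) + 1j * rng.standard_normal((H, P))
D = rng.standard_normal(H)
delta = np.full(P, 0.1)

lam_bar, B_bar = discretize_zoh(lam, B, delta)
y = ssm_recurrence(lam_bar, B_bar, C, D, rng.standard_normal((L, H)))
print(y.shape)
```

Because the state matrix is diagonal, the state update is an elementwise multiply rather than a matrix product, which is what makes this family of models attractive for small devices.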

Requirements & Installation

To run the code on your machine, install the requirements by running python setup.py inside your virtual environment (venv).

Repository Structure

Directories and files within the repo

hyperparams                         Hyperparameter configurations
model_descriptions                  Architecture configurations S-Edge-{Full-Tiny}
src/                                Source code
    model           Models, RZOH regularization, etc.
    utils           Utilities for datasets, experiment manager, etc.

full_model_to_code_test.py          Performs training, export, conversion, and evaluation of S-Edge (with default values)
model_cpp_convert.py                Converts the exported weights into a C++ implementation
model_export.py                     Weight export for the C++ implementation
model_interpretability.ipynb        Interpretability of S-Edge (eigenvalues, transfer function, discretization errors)
queue_training.py                   Generates the queue for reproducing all results
setup.py                            Installs all requirements
test_sc_add_params_macs_to_test.py  Tests a whole results folder at inference frequencies of 16 kHz, 8 kHz, and 4 kHz
test_sc_cpp_impl.py                 Tests the C++ implementation as if it were a PyTorch model*
test_sc.py                          Tests a single model with layer-wise downsampling (step-scale sweep)
train_analysis_notebook.ipynb       Training history analysis of results
train_sc_from_queue.py              Trains the next model in the queue (generated with queue_training.py)
train_sc.py                         Trains a single model configuration

*The .so file for the C++ inference might not work properly unless it's compiled locally, since shared objects are often sensitive to system architecture, Python version, and linked libraries.

Directories that may be created on-the-fly:

data/               Speech Commands is downloaded here automatically on the first training run
results_journal/    Results for individual runs (train_sc.py)       
results_vsc/        When reproducing all model training runs (generated with queue_training.py)
export_model/       The default location of exported models (model_export.py)
cpp_model/          The default location for copying and generating the C++ implementation (model_cpp_convert.py)
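
Before converting exported weights to C++, it can help to check what was actually exported. The sketch below assumes that the export file (referred to as model_dict.npy in the workflow section) is a Python dictionary saved with np.save; that is an assumption based on the file name, not something confirmed here. The key names and the temporary path are hypothetical stand-ins so that the example runs on its own.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for an exported model: parameter name -> weight array.
# The real keys and shapes depend on model_export.py and are not known here.
example_dict = {
    "layer0.Lambda": np.array([-0.5 + 1.0j, -0.5 - 1.0j]),
    "layer0.D": np.ones(2, dtype=np.float32),
}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model_dict.npy")
    np.save(path, example_dict)  # a dict is stored as a pickled 0-d object array

    # allow_pickle=True is required for object arrays; .item() unwraps the dict.
    loaded = np.load(path, allow_pickle=True).item()

# Summarize each entry so shapes and dtypes can be checked before conversion.
summary = {name: (arr.shape, str(arr.dtype)) for name, arr in loaded.items()}
for name, (shape, dtype) in summary.items():
    print(f"{name}: shape={shape}, dtype={dtype}")
```

To inspect a real export, replace the temporary path with the file inside export_model/. Only load pickled files from sources you trust, since unpickling can execute arbitrary code.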

How to get from an untrained model to the C++ Implementation?

Choose a model configuration S-Edge-{L-Tiny}.
Train the model with train_sc.py.
In the results_journal folder, you can observe the training run under its run ID, e.g., Run#0.
Explore the analysis tools and methods.
Decide which downsampling (step-scale) configurations you would like to apply or test. test_sc.py runs a full sweep for a selected model.
model_export.py tests a model with different step scales and compares the PyTorch vs. NumPy accuracy.
Use the generated model_dict.npy with the C++ implementation for S-Edge by configuring and running model_cpp_convert.py.
If desired, modify the cpp_model/test_SEdge.cpp file to enable benchmarking data.
If porting to an MCU, check out the release STM implementation, especially nn.hpp and nn.cpp, for minimal overhead and C compatibility.
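
The idea behind the step-scale configurations in the steps above can be sketched in a few lines. The example assumes that a step scale multiplies the discretization step of a diagonal SSM, as step rescaling does in S5; whether S-Edge applies it in exactly this way per layer is an assumption. All values and names are illustrative.

```python
import numpy as np


def discretize_zoh(lam, b, delta):
    """ZOH discretization for diagonal eigenvalues lam and a scalar input channel."""
    lam_bar = np.exp(lam * delta)
    b_bar = (lam_bar - 1.0) / lam * b
    return lam_bar, b_bar


# Illustrative continuous-time parameters (not taken from a trained model).
lam = np.array([-0.3 + 2.0j, -0.1 + 0.5j])
b = np.array([1.0 + 0.0j, 0.5 - 0.5j])
delta = 0.01
step_scale = 2  # process every 2nd sample, so each step covers twice the time

a_fine, b_fine = discretize_zoh(lam, b, delta)
a_coarse, b_coarse = discretize_zoh(lam, b, step_scale * delta)

x0 = np.array([0.2 + 0.1j, -0.4 + 0.3j])
u = 0.7  # input held constant across the fine steps

# step_scale fine steps at the original rate ...
x_fine = x0
for _ in range(step_scale):
    x_fine = a_fine * x_fine + b_fine * u

# ... versus one coarse step with the rescaled discretization.
x_coarse = a_coarse * x0 + b_coarse * u

print(np.allclose(x_fine, x_coarse))
```

The two states agree exactly only because the input is held constant between the coarse samples. Real audio changes between samples, so running at a lower rate trades some accuracy for fewer operations, which is presumably why a sweep over step scales is useful before choosing a configuration.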

Citation

Please use the following when citing our work:

@article{bittner2025sedge,
  author       = {Matthias Bittner and
                  Daniel Schnöll and
                  Matthias Wess and
                  Axel Jantsch},
  title        = {Efficient and interpretable raw audio classification with diagonal
                  state space models},
  journal      = {Mach. Learn.},
  volume       = {114},
  number       = {8},
  pages        = {175},
  year         = {2025},
  url          = {https://doi.org/10.1007/s10994-025-06807-z},
  doi          = {10.1007/S10994-025-06807-Z},
  timestamp    = {Sun, 29 Jun 2025 21:59:59 +0200},
  biburl       = {https://dblp.org/rec/journals/ml/BittnerSWJ25.bib},
  bibsource    = {dblp computer science bibliography, https://dblp.org}
}

Do not hesitate to reach out if you have any questions.
