Motion sequence classification using geometric approaches like Procrustes analysis demonstrates high accuracy but suffers from computational inefficiency at inference time. We present a novel knowledge distillation framework that bridges this gap by transferring geometric understanding from Procrustes combined with Dynamic Time Warping (Procrustes-DTW) distance computations to an efficient neural network. Our approach uses pre-computed Procrustes-DTW distances to generate soft probability distributions that guide the training of a transformer-based student model. This ensures the preservation of crucial geometric properties—including shape similarities, temporal alignments, and invariance to spatial transformations—while enabling fast inference. We evaluate our framework on two challenging tasks: sign language recognition using the SIGNUM dataset and human action recognition using the UTD-MHAD dataset. Experimental results demonstrate that geometric knowledge transfer improves accuracy compared to training a deep neural network using standard supervised learning while achieving significantly faster inference times compared to distance-based approaches. The framework shows particular promise for real-time applications where both geometric understanding and computational efficiency are essential.
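For readers skimming the code, the central step of the framework — converting pre-computed Procrustes-DTW distances into soft probability distributions that supervise the student — can be sketched as follows; the temperature value and the per-class distance aggregation are illustrative assumptions, not the exact settings used in this repository.

```python
import numpy as np

def soft_targets_from_distances(dist_to_classes: np.ndarray, temperature: float = 4.0) -> np.ndarray:
    """Convert per-class Procrustes-DTW distances into soft probability
    distributions: smaller distance -> higher probability.

    dist_to_classes: (num_samples, num_classes) array, e.g. the minimum
    Procrustes-DTW distance from each sample to the training sequences
    of each class (an illustrative aggregation choice).
    """
    logits = -dist_to_classes / temperature          # invert: small distance = large logit
    logits -= logits.max(axis=1, keepdims=True)      # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)  # row-wise softmax
```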
This repository implements Geometric Knowledge Distillation, applying transformation-based knowledge distillation techniques to improve machine learning model performance. The project focuses on three primary models:
- KNN (k-Nearest Neighbors)
- Transformer-based models
- Distillation models
The experiments are conducted on two datasets:
- Skeleton Dataset (included in the repository)
- Signum Dataset (not included due to size limitations)
```
.github/workflows/        # CI/CD workflows (GitHub Actions)
data/                     # Skeleton dataset (Signum dataset not included due to size)
docs/                     # Documentation files
results/                  # Output results from experiments
scripts/                  # Scripts for running different algorithms
tests/                    # Unit tests for verifying implementations
src/                      # Core source files for geometric knowledge distillation
.amlignore                # Azure ML ignore file (similar to .gitignore)
.gitignore                # Files ignored by Git
.pre-commit-config.yaml   # Pre-commit hooks configuration
CITATION.cff              # Citation information
LICENSE                   # License details
README.md                 # This file
environment.yml           # Conda environment setup file
setup.py                  # Project setup file
```
The `scripts/` directory contains the implementations of the different algorithms used in the project (the Procrustes-DTW distance used by the KNN and Procrustes scripts is sketched after the listing):
```
scripts/
│── knn_signum.py                # KNN model for the Signum dataset
│── knn_skeleton_git.py          # KNN model for the Skeleton dataset
│── signum_distillation.ipynb    # Distillation model for the Signum dataset
│── signum_transformer.ipynb     # Transformer model for the Signum dataset
│── skeleton_distillation.ipynb  # Distillation model for the Skeleton dataset
│── skeleton_procrustes.py       # Procrustes analysis on the Skeleton dataset
│── skeleton_transformer.ipynb   # Transformer model for the Skeleton dataset
```
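As a rough, illustrative sketch of the Procrustes-DTW distance underlying `skeleton_procrustes.py` and the KNN scripts, the snippet below combines SciPy's Procrustes alignment (frame-wise shape disparity) with a standard DTW recursion; the function names and the frame-wise pairing are assumptions for illustration, not the exact implementation in this repository.

```python
import numpy as np
from scipy.spatial import procrustes

def procrustes_frame_cost(a: np.ndarray, b: np.ndarray) -> float:
    """Disparity between two skeleton frames (joints x coords) after
    optimal translation, scaling, and rotation."""
    _, _, disparity = procrustes(a, b)
    return disparity

def procrustes_dtw_distance(seq_a: np.ndarray, seq_b: np.ndarray) -> float:
    """DTW over Procrustes disparities between two motion sequences
    of shape (frames, joints, coords)."""
    n, m = len(seq_a), len(seq_b)
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = procrustes_frame_cost(seq_a[i - 1], seq_b[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]
```

A k-NN classifier (as in `knn_skeleton_git.py` and `knn_signum.py`) can then label a query sequence by majority vote among its nearest training sequences under this distance.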
To set up the environment, follow these steps:
- Create the Conda virtual environment from `environment.yml`: `conda env create -f environment.yml`
- Activate the environment: `conda activate distillation`
- Set the Python path dynamically (run this from the repository root): `conda env config vars set PYTHONPATH=$(pwd):$(pwd)/src`
- Verify the environment setup: `conda info --envs`
This project requires the following dependencies:
- Python >= 3.7
- PyTorch
- scikit-learn
- transformers
- matplotlib
- numpy
- pandas
- tqdm
- The Skeleton dataset is included in `data/`
- Skeleton results are available in `results/`
- The Signum dataset and results are not included due to size limitations
To run the models on the datasets:

```
python scripts/skeleton_procrustes.py   # Runs the Skeleton dataset experiments
python scripts/knn_signum.py            # Runs KNN on the Signum dataset
python scripts/knn_skeleton_git.py      # Runs KNN on the Skeleton dataset
```

For other experiments, refer to the `.ipynb` notebooks in the `scripts/` directory.
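As a hint of what the distillation notebooks do during training, the student loss can blend the hard labels with the teacher's geometric soft targets; the `alpha` weighting and the temperature below are illustrative assumptions, not the notebooks' exact hyperparameters.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_probs: torch.Tensor,
                      labels: torch.Tensor,
                      temperature: float = 4.0,
                      alpha: float = 0.5) -> torch.Tensor:
    """Blend standard cross-entropy on the hard labels with KL divergence
    against the Procrustes-DTW soft targets (teacher_probs)."""
    ce = F.cross_entropy(student_logits, labels)
    log_p_student = F.log_softmax(student_logits / temperature, dim=1)
    kd = F.kl_div(log_p_student, teacher_probs, reduction="batchmean") * temperature ** 2
    return alpha * ce + (1.0 - alpha) * kd
```

Here `teacher_probs` would be the soft distributions derived from the pre-computed Procrustes-DTW distances, so the student never needs the expensive distance computation at inference time.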
The tables below present the classification performance and computational efficiency of each approach on the SIGNUM and UTD-MHAD datasets, respectively.
**SIGNUM**

| Method | Acc. (%) | Prec. (%) | Rec. (%) | F1 (%) | Infer. Time (ms/sample) |
|---|---|---|---|---|---|
| Procrustes-DTW (k-NN) | 63.9 | 68.2 | 64.4 | 63.1 | |
| Transformer (Direct) | 86.9 | 89.5 | 87.1 | 86.6 | 0.22 |
| Ours (Distillation) | 90.2 | 91.7 | 90.2 | 89.8 | 0.35 |
**UTD-MHAD**

| Method | Acc. (%) | Prec. (%) | Rec. (%) | F1 (%) | Infer. Time (ms/sample) |
|---|---|---|---|---|---|
| Procrustes-DTW (k-NN) | 31.9 | 38.9 | 32.1 | 28.4 | |
| Transformer (Direct) | 57.5 | 60.9 | 57.6 | 55.6 | 0.21 |
| Ours (Distillation) | 64.9 | 67.2 | 64.9 | 63.9 | 0.83 |
This project is licensed under the MIT License. See the `LICENSE` file for details.