
Different Techniques To Reduce Inference Time of Deep Learning Models

Techniques Implemented

Transfer Learning

  • Purpose: Reusing models pre-trained on large datasets (e.g., ImageNet) to accelerate training and improve performance on smaller target datasets (see the sketch after this list).
  • Models: VGG, Inception, ResNet, MobileNet, DenseNet
  • Benefits:
    • Faster training
    • Better performance with limited data
    • Reduced computational overhead
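
The snippet below is a minimal transfer-learning sketch in PyTorch, assuming torchvision is available; the choice of ResNet-18 and the NUM_CLASSES placeholder are illustrative rather than taken from this repository.

```python
# Minimal transfer-learning sketch: reuse an ImageNet backbone, train a new head.
import torch
import torch.nn as nn
from torchvision import models

NUM_CLASSES = 10  # hypothetical number of classes in the target dataset

# Load a ResNet-18 pre-trained on ImageNet.
model = models.resnet18(weights=models.ResNet18_Weights.IMAGENET1K_V1)

# Freeze the pre-trained backbone so only the new head is updated.
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully connected layer for the target dataset.
model.fc = nn.Linear(model.fc.in_features, NUM_CLASSES)

# Only the new head's parameters are passed to the optimizer.
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)
```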

Quantization

  • Purpose: Reducing model size and improving inference speed by converting high-precision floating-point weights to lower-precision formats (e.g., 16-bit, 8-bit).
  • Techniques:
    • Post-training quantization (PTQ)
    • Quantization-aware training (QAT)
  • Challenges: Potential loss of accuracy due to the reduced dynamic range; a post-training quantization sketch follows.
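
As an illustration of post-training quantization, this sketch applies PyTorch's dynamic quantization to the Linear layers of a stand-in model; the layer sizes are placeholders, not this repository's actual models.

```python
# Post-training dynamic quantization sketch: weights stored in int8,
# activations quantized on the fly at inference time.
import torch
import torch.nn as nn

# Stand-in for a trained float32 model.
model = nn.Sequential(
    nn.Linear(512, 256),
    nn.ReLU(),
    nn.Linear(256, 10),
)
model.eval()

quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)

# The quantized model runs the same forward pass with smaller weights.
x = torch.randn(1, 512)
with torch.no_grad():
    out = quantized_model(x)
```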

Knowledge Distillation: Enhancing Model Efficiency

Knowledge distillation is a machine learning technique in which a larger, more complex model (the teacher) transfers its knowledge to a smaller, simpler model (the student). The process typically involves three steps:

  • Training the Teacher Model: A large model is trained on the dataset to achieve high accuracy.
  • Generating Soft Targets: The teacher produces probability distributions over classes instead of hard labels, reflecting its confidence in each class.
  • Training the Student Model: The student is trained with a combined loss function: the traditional cross-entropy loss on hard labels plus a distillation loss that encourages it to replicate the teacher's soft targets.

This approach allows the student model to retain much of the teacher's accuracy while being smaller, faster, and easier to deploy.
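
A sketch of such a combined loss, assuming the standard temperature-scaled formulation; the temperature T and weighting alpha are illustrative hyperparameters.

```python
# Distillation loss sketch: soft targets from the teacher (softened with a
# temperature T) combined with the usual cross-entropy on hard labels.
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    # Soft-target term: KL divergence between softened distributions,
    # scaled by T^2 to keep gradient magnitudes comparable.
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    # Hard-target term: standard cross-entropy on the ground-truth labels.
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```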

Pruning

  • Purpose: Removing unnecessary parameters from the model to improve efficiency and reduce computational complexity.
  • Methods: Structured pruning that removes entire feature maps (output channels) from convolutional layers, as sketched after this list.
  • Benefits:
    • Faster inference
    • Smaller model size
    • Reduced memory footprint
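
The sketch below shows one way feature-map (channel) pruning can be expressed with PyTorch's built-in pruning utilities; the layer shape and the 30% pruning amount are assumptions for illustration.

```python
# Structured pruning sketch: zero out whole output channels (feature maps)
# of a convolutional layer based on their L2 norm.
import torch.nn as nn
import torch.nn.utils.prune as prune

conv = nn.Conv2d(64, 128, kernel_size=3, padding=1)

# Prune the 30% of output channels (dim=0) with the smallest L2 norm.
prune.ln_structured(conv, name="weight", amount=0.3, n=2, dim=0)

# Make the pruning permanent by removing the re-parameterization.
prune.remove(conv, "weight")
```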

Implementation Details

  • Framework: PyTorch
  • Hardware: Intel Xeon 8480 CPU, NVIDIA A6000 GPU
  • Optimization Tools: PyTorch, ONNX
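
As a sketch of the PyTorch-to-ONNX path mentioned above, the snippet below exports a model for optimized inference; the MobileNetV2 architecture and the input shape are illustrative assumptions, not the repository's actual configuration.

```python
# ONNX export sketch: convert a PyTorch model so it can be served by an
# ONNX runtime. Input shape (1, 3, 224, 224) is a typical ImageNet-style guess.
import torch
from torchvision import models

model = models.mobilenet_v2(weights=None)
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)
torch.onnx.export(
    model,
    dummy_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    opset_version=13,
)
```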

Working Video

You can see the real-time working implementation on an FPGA here.
