🚀 Advanced AI Model Pruner

🎯 Reduce AI model size by up to 90% while maintaining performance quality

A comprehensive toolkit for pruning large language models and neural networks with multiple advanced strategies. Features both command-line and GUI interfaces for maximum flexibility.

✨ Features

🔧 Multiple Pruning Strategies

🎯 Magnitude-Based Pruning - Remove smallest weights (unstructured)
🏗️ Structured Pruning - Remove entire neurons/attention heads
⏱️ Gradual Pruning - Iterative pruning with fine-tuning recovery
🎓 Knowledge Distillation - Train smaller student models
🔄 Comprehensive - Combined pipeline for maximum compression

🖥️ Dual Interface Options

💻 Command Line Interface - Perfect for scripting and automation
🎨 Modern GUI - User-friendly interface with real-time progress tracking

⚡ Advanced Optimizations

📊 Dynamic Quantization - Reduce memory footprint further
🧠 Real Dataset Integration - Uses WikiText for fine-tuning
🔍 Model Validation - Automatic output verification
📈 Detailed Reporting - Comprehensive pruning statistics

📁 Input Model Folder Structure Guide

🖼️ View Input Model Folder Structure Guide

The AI Model Pruner is designed to work with HuggingFace-compatible models. When you select an input model folder, it should contain the standard files that HuggingFace models use.

📂 Required Folder Structure

✅ Standard HuggingFace Model Structure

your-model-folder/
├── 📄 config.json              # ← REQUIRED: Model architecture configuration
├── 🧠 pytorch_model.bin        # ← REQUIRED: Model weights (PyTorch format)
│   OR
├── 🧠 model.safetensors         # ← ALTERNATIVE: Model weights (SafeTensors format)
├── 🔤 tokenizer.json           # ← REQUIRED: Tokenizer configuration
├── 🔤 tokenizer_config.json    # ← REQUIRED: Tokenizer settings
├── 📝 vocab.txt                # ← REQUIRED: Vocabulary file
│   OR
├── 📝 vocab.json               # ← ALTERNATIVE: Vocabulary (JSON format)
├── 📝 merges.txt               # ← OPTIONAL: BPE merges (for some tokenizers)
└── 📋 special_tokens_map.json  # ← OPTIONAL: Special token mappings

🔍 File Descriptions

File	Purpose	Required?	What it Contains
`config.json`	🏗️ Architecture	✅ YES	Model dimensions, layer count, attention heads
`pytorch_model.bin`	🧠 Weights	✅ YES	All trained parameters (billions of numbers)
`model.safetensors`	🧠 Weights	✅ ALT	Same as above, but safer format
`tokenizer.json`	🔤 Text Processing	✅ YES	How to convert text to numbers
`tokenizer_config.json`	⚙️ Tokenizer Settings	✅ YES	Tokenizer behavior configuration
`vocab.txt` / `vocab.json`	📝 Vocabulary	✅ YES	All words/tokens the model knows
`merges.txt`	🔗 BPE Rules	❓ MAYBE	Word splitting rules (GPT-style models)
`special_tokens_map.json`	🏷️ Special Tokens	❓ OPTIONAL	[CLS], [SEP], [PAD] token definitions

📥 Where to Get Compatible Models

🤗 From HuggingFace Hub

Huggingface Models

🐍 Using Python Downloder (Automatic Download)

HuggingGGUF Downloder

✅ Examples of Compatible Models

📝 Text Models (BERT-style)

bert-base-uncased/
├── config.json
├── pytorch_model.bin
├── tokenizer.json
├── tokenizer_config.json
├── vocab.txt
└── special_tokens_map.json

💬 Generation Models (GPT-style)

gpt2/
├── config.json
├── pytorch_model.bin
├── tokenizer.json
├── tokenizer_config.json
├── vocab.json
├── merges.txt
└── special_tokens_map.json

🔄 Encoder-Decoder Models (T5-style)

t5-small/
├── config.json
├── pytorch_model.bin
├── tokenizer.json
├── tokenizer_config.json
├── spiece.model              # ← SentencePiece tokenizer
└── special_tokens_map.json

❌ What WON'T Work

🚫 Unsupported Formats

❌ pure-pytorch-model/
├── model.pth                 # Raw PyTorch state dict
└── custom_config.py          # Custom Python configuration

❌ tensorflow-model/
├── saved_model.pb            # TensorFlow format
└── variables/

❌ onnx-model/
├── model.onnx                # ONNX format
└── config.yaml

❌ custom-format/
├── weights.dat               # Custom binary format
└── architecture.xml          # Custom config

🔧 How to Convert Models

🔄 From PyTorch State Dict

import torch
from transformers import AutoConfig, AutoModel

# If you have a raw PyTorch model
state_dict = torch.load("model.pth")

# You need to create a compatible config
config = AutoConfig.from_pretrained("bert-base-uncased")  # Use similar model as template
model = AutoModel.from_config(config)
model.load_state_dict(state_dict)

# Save in HuggingFace format
model.save_pretrained("./converted-model")

🔄 From TensorFlow

from transformers import TFAutoModel, AutoModel

# Load TensorFlow model
tf_model = TFAutoModel.from_pretrained("tf-model-path", from_tf=True)

# Convert to PyTorch
pytorch_model = AutoModel.from_pretrained("tf-model-path", from_tf=True)
pytorch_model.save_pretrained("./converted-model")

🕵️ How to Verify Your Model Folder

🔍 Quick Check Script

import os
from pathlib import Path

def check_model_folder(folder_path):
    folder = Path(folder_path)
    
    # Required files
    required = ["config.json", "tokenizer_config.json"]
    
    # Need either pytorch_model.bin OR model.safetensors
    weights_files = ["pytorch_model.bin", "model.safetensors"]
    
    # Need either vocab.txt OR vocab.json
    vocab_files = ["vocab.txt", "vocab.json"]
    
    print(f"🔍 Checking {folder}...")
    
    # Check required files
    for file in required:
        if (folder / file).exists():
            print(f"✅ {file} - Found")
        else:
            print(f"❌ {file} - Missing")
    
    # Check weights
    weights_found = any((folder / f).exists() for f in weights_files)
    if weights_found:
        found_weight = next(f for f in weights_files if (folder / f).exists())
        print(f"✅ {found_weight} - Found")
    else:
        print(f"❌ No weight files found ({', '.join(weights_files)})")
    
    # Check vocab
    vocab_found = any((folder / f).exists() for f in vocab_files)
    if vocab_found:
        found_vocab = next(f for f in vocab_files if (folder / f).exists())
        print(f"✅ {found_vocab} - Found")
    else:
        print(f"❌ No vocabulary files found ({', '.join(vocab_files)})")

# Usage
check_model_folder("./my-model-folder")

🧪 Test Load Script

from transformers import AutoModel, AutoTokenizer, AutoConfig

def test_model_loading(model_path):
    try:
        print(f"🧪 Testing model loading from {model_path}...")
        
        # Try to load config
        config = AutoConfig.from_pretrained(model_path, trust_remote_code=True)
        print("✅ Config loaded successfully")
        
        # Try to load model
        model = AutoModel.from_pretrained(model_path, trust_remote_code=True)
        print("✅ Model loaded successfully")
        
        # Try to load tokenizer
        tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
        print("✅ Tokenizer loaded successfully")
        
        print(f"🎉 Model is compatible! Parameters: {model.num_parameters():,}")
        return True
        
    except Exception as e:
        print(f"❌ Error loading model: {e}")
        return False

# Usage
test_model_loading("./my-model-folder")

💡 Pro Tips

🎯 Best Practices

Always test load your model before pruning
Keep backups of original model files
Check file sizes - pytorch_model.bin should be the largest file
Verify permissions - ensure the pruner can read all files

🚀 Quick Setup

# Download a test model to get started
python -c "
from transformers import AutoModel, AutoTokenizer
model = AutoModel.from_pretrained('distilbert-base-uncased')
tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
model.save_pretrained('./test-model')
tokenizer.save_pretrained('./test-model')
print('✅ Test model saved to ./test-model')
"

🆘 Common Issues & Solutions

Issue	Cause	Solution
"Config not found"	Missing `config.json`	Download complete model from HuggingFace
"Model weights not found"	Missing `.bin` or `.safetensors`	Ensure model file downloaded completely
"Tokenizer error"	Missing tokenizer files	Re-download model or copy tokenizer files
"Trust remote code"	Custom model code	Add `trust_remote_code=True` parameter

🎯 Remember: The AI Model Pruner expects the exact same format that HuggingFace uses. If you can load your model with AutoModel.from_pretrained(), then it will work with the pruner!

📋 Requirements

🖥️ Hardware Requirements

Pruning Strategy	Minimum VRAM	Recommended VRAM	Use Case
🎯 Magnitude	8 GB	12+ GB	Quick experimentation
🏗️ Structured	8 GB	12+ GB	Maximum inference speed
⏱️ Gradual	12 GB	24+ GB	Best quality preservation
🎓 Distillation	16 GB	24+ GB	Clean, optimized models
🔄 Comprehensive	12 GB	24+ GB	Maximum compression

💿 System Requirements

OS: Windows 10+, macOS 10.14+, or Linux
Python: 3.8 or higher
Storage: SSD recommended for faster model loading
RAM: 16 GB minimum, 32 GB+ recommended

🚀 Quick Start

📦 Installation

# Clone the repository
git clone https://github.com/LMLK-seal/ai-model-pruner.git
cd ai-model-pruner

# Install dependencies
pip install -r requirements.txt

🎨 GUI Usage

python pruner_gui.py

📁 Select Input Model - Choose your HuggingFace model directory
📂 Choose Output Location - Where to save the pruned model
⚙️ Configure Settings - Pick strategy and compression ratio
▶️ Start Pruning - Watch real-time progress and logs

💻 Command Line Usage

# Basic pruning with magnitude strategy
python model_pruner.py \
  --input_path ./models/original-model \
  --output_path ./models/pruned-model \
  --strategy magnitude \
  --target_reduction 0.75

# Advanced: Comprehensive pruning pipeline
python model_pruner.py \
  --input_path ./models/large-model \
  --output_path ./models/compressed-model \
  --strategy comprehensive \
  --target_reduction 0.85 \
  --validate

🛠️ Pruning Strategies Explained

🎯 Magnitude-Based Pruning

# "Digital Weeding" - Remove smallest weights
pruner.magnitude_based_pruning(sparsity_ratio=0.5)

✅ Pros: Simple, fast, fine-grained control
❌ Cons: Creates sparse models, may need specialized hardware
🎯 Best for: Quick tests, when sparsity is desired

🏗️ Structured Pruning

# "Architectural Demolition" - Remove entire components
pruner.structured_pruning(reduction_ratio=0.3)

✅ Pros: Hardware-friendly, directly reduces complexity
❌ Cons: Coarse-grained, bigger impact per change
🎯 Best for: Maximizing inference speed on standard GPUs

⏱️ Gradual Pruning

# "Slow and Steady" - Iterative pruning with recovery
pruner.gradual_magnitude_pruning(
    initial_sparsity=0.1, 
    final_sparsity=0.7, 
    steps=10
)

✅ Pros: Best quality preservation, adaptive recovery
❌ Cons: Very slow, computationally expensive
🎯 Best for: When model accuracy is critical

🎓 Knowledge Distillation

# "Master and Apprentice" - Train new smaller model
pruner.knowledge_distillation_pruning(
    student_ratio=0.5, 
    epochs=3
)

✅ Pros: Clean optimized models, high potential
❌ Cons: Requires training data, complex process
🎯 Best for: Creating highly optimized models from scratch

🔄 Comprehensive Pipeline

# "Full Treatment" - Combined strategies
pruner.optimize_model_size()

✅ Pros: Maximum compression, best overall results
❌ Cons: Most complex, longest processing time
🎯 Best for: Achieving maximum size reduction

📊 Performance Examples

Original Model	Strategy	Size Reduction	Speed Increase	Quality Loss
GPT-2 Medium (355M)	Structured	60% ↓	2.3x ↑	<5% ↓
BERT Base (110M)	Gradual	75% ↓	1.8x ↑	<3% ↓
RoBERTa Large (355M)	Distillation	80% ↓	4.1x ↑	<8% ↓
Custom Model (1.3B)	Comprehensive	85% ↓	3.2x ↑	<10% ↓

📁 Project Structure

ai-model-pruner/
├── 📄 model_pruner.py        # Core pruning logic
├── 🎨 pruner_gui.py          # GUI interface
├── 📖 README.md              # This file
├── 📋 requirements.txt       # Dependencies

🐛 Bug Reports

Please use the issue tracker to report bugs.

💡 Feature Requests

We'd love to hear your ideas! Open an issue with the enhancement label.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

🤗 Hugging Face - For the transformers library
🔥 PyTorch - For the deep learning framework
📊 Datasets Library - For easy dataset integration
🎨 CustomTkinter - For the modern GUI framework

📞 Support

📖 Documentation: Wiki
💬 Discussions: GitHub Discussions
🐛 Issues: Issue Tracker

⭐ Star this repo if it helped you!

Made with ❤️ by LMLK-seal

Name		Name	Last commit message	Last commit date
Latest commit History 18 Commits
LICENSE		LICENSE
Preview.png		Preview.png
README.md		README.md
model_pruner.py		model_pruner.py
pruner_gui.py		pruner_gui.py
requirements.txt		requirements.txt

License

LMLK-seal/AI-Model-Pruner

Folders and files

Latest commit

History

Repository files navigation