LiteFormer


LiteFormer is a lightweight, research-focused fork of Hugging Face Transformers.

⚠️ Not intended for production use. This repository serves as a rapid prototyping and experimentation playground for transformer research.


✨ Features

  • Lightweight: Removes TensorFlow, Flax, ONNX, SageMaker, and multilingual doc support.
  • Focused: Keeps only your experimental models:
    aformer, bformer, cformer, dformer, eformer, fformer, mformer, nformer, oformer, sformer, tformer, vformer
  • Modular & Clean: Cleaner structure and smaller size for fast exploration.
  • Custom Code Injection: Injects your own model and tokenizer files automatically.
  • Placeholder Docs & Tests: Basic test folders and markdown docs are generated for each model.

⚙️ Setup

Clone this repo and run the builder script:

python lite_transformers_builder.py

This will:

  • Clone the official transformers repo
  • Remove all unused backends, examples, tests, and files
  • Retain only the models you define in the script
  • Add placeholder folders and inject your own files for experimentation
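The steps above can be sketched roughly as follows. This is an illustrative outline only: the function names, the `NEW_MODELS` subset, and the exact paths removed are assumptions, not the script's actual identifiers.

```python
import shutil
import subprocess
from pathlib import Path

# Models to keep; a subset is shown for illustration.
NEW_MODELS = ["aformer", "bformer", "cformer"]

def models_to_remove(existing, keep):
    # The `auto` package (AutoModel/AutoConfig mappings) must always survive.
    keep = set(keep) | {"auto"}
    return [name for name in existing if name not in keep]

def build_lite(repo_dir: Path) -> None:
    # 1. Clone the official transformers repo.
    subprocess.run(
        ["git", "clone", "--depth", "1",
         "https://github.com/huggingface/transformers.git", str(repo_dir)],
        check=True,
    )
    # 2. Drop unused backends and tests wholesale (paths are illustrative).
    for unused in ("tests", "src/transformers/onnx", "src/transformers/sagemaker"):
        shutil.rmtree(repo_dir / unused, ignore_errors=True)
    # 3. Retain only the models listed in NEW_MODELS (plus `auto`).
    models_dir = repo_dir / "src" / "transformers" / "models"
    existing = [p.name for p in models_dir.iterdir() if p.is_dir()]
    for name in models_to_remove(existing, NEW_MODELS):
        shutil.rmtree(models_dir / name)
    # 4. Recreate placeholder test folders for each experimental model.
    for name in NEW_MODELS:
        (repo_dir / "tests" / "models" / name).mkdir(parents=True, exist_ok=True)
```

The filtering step is the important part: everything under `models/` that is not explicitly kept is deleted, while the `auto` mapping package is preserved so the remaining models stay discoverable.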

🧠 Included Models

These models are retained and initialized for experimentation:

- aformer : Any Transformer : Any In Any Out (think chameleon)
- bformer : Basic Transformer : Text In Text Out (think gemma2)
- cformer : Clip Transformer : Image In Text Out (think clip)
- dformer : Drawing Transformer : Text In Image Out (think aMUSEd)
- eformer : MOE Transformer : Any In Text Out (think llama4)
- fformer : First Transformer : Text In Text Out (think gpt2 or nanogpt)
- mformer : Mat Transformer : Text In Text Out (think basic version of gemma3n)
- nformer : NewMat Transformer : Any In Text Out (think gemma3n)
- oformer : Omni Transformer : Any In Text Out (think qwen2_5_omni)
- sformer : Speech Transformer : Audio In Text Out (think whisper)
- tformer : Talking Transformer : Text In Audio Out (think a text-to-speech model such as Bark)
- vformer : Vision Transformer : Text, Vision In Text Out (think gemma3)

Each model gets:

  • A src/transformers/models/{model} folder
  • A test folder in tests/models/{model}
  • A documentation stub in docs/model_doc/{model}.md
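The per-model scaffolding listed above can be sketched like this; the function name and doc-stub contents are illustrative assumptions, not the builder's real API.

```python
from pathlib import Path

# Illustrative sketch: create the three artifacts each model receives.
def scaffold_model(root, name):
    model_dir = root / "src" / "transformers" / "models" / name
    test_dir = root / "tests" / "models" / name
    for p in (model_dir, test_dir):
        p.mkdir(parents=True, exist_ok=True)
        (p / "__init__.py").touch()  # make the folder an importable package
    # Markdown documentation stub.
    doc = root / "docs" / "model_doc" / f"{name}.md"
    doc.parent.mkdir(parents=True, exist_ok=True)
    doc.write_text(f"# {name}\n\nPlaceholder documentation for {name}.\n")
    return [model_dir, test_dir, doc]
```

Using `parents=True, exist_ok=True` makes the scaffolding idempotent, so re-running the builder after adding a model does not fail on folders that already exist.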

🗂 Output Structure

liteformer/
├── src/
│   └── transformers/
│       ├── models/
│       │   ├── aformer/
│       │   ├── ...
│       │   └── auto/
│       ├── generation/
│       └── ...
├── tests/
│   └── models/
│       ├── aformer/
│       ├── ...
├── docs/
│   ├── model_doc/
│   ├── index.rst
│   └── ...
├── utils/
├── docker/
└── examples/

🧪 Adding Your Own Model

To add a new model (e.g. newformer):

  1. Create your files:

    • __init__.py
    • modular_newformer.py
    • config_newformer.py
    • tokenization_newformer.py
    • tokenization_newformer_fast.py (optional)
    • processing_newformer.py (required for multimodal models)
    • image_processing_newformer.py (required for vision)
    • image_processing_newformer_fast.py (optional for vision)
    • feature_extraction_newformer.py (required for audio)
  2. Add "newformer" to the NEW_MODELS list in lite_transformers_builder.py

  3. Run the script again:

python lite_transformers_builder.py
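For step 1, a minimal config_newformer.py might take the following shape. This is a dependency-free sketch: in the real file the class would subclass transformers.PretrainedConfig, and the attribute names here are common-convention assumptions; mirror an existing model's config in this repo for the actual layout.

```python
# Dependency-free sketch of config_newformer.py. In the real file the class
# would subclass transformers.PretrainedConfig; attribute names below are
# illustrative assumptions following common transformers conventions.
class NewformerConfig:
    model_type = "newformer"

    def __init__(self, vocab_size=32000, hidden_size=512,
                 num_hidden_layers=8, num_attention_heads=8):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        self.num_attention_heads = num_attention_heads
```

The `model_type` string is what the auto classes key on, so it should match the name you add to NEW_MODELS.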

🎯 Goals

  • Minimize boilerplate and complexity during research
  • Focus on experimental transformer variants
  • Speed up iteration time when prototyping architectures
  • Provide a simple foundation for building new ideas

🧰 Requirements

  • Python 3.8+
  • Git
  • PyTorch
  • (Optional) pytest for running tests

📋 TODO

  • Make the repo importable by removing or fixing invalid imports and paths
  • Replace the current version in src/transformers/__init__.py with version = "0.1.0-lite"
  • Reduce auto classes to only what is minimally necessary
  • Replace placeholder models with real SOTA architectures (more info coming)
  • Write clean, user-focused documentation on usage and architecture
  • Upload as a package to PyPI for easier installation

🤝 Contributing

This is a personal research tool, but you're welcome to fork it and adapt it for your own projects. Pull requests are not expected, but feedback is welcome.


🧭 Why This Project Exists

The main Hugging Face transformers repository is incredibly powerful but often too large and slow to iterate within. LiteFormer aims to strip things down to the essentials — making it easier to test hypotheses, design new model types, and experiment with architecture changes in a clean environment.


📜 License

This project is a derivative work of Transformers, licensed under the Apache 2.0 License.


👤 Author

Created by Dustin Loring

GitHub Repo: github.com/dustinwloring1988/liteformer
