A vibe-coded implementation of Soft Actor-Critic (SAC) using PyTorch and MLflow. This project demonstrates how a transformer architecture can be integrated with SAC to capture long-term dependencies in reinforcement learning tasks.
- Torch-Powered: Every component uses PyTorch for data collection, preprocessing, and model training
- Transformer-Enhanced RL: Integrates a transformer architecture with SAC to improve temporal reasoning
- MLflow Integration: Complete experiment tracking with parameter logging and model versioning
- Modular Design: Clean separation of environment, model, and training components for easy extension
- Rapid Development: This implementation was developed in approximately 12 hours as a proof-of-concept, demonstrating rapid prototyping capability while maintaining a clean architecture. It showcases the ability to quickly deliver working machine learning systems.
- Production Readiness: Although built as a rapid prototype, the codebase is modular, with clear separation between environment, models, and training components. Logging with configurable verbosity levels and comprehensive exception handling are not yet implemented; if the project continues, future iterations will focus on them.
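To make the transformer-plus-SAC idea concrete, here is a minimal sketch of what a transformer-based actor might look like in PyTorch. It is not taken from this repository: the class name `TransformerActor`, the window length, the layer sizes, and the choice of a categorical (discrete-action) policy are all assumptions made for illustration.

```python
import torch
import torch.nn as nn


class TransformerActor(nn.Module):
    """Hypothetical actor: encodes a window of past observations with a
    transformer and outputs a categorical policy over discrete actions."""

    def __init__(self, obs_dim, n_actions, seq_len=8, d_model=32, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Linear(obs_dim, d_model)
        # Learned positional embedding so the encoder can tell time steps apart.
        self.pos = nn.Parameter(torch.zeros(1, seq_len, d_model))
        layer = nn.TransformerEncoderLayer(
            d_model=d_model, nhead=n_heads, dim_feedforward=4 * d_model, batch_first=True
        )
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = nn.Linear(d_model, n_actions)

    def forward(self, obs_seq):
        # obs_seq has shape (batch, seq_len, obs_dim).
        x = self.embed(obs_seq) + self.pos[:, : obs_seq.size(1)]
        x = self.encoder(x)
        # Use the representation of the most recent time step to pick an action.
        logits = self.head(x[:, -1])
        return torch.distributions.Categorical(logits=logits)


# Assumed sizes: 4-dimensional observations, 3 discrete actions, window of 8 steps.
actor = TransformerActor(obs_dim=4, n_actions=3)
dist = actor(torch.randn(5, 8, 4))
action = dist.sample()    # one action per batch element, shape (5,)
entropy = dist.entropy()  # SAC's objective rewards high policy entropy
```

Feeding the actor a short history rather than a single observation is what lets self-attention pick up dependencies across time steps; the entropy term is the quantity SAC adds to the reward objective.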
- Clone the repository:
    gh repo clone MichaelsEngineering/sac-agent-demo
    cd sac-agent-demo
- Create a virtual environment and install the required dependencies:
    python -m venv sac-env
    source sac-env/bin/activate  # Linux/macOS
    # Windows: sac-env\Scripts\activate
    pip install -e .
    # Or, for development, include additional dev dependencies
    pip install -e ".[dev]"
- Track experiments:
    mlflow ui  # then open the address it prints in your browser
- Run the project:
    python src/main.py
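For the experiment tracking step, the following is a rough sketch of how a training run could log its hyperparameters and per-episode returns so that they show up in the MLflow UI. The helper names `flatten_config` and `log_run`, the experiment name, and the hyperparameter values are hypothetical; only the `mlflow` calls themselves (`set_experiment`, `start_run`, `log_params`, `log_metric`) are part of the MLflow tracking API.

```python
def flatten_config(config, prefix=""):
    """Flatten a nested config dict into dotted keys, e.g. {'sac.gamma': 0.99}."""
    flat = {}
    for key, value in config.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten_config(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def log_run(config, episode_returns, experiment="sac-agent-demo"):
    """Log hyperparameters and per-episode returns to MLflow."""
    import mlflow  # imported lazily so flatten_config works without MLflow installed

    mlflow.set_experiment(experiment)
    with mlflow.start_run():
        # MLflow parameters are flat key/value pairs, hence the flattening step.
        mlflow.log_params(flatten_config(config))
        for episode, episode_return in enumerate(episode_returns):
            mlflow.log_metric("episode_return", episode_return, step=episode)


# Hypothetical hyperparameters, for illustration only.
config = {"sac": {"gamma": 0.99, "tau": 0.005}, "batch_size": 64}
flat = flatten_config(config)
```

Calling `log_run(config, [10.0, 12.5, 15.0])` would, under these assumptions, create one run with three parameters and a three-point `episode_return` curve.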
├── .idea/
├── src/
│   ├── data/
│   │   └── simple.json
│   ├── deployment/
│   ├── environments/
│   │   └── environment_setup.py
│   ├── models/
│   │   ├── actor.py
│   │   └── critic.py
│   ├── training/
│   │   ├── replay_buffer.py
│   │   ├── train_sac.py
│   │   └── load_json.py
│   └── main.py
├── tests/
│   └── test_model.py
├── .GITIGNORE
├── LICENSE.md
├── pip_requirements.txt
├── pyproject.toml
└── README.md
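The layout lists a `replay_buffer.py` module. As an illustration of the role such a component plays in off-policy training, here is a minimal replay buffer sketch using only the standard library. It is not the repository's implementation; the `Transition` fields and the `ReplayBuffer` interface are assumptions.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the real module may store different fields.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state", "done"])


class ReplayBuffer:
    """Fixed-capacity FIFO buffer with uniform random sampling."""

    def __init__(self, capacity):
        # deque(maxlen=...) silently drops the oldest transition when full.
        self.buffer = deque(maxlen=capacity)

    def push(self, state, action, reward, next_state, done):
        self.buffer.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Sample without replacement, then regroup into per-field tuples.
        batch = random.sample(list(self.buffer), batch_size)
        return Transition(*zip(*batch))

    def __len__(self):
        return len(self.buffer)


buffer = ReplayBuffer(capacity=3)
for t in range(5):
    buffer.push(state=t, action=0, reward=1.0, next_state=t + 1, done=False)
batch = buffer.sample(batch_size=2)
```

With a capacity of 3 and five pushes, only the three most recent transitions remain, so every sampled state comes from steps 2 to 4.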
- Upgrade data loading to follow CI/CD engineering best practices
- Add support for continuous action spaces
- Enhance MLflow dashboards
- Containerize with Docker for reproducible deployment
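For the continuous-action item, one common approach in SAC is a tanh-squashed Gaussian policy. The sketch below shows the sampling step and the log-probability correction for a single scalar action, using only the standard library. It is an assumption about how this could be approached, not a description of planned code; the function name and the `1e-6` stabilizer are illustrative.

```python
import math
import random


def sample_squashed_gaussian(mean, log_std, rng=random):
    """Sample a in (-1, 1) via a = tanh(u), u ~ N(mean, std^2), and return
    (a, log pi(a)) using the change-of-variables correction."""
    std = math.exp(log_std)
    u = rng.gauss(mean, std)
    action = math.tanh(u)
    # Log-density of u under the Gaussian.
    log_prob_u = -0.5 * ((u - mean) / std) ** 2 - log_std - 0.5 * math.log(2 * math.pi)
    # d tanh(u)/du = 1 - tanh(u)^2, so log pi(a) = log N(u) - log(1 - a^2).
    # The small constant guards against log(0) when |a| is very close to 1.
    log_prob = log_prob_u - math.log(1.0 - action ** 2 + 1e-6)
    return action, log_prob


rng = random.Random(0)  # seeded generator so the example is repeatable
action, log_prob = sample_squashed_gaussian(mean=0.0, log_std=0.0, rng=rng)
```

The squashing keeps actions inside a bounded range, and the correction term keeps the entropy estimate consistent with the squashed distribution.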
Contributions are welcome! Please feel free to submit a Pull Request.
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
This project is licensed under the MIT License. See LICENSE.md for details.