Meet Aria. A local and uncensored AI entity.
- Python 3.10 or higher
- NVIDIA GPU with CUDA support (required for flash-attn)
- System dependencies:
  - Ubuntu/Debian:
    sudo apt install python3.12-dev portaudio19-dev libopus-dev
  - Arch Linux:
    sudo pacman -S python portaudio opus
  - macOS:
    brew install portaudio opus
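Before installing, it can help to confirm the two hard requirements listed above. The following is a minimal sketch, not part of Aria itself: the helper names `python_ok` and `nvidia_driver_visible` are hypothetical, and looking for `nvidia-smi` on the PATH is only a rough proxy for a working NVIDIA driver (it does not verify that the CUDA Toolkit is installed).

```python
import shutil
import sys

# Minimum interpreter version, taken from the requirements list.
MIN_PYTHON = (3, 10)


def python_ok(version=None, minimum=MIN_PYTHON):
    """Return True if the (major, minor) version meets the minimum."""
    if version is None:
        version = sys.version_info[:2]
    return tuple(version[:2]) >= tuple(minimum)


def nvidia_driver_visible():
    """Return True if `nvidia-smi` is on PATH (rough proxy, see lead-in)."""
    return shutil.which("nvidia-smi") is not None


if __name__ == "__main__":
    print(f"Python >= 3.10: {python_ok()}")
    print(f"nvidia-smi found: {nvidia_driver_visible()}")
```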
uv is a fast Python package manager that provides better dependency resolution and faster installations.
# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Clone the repo
git clone https://github.com/neuralnetwork/aria.git
cd aria
# Install dependencies
uv sync
# Install from pyproject.toml
pip install -e .
# Note: flash-attn requires CUDA Toolkit and may need manual installation
(Tested on Arch Linux + NVIDIA GPUs with Python 3.12)
When using uv, you can easily sync your environment across multiple computers:
# On your first computer (after installation)
git add uv.lock
git commit -m "Add uv.lock for dependency sync"
git push
# On your other computers
git pull
uv sync
The uv.lock file ensures all your computers use exactly the same package versions.
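One way to confirm that two machines really share the same lock file is to compare a checksum of `uv.lock` on each. This is an illustrative sketch only; the function name `lock_fingerprint` is hypothetical and not something uv or Aria provides.

```python
import hashlib
from pathlib import Path


def lock_fingerprint(path="uv.lock"):
    """Return the SHA-256 hex digest of the lock file's raw bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


if __name__ == "__main__":
    lock = Path("uv.lock")
    if lock.exists():
        # Run on each machine and compare the printed digests.
        print(lock_fingerprint(lock))
    else:
        print("uv.lock not found; run this from the repository root")
```

The comparison is byte-for-byte, so a checkout that converts line endings (for example on Windows) will produce a different digest even when the pinned versions are identical.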
The first run takes a while because it downloads all the required models.
You may edit the default config for your device or use case (change the model, specify devices, etc.).
If you have the resources, using a bigger model and/or a higher-bit quantization is strongly recommended.
Aria now uses Qwen2.5-14B-Instruct-1M-abliterated (Q6_K quantization) by default:
- Model size: ~12.2 GB
- Recommended VRAM: 16GB+ (works well on RTX 4080 Super)
- Downloaded automatically to the Hugging Face cache on first run
- Abliterated version (uncensored) for unrestricted responses
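When choosing a different model or quantization, a rough fit check against your GPU memory can save a failed first run. The sketch below is back-of-the-envelope arithmetic, not how Aria allocates memory: the helper `fits_in_vram` is hypothetical, and the `overhead_gb` default is a placeholder assumption for context cache and other loaded models, not a measured value.

```python
def fits_in_vram(model_gb, vram_gb, overhead_gb=2.0):
    """Return True if the model plus an assumed overhead fits in VRAM.

    overhead_gb is a placeholder guess; tune it for your own setup.
    """
    return model_gb + overhead_gb <= vram_gb


if __name__ == "__main__":
    # Figures from the list above: ~12.2 GB model, 16 GB recommended.
    print(fits_in_vram(12.2, 16.0))
    # A 12 GB card would need a smaller model or quantization.
    print(fits_in_vram(12.2, 12.0))
```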
If you're developing with WSL but running Aria on Windows:
- Edit code in WSL using your preferred tools
- Run Aria directly on Windows for best GPU performance
- Use Windows terminal:
uv run python main.py
- Flash-attn builds properly on native Windows with CUDA
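If you switch between the two environments often, a small check can tell you which one a given shell is in before you launch. This is a sketch under an assumption: it relies on the common convention that WSL kernels report "microsoft" in their release string. The helper name `looks_like_wsl` is hypothetical and not part of Aria.

```python
import platform


def looks_like_wsl(release=None):
    """Heuristic: WSL kernel release strings usually contain 'microsoft'."""
    if release is None:
        release = platform.uname().release
    return "microsoft" in release.lower()


if __name__ == "__main__":
    if looks_like_wsl():
        print("Running under WSL; consider launching Aria from a Windows terminal.")
    else:
        print("Not running under WSL.")
```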
python main.py
python server.py
python client.py
- Android client
- Raspberry Pi client
- Ollama support (currently uses GGUF format directly)
Work in progress...
🌟 We'd love your contribution! Please submit your changes via pull request to join in the fun! 🚀
Aria is a powerful AI entity designed for local use. Users are advised to exercise caution and responsibility when interacting with Aria, as its capabilities may have unintended consequences if used improperly or without careful consideration.
By engaging with Aria, you understand and agree that the suggestions and responses provided are for informational purposes only, and should be used with caution and discretion.
We cannot be held responsible for any actions, decisions, or outcomes resulting from the use of Aria. We explicitly disclaim liability for any direct, indirect, incidental, consequential, or punitive damages arising from reliance on Aria's responses.
We encourage users to exercise discernment, judgment, and thorough consideration when utilizing information from Aria. Your use of this service constitutes acceptance of these disclaimers and limitations.
Should you have any doubts regarding the accuracy or suitability of Aria's responses, we advise consulting with qualified professionals or experts in the relevant field.
- silero-vad
- transformers
- whisper
- llama-cpp-python
- TTS
- TTS fork
- kokoro
- opuslib
- TheBloke
- Bartowski
- mradermacher
- mlabonne
While this project is licensed under GNU AGPLv3, some of the components it depends on are licensed under different terms. They are listed below:
- License: Open-source only for non-commercial projects.
- Commercial Use: Requires a paid plan.
- Details: Coqui Public Model License 1.0.0
- License: BSD-3-Clause license
- Details: opuslib license
- License: llama3.1
- Details: llama license