The CLI supports multiple commands:
-
User Input - The user runs the app and provides:
- a source folder path containing the files to organize
- a destination folder path where organized files will be placed
- an AI model name (loaded in Ollama) used to generate folder names
- an embedding model name (also loaded in Ollama) used to generate vector embeddings
-
Destination Folder Scan
- The app scans the destination folder and generates embeddings for each folder name.
- These embeddings are stored in a Qdrant vector database.
-
Source Folder Scan
- The app scans the source folder and generates embeddings for each file name.
- It compares each file's embedding to existing folder embeddings in the database.
- Files without a sufficiently close match are marked for further processing.
-
Clustering & AI Folder Naming
- Unmatched file embeddings are grouped using agglomerative hierarchical clustering.
- Each cluster is sent to the LLM to generate a suggested folder name.
-
Preview Results
- A table is displayed showing the proposed destination for each file.
-
User Decision
- The user reviews the suggested structure and decides whether to apply the changes.
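The matching and clustering steps above can be sketched in a few lines of Python. This is a minimal illustration, not the app's Rust implementation: the function names, the `threshold` and `max_distance` values, the use of average linkage, and the toy 3-dimensional vectors are all assumptions made for the example. In the real flow the vectors come from the embedding model and the folder vectors are looked up in Qdrant.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_files(file_vecs, folder_vecs, threshold=0.6):
    """Return (matched, unmatched): file -> best folder, and files below the threshold."""
    matched, unmatched = {}, []
    for name, vec in file_vecs.items():
        scores = {folder: cosine_similarity(vec, fvec) for folder, fvec in folder_vecs.items()}
        best = max(scores, key=scores.get) if scores else None
        if best is not None and scores[best] >= threshold:
            matched[name] = best
        else:
            unmatched.append(name)
    return matched, unmatched


def cluster(names, vecs, max_distance=0.5):
    """Agglomerative clustering: repeatedly merge the two closest clusters.

    Distance between clusters is the average pairwise cosine distance
    (average linkage, an assumption). Merging stops once the closest
    pair is farther apart than max_distance.
    """
    clusters = [[n] for n in names]

    def linkage(c1, c2):
        dists = [1.0 - cosine_similarity(vecs[a], vecs[b]) for a in c1 for b in c2]
        return sum(dists) / len(dists)

    while len(clusters) > 1:
        pairs = [(linkage(clusters[i], clusters[j]), i, j)
                 for i in range(len(clusters)) for j in range(i + 1, len(clusters))]
        dist, i, j = min(pairs)
        if dist > max_distance:
            break
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Toy 3-dimensional "embeddings"; real embedding vectors are much longer.
folder_vecs = {"invoices": [1.0, 0.0, 0.0]}
file_vecs = {
    "invoice_march.pdf": [0.9, 0.1, 0.0],
    "cat.jpg": [0.0, 1.0, 0.1],
    "dog.png": [0.0, 0.9, 0.2],
    "song.mp3": [0.0, 0.0, 1.0],
}

matched, unmatched = match_files(file_vecs, folder_vecs)
groups = cluster(unmatched, file_vecs)
print(matched)
print(groups)
```

Here the invoice lands in the existing `invoices` folder, while the two image files end up in one cluster and the audio file in another. Each such cluster is what would then be handed to the LLM for a folder name.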
If you decide not to apply the changes after running process, you can apply them later with the apply command. It expects that file locations have not changed in the meantime. The command applies the migrations from the latest successful process run.
If you change your mind after the files have been migrated, the rollback command returns everything to its original location.
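To make the apply and rollback behavior concrete, here is a minimal Python sketch of the idea. It is not the app's code: the plan format (a list of source/destination pairs) and the function names `apply_plan` and `rollback` are hypothetical, chosen only for illustration. The up-front existence check mirrors the expectation that files have not moved since the plan was produced.

```python
import shutil
import tempfile
from pathlib import Path


def apply_plan(plan):
    """Move every (source, destination) pair; return the moves that were made."""
    missing = [str(src) for src, _ in plan if not Path(src).exists()]
    if missing:
        # Refuse to start if any file is no longer where the plan expects it.
        raise FileNotFoundError(f"files moved since the plan was made: {missing}")
    done = []
    for src, dst in plan:
        Path(dst).parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        done.append((str(src), str(dst)))
    return done


def rollback(done):
    """Undo applied moves in reverse order, restoring the original locations."""
    for src, dst in reversed(done):
        Path(src).parent.mkdir(parents=True, exist_ok=True)
        shutil.move(dst, src)


# Demo in a throwaway directory so no real files are touched.
workdir = Path(tempfile.mkdtemp())
source = workdir / "messy" / "report.txt"
source.parent.mkdir()
source.write_text("quarterly numbers")
target = workdir / "organized" / "reports" / "report.txt"

done = apply_plan([(source, target)])
moved_ok = target.exists() and not source.exists()
rollback(done)
restored_ok = source.exists() and not target.exists()
print(moved_ok, restored_ok)
```

Recording the completed moves is what makes the undo possible, which is presumably why both commands are tied to a saved session.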
⚠️ Warning: Do not use messy-folder-reorganizer-ai on important files such as passwords, confidential documents, or critical system files.
In the event of a bug or interruption, the app may irreversibly modify or delete files. Always create backups before using it on valuable data.
The author assumes no responsibility for data loss or misplaced files caused by this application.
Adding RAG & ML to the CLI
How cosine similarity helped files find their place
Teaching embeddings to understand folders
Hierarchical clustering for file grouping
- Install core developer tools
-
macOS
Install or update **Xcode**
-
Linux x86_64
sudo apt update
sudo apt install -y build-essential
-
Install Ollama and start the service.
-
Download the required LLM via Ollama:
ollama pull deepseek-r1:latest
Recommended: Use models with a higher number of parameters for better accuracy.
This project has been tested with deepseek-r1:latest (4.7 GB, 7.6B params).
-
Download the embedding model:
ollama pull mxbai-embed-large:latest
-
Launch Qdrant vector database (easiest via Docker):
docker pull qdrant/qdrant
docker run -p 6333:6333 -p 6334:6334 \
  -v $(pwd)/path/to/data:/qdrant/storage \
  qdrant/qdrant
-
Download the latest app release:
-
Apple Silicon (macOS ARM64):
curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
  grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-aarch64-apple-darwin.tar.gz" | \
  cut -d '"' -f 4 | \
  xargs curl -L -o messy-folder-reorganizer-ai-macos-arm64.tar.gz
-
Intel Mac (macOS x86_64):
curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
  grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-x86_64-apple-darwin.tar.gz" | \
  cut -d '"' -f 4 | \
  xargs curl -L -o messy-folder-reorganizer-ai-macos-x64.tar.gz
-
Linux x86_64:
curl -s https://api.github.com/repos/PerminovEugene/messy-folder-reorganizer-ai/releases/tags/v0.2.0 | \
  grep "browser_download_url.*messy-folder-reorganizer-ai-v0.2.0-x86_64-unknown-linux-gnu.tar.gz" | \
  cut -d '"' -f 4 | \
  xargs curl -L -o messy-folder-reorganizer-ai-linux-x64.tar.gz
- Extract and install:
-
Apple Silicon (macOS ARM64):
tar -xvzf messy-folder-reorganizer-ai-macos-arm64.tar.gz
sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
-
Intel Mac (macOS x86_64):
tar -xvzf messy-folder-reorganizer-ai-macos-x64.tar.gz
sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
-
Linux x86_64:
tar -xvzf messy-folder-reorganizer-ai-linux-x64.tar.gz
sudo mv messy-folder-reorganizer-ai /usr/local/bin/messy-folder-reorganizer-ai
-
Verify the installation:
messy-folder-reorganizer-ai --help
-
Clone the repository:
git clone git@github.com:PerminovEugene/messy-folder-reorganizer-ai.git
-
Build the project:
cargo build --release
-
Run it:
cargo run -- process \
  -E mxbai-embed-large \
  -L deepseek-r1:latest \
  -S ./test_cases/clustering/messy-folder \
  -D ./test_cases/clustering/structured-folder
messy-folder-reorganizer-ai process \
-E <EMBEDDING_MODEL_NAME> \
-L <LLM_MODEL_NAME> \
-S <SOURCE_FOLDER_PATH> \
-D <DESTINATION_FOLDER_PATH>
messy-folder-reorganizer-ai apply \
-i <SESSION_ID>
messy-folder-reorganizer-ai rollback \
-i <SESSION_ID>
The CLI supports the following subcommands:
Processes source files, finds best-matching destination folders using embeddings, and generates a migration plan.
Argument | Short | Default | Description |
---|---|---|---|
--language-model | -L | required | Ollama LLM model name used to generate semantic folder names. |
--embedding-model | -E | required | Embedding model used for representing folder and file names as vectors. |
--source | -S | required | Path to the folder with unorganized files. |
--destination | -D | home | Path to the folder where organized files should go. |
--recursive | -R | false | Whether to scan subfolders of the source folder recursively. |
--force-apply | -F | false | Automatically apply changes after processing without showing preview. |
--continue-on-fs-errors | -C | false | Allow skipping files/folders that throw filesystem errors (e.g., permission denied). |
--llm-address | -n | http://localhost:11434 | Address of the local or remote Ollama LLM server. |
--qdrant-address | -q | http://localhost:6334 | Address of the Qdrant vector database instance. |
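The effect of the recursive and continue-on-fs-errors options can be illustrated with a short Python sketch. It is only an analogy for the documented behavior, not the app's implementation; the function `scan_files` and its parameter names are hypothetical.

```python
import os
import tempfile
from pathlib import Path


def scan_files(root, recursive=False, continue_on_fs_errors=False):
    """List files under root, optionally descending into subfolders."""
    found = []

    def on_error(err):
        # os.walk reports directory-listing errors here; re-raise unless
        # the caller asked to skip unreadable folders.
        if not continue_on_fs_errors:
            raise err

    for dirpath, dirnames, filenames in os.walk(root, onerror=on_error):
        found.extend(Path(dirpath) / name for name in filenames)
        if not recursive:
            dirnames.clear()  # emptying the list stops os.walk from descending
    return sorted(found)


# Demo in a throwaway directory.
root = Path(tempfile.mkdtemp())
(root / "a.txt").write_text("top level")
(root / "sub").mkdir()
(root / "sub" / "b.txt").write_text("nested")

print([p.name for p in scan_files(root)])
print([p.name for p in scan_files(root, recursive=True)])
```

With the defaults only `a.txt` is listed; with `recursive=True` the nested `b.txt` is picked up as well.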
Applies a previously saved migration plan using the session ID.
The session ID is printed during process execution.
Argument | Short | Description |
---|---|---|
--session-id | -i | The session ID generated by the process command. |
Rolls back a previously applied migration using the session ID.
The session ID is printed during process execution.
Argument | Short | Description |
---|---|---|
--session-id | -i | The session ID used to identify which migration to undo. |
On the first run, the app creates a .messy-folder-reorganizer-ai/
directory in your home folder containing:
- llm_config.toml: LLM model request configuration options
- embeddings_config.toml: Embedding model request configuration options
- rag_ml_config.toml: RAG and ML behavior settings
Model request configurations are commented out by default and will fall back to built-in values unless edited.
More information about LLM and embedding model configuration options can be found at https://github.com/ollama/ollama/blob/main/docs/modelfile.md#valid-parameters-and-values.
RAG and ML configuration parameters are required and must always be present in rag_ml_config.toml. You can also set up ignore lists for destination and source paths in that config file.
You can change the path where .messy-folder-reorganizer-ai will be created by setting the MESSY_FOLDER_REORGANIZER_AI_PATH environment variable to the desired location.
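As a rough illustration of how such an override typically resolves, the Python sketch below picks the parent directory from the environment variable and falls back to the home folder. The function name `resolve_app_dir` is hypothetical, and treating the variable's value as the parent directory (rather than the full path of the config folder itself) is an assumption based on the wording above.

```python
import os
from pathlib import Path

APP_DIR_NAME = ".messy-folder-reorganizer-ai"
ENV_VAR = "MESSY_FOLDER_REORGANIZER_AI_PATH"


def resolve_app_dir(env=None):
    """Return the config directory: under the override path if set, else under home."""
    env = os.environ if env is None else env
    override = env.get(ENV_VAR)
    # Assumption: the variable names the parent directory of the config folder.
    parent = Path(override).expanduser() if override else Path.home()
    return parent / APP_DIR_NAME


print(resolve_app_dir({}))
print(resolve_app_dir({ENV_VAR: "/tmp/reorganizer"}))
```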
Prompts are stored in:
~/.messy-folder-reorganizer-ai/prompts/
You can edit these to experiment with different phrasing.
The source file list will be appended automatically, so do not use {}
or other placeholders in the prompt.
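The reason placeholders are unnecessary is that the file list is concatenated after the prompt text rather than substituted into it. A minimal Python sketch of that idea follows; the function `build_prompt`, the bullet formatting of the list, and the sample prompt text are all illustrative assumptions, not the app's actual format.

```python
def build_prompt(template, file_names):
    """Append the file list after the prompt text; nothing is substituted into it."""
    listing = "\n".join(f"- {name}" for name in file_names)
    return f"{template.rstrip()}\n\n{listing}\n"


# Hypothetical prompt text and file names, for illustration only.
template = "Suggest one short folder name for the following files."
prompt = build_prompt(template, ["cat.jpg", "dog.png"])
print(prompt)
```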
Feel free to contribute improved prompts via PR!
If you break or delete any config/prompt files, simply re-run the app; missing files will be regenerated with default values.
-
Run the setup script before contributing:
bash setup-hooks.sh
-
Lint & format code:
cargo clippy
cargo fmt
-
Check for unused dependencies:
cargo +nightly udeps
To run all tests:
cargo test
To run integration tests:
cargo test --test '*' -- --nocapture
To run a specific integration test (file_collision, for example):
cargo test file_collision -- --nocapture
rm -f /usr/local/bin/messy-folder-reorganizer-ai
rm -rf ~/.messy-folder-reorganizer-ai
This project is dual-licensed under either:
at your option.
It interacts with external services including: