Commit 964216f

author
d4rkc0de
committed
Init commit
0 parents  commit 964216f

37 files changed, with 15130 additions and 0 deletions

.env.md

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
1+
The .env file stores environment variables for the backend server. It contains two sections: Text Processing Configuration and Image Processing Configuration.
2+
3+
## Text Processing Configuration
4+
5+
These variables are used for document (text) processing:
6+
7+
- TEXT_API_END_POINT: Specifies the API endpoint for text processing.
8+
- TEXT_MODEL_NAME: Defines the model to be used for text processing.
9+
- TEXT_API_KEYS: A list containing the API key(s) required for authentication when making requests to the text API
10+
endpoint. **`Using multiple keys will help in avoiding rate limits.`**
11+
12+
## Image Processing Configuration
13+
14+
These variables are used for image processing:
15+
16+
- IMAGE_API_END_POINT: Specifies the API endpoint for image processing.
17+
- IMAGE_MODEL_NAME: Defines the model used for image processing.
18+
- IMAGE_API_KEYS: A list containing the API key(s) for image processing requests. Using multiple keys will help in avoiding rate limits.
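
As an illustration of how a list-valued key variable could be consumed, the sketch below parses a `TEXT_API_KEYS`-style value as JSON and hands the keys out in round-robin order. The names `parse_api_keys` and `KeyRotator` are hypothetical, and how the backend actually loads and rotates these values is an assumption, not something this file specifies.

```python
import itertools
import json


def parse_api_keys(raw_value):
    """Parse a JSON list such as '["sk-xxx","sk-yyy"]' into a list of strings."""
    keys = json.loads(raw_value)
    if not isinstance(keys, list) or not keys:
        raise ValueError("expected a non-empty JSON list of API keys")
    return [str(key) for key in keys]


class KeyRotator:
    """Hand out API keys in round-robin order (hypothetical helper)."""

    def __init__(self, keys):
        self._cycle = itertools.cycle(keys)

    def next_key(self):
        return next(self._cycle)


# Placeholder value in the same format as the examples in this file.
rotator = KeyRotator(parse_api_keys('["sk-xxx","sk-yyy"]'))
first_three = [rotator.next_key() for _ in range(3)]
print(first_three)
```

With two keys, consecutive requests alternate between them, which is one simple way to spread requests across per-key rate limits.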
19+
20+
21+
## Examples:
22+
23+
- **OPENAI**
24+
```bash
25+
# API and MODEL used for documents processing
26+
TEXT_API_END_POINT=https://api.openai.com/v1
27+
TEXT_MODEL_NAME=gpt-4o
28+
TEXT_API_KEYS=["sk-xxx","sk-yyy"]
29+
30+
# API and MODEL used for images processing
31+
IMAGE_API_END_POINT=https://api.openai.com/v1
32+
IMAGE_MODEL_NAME=gpt-4o
33+
IMAGE_API_KEYS=["sk-xxx","sk-yyy"]
34+
```
35+
36+
- **GROQ**
37+
```bash
38+
# API and MODEL used for documents processing
39+
TEXT_API_END_POINT=https://api.groq.com/openai/v1
40+
TEXT_MODEL_NAME=llama3-70b-8192
41+
TEXT_API_KEYS=["gsk_xxx","gsk_yyy"]
42+
43+
# API and MODEL used for images processing (no vision models for GROQ yet)
44+
IMAGE_API_END_POINT=http://localhost:11434/v1
45+
IMAGE_MODEL_NAME=moondream:latest
46+
IMAGE_API_KEYS=["ollama"]
47+
```
48+
49+
- **OLLAMA**
50+
```bash
51+
# API and MODEL used for documents processing
52+
TEXT_API_END_POINT=http://localhost:11434/v1
53+
TEXT_MODEL_NAME=gemma2:latest
54+
TEXT_API_KEYS=["ollama"]
55+
56+
# API and MODEL used for images processing
57+
IMAGE_API_END_POINT=http://localhost:11434/v1
58+
IMAGE_MODEL_NAME=moondream:latest
59+
IMAGE_API_KEYS=["ollama"]
60+
```
61+
62+
63+
- **HUGGING FACE**
64+
```bash
65+
# API and MODEL used for documents processing
66+
TEXT_API_END_POINT=https://api-inference.huggingface.co/v1
67+
TEXT_MODEL_NAME=microsoft/Phi-3-mini-4k-instruct
68+
TEXT_API_KEYS=["hf_xxx","hf_yyy"]
69+
70+
# API and MODEL used for images processing
71+
IMAGE_API_END_POINT=https://api-inference.huggingface.co/v1
72+
IMAGE_MODEL_NAME=nlpconnect/vit-gpt2-image-captioning
73+
IMAGE_API_KEYS=["hf_xxx","hf_yyy"]
74+
```

.gitignore

Lines changed: 22 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,22 @@
1+
# IDE and OS specific files
2+
**/.idea/
3+
**/.vscode/
4+
.DS_Store
5+
Thumbs.db
6+
7+
# Frontend (Angular) specific files
8+
frontend/node_modules/
9+
frontend/dist/
10+
frontend/.angular/
11+
frontend/*.js.map
12+
13+
# Backend (FastAPI) specific files
14+
backend/__pycache__/
15+
backend/*.pyc
16+
backend/*.pyo
17+
backend/*.pyd
18+
backend/venv/
19+
venv/
20+
backend/env/
21+
**/__pycache__/
22+
backend/app/*.db

README.md

Lines changed: 144 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,144 @@
1+
# FileWizardAi
2+
3+
## Description
4+
5+
FileWizardAi is a Python/Angular project designed to automatically organize your files into a well-structured directory
6+
hierarchy and rename them according to their content. This tool is ideal for anyone looking to declutter their digital
7+
workspace by sorting files into appropriate folders and providing descriptive names, making it easier to manage and
8+
locate files. Additionally, it allows you to input a text prompt and instantly searches for files that are related to
9+
your query, providing you with the most relevant files based on the content you provide.
10+
11+
The app also features a caching system to minimize API calls, ensuring that only new or modified files are processed.
12+
13+
### Example:
14+
15+
**Before**
16+
17+
```bash
18+
/home/user
19+
├── Downloads
20+
│   ├── 6.1 Course Curriculum v2.pdf
21+
│   ├── trip_paris.txt
22+
│   ├── 8d71473c-533f-4ba3-9bce-55d3d9a6662a.jpg
23+
│   └── Screenshot_from_2024-06-10_21-39-24.png
24+
```
25+
26+
**After**
27+
28+
```bash
29+
/home/user/Downloads
30+
├─ docs
31+
│  └─ certifications
32+
│     └─ databricks
33+
│        └─ data_engineer_associate
34+
│           └─ curriculum_v2.pdf
35+
├─ Personal Photos
36+
│  └─ 2024
37+
│     └─ 03
38+
│        └─ 01
39+
│           └─ person_in_black_shirt.jpg
40+
├─ finance-docs
41+
│  └─ trip-expenses
42+
│     └─ paris
43+
│        └─ trip-justification.txt
44+
└─ project Assets
45+
   └─ instructions_screenshot.png
46+
```
47+
48+
### Video tutorial:
49+
50+
[![Watch the video](./yt_video_logo.png)](https://www.youtube.com/watch?v=T1rXLox80rM)
51+
52+
53+
## Table of Contents
54+
55+
- [Installation](#installation)
56+
- [Usage](#usage)
57+
- [Run in Development Mode](#run-in-development-mode)
58+
- [Credits](#credits)
59+
- [License](#license)
60+
- [Technical architecture](#technical-architecture)
61+
62+
## Installation
63+
64+
Make sure you have Python installed on your machine.
65+
66+
First, clone the repository:
67+
68+
```bash
69+
git clone https://github.com/AIxHunter/FileWizardAi.git
70+
```
71+
72+
Navigate to the backend folder and update your `.env` file according to the [documentation](.env.md). Then, install the
73+
required
74+
packages by running (preferably in a virtual environment such as venv or conda):
75+
76+
```bash
77+
cd backend
78+
pip install -r requirements.txt
79+
```
80+
81+
## Usage
82+
83+
Run the backend server:
84+
85+
```bash
86+
cd backend
87+
uvicorn app.server:app --host localhost --port 8000
88+
```
89+
90+
The app will be available at http://localhost:8000/
91+
92+
## Run in Development Mode
93+
94+
If you are a developer and want to modify the frontend, you can run the frontend and backend separately. Here is
95+
how to do it:
96+
Install Node.js https://nodejs.org/
97+
98+
Install Angular CLI:
99+
100+
```bash
101+
npm install -g @angular/cli
102+
```
103+
104+
Run frontend:
105+
106+
```bash
107+
cd frontend
108+
npm install
109+
ng serve
110+
```
111+
112+
The frontend will be available at `http://localhost:4200`.
113+
114+
To package the frontend, run:
115+
116+
```bash
117+
ng build --base-href static/
118+
```
119+
120+
Run backend:
121+
122+
Update your `.env` file with the desired API settings (check the [documentation](.env.md)), then:
123+
124+
```bash
125+
cd backend
126+
uvicorn app.server:app --host localhost --port 8000 --reload
127+
```
128+
129+
## Technical architecture
130+
131+
<img src="filewizardai_architecture.png" alt="Online Image" width="600"/>
132+
133+
1. Send request from Angular frontend (e.g., organize files).
134+
2. Backend receives request through a REST API of FastAPI.
135+
3. Check SQLite to see whether the file has already been processed (cached files).
136+
4. Return cached summary if file was already processed.
137+
5. If the file has not been processed before, send new file to LLM for summarization.
138+
6. Cache summary in SQLite.
139+
7. Return summary to Angular frontend.
140+
8. Display summary to user and perform actions if needed.
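
The caching steps above can be sketched as follows. This is a minimal illustration, assuming a SHA-256 content hash and an in-memory SQLite table with the same columns as `files_summary`. `summarize_with_llm` and `get_summary` are hypothetical names, the LLM call is replaced by a stub, and the hashing scheme the project actually uses is an assumption.

```python
import hashlib
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS files_summary "
    "(file_path TEXT PRIMARY KEY, file_hash TEXT NOT NULL, summary TEXT)"
)

llm_calls = []  # records which paths reached the (fake) LLM


def summarize_with_llm(file_path, content):
    """Hypothetical placeholder for the LLM summarization request."""
    llm_calls.append(file_path)
    return f"summary of {len(content)} bytes"


def get_summary(file_path, content):
    """Return the cached summary when path and hash match, else summarize and cache."""
    file_hash = hashlib.sha256(content).hexdigest()
    row = conn.execute(
        "SELECT summary FROM files_summary WHERE file_path = ? AND file_hash = ?",
        (file_path, file_hash),
    ).fetchone()
    if row:
        return row[0]  # steps 3-4: cache hit
    summary = summarize_with_llm(file_path, content)  # step 5: cache miss
    conn.execute(
        "INSERT OR REPLACE INTO files_summary (file_path, file_hash, summary) "
        "VALUES (?, ?, ?)",
        (file_path, file_hash, summary),
    )  # step 6: store the new summary
    conn.commit()
    return summary


get_summary("notes.txt", b"hello")        # first call reaches the LLM stub
get_summary("notes.txt", b"hello")        # unchanged content: served from cache
get_summary("notes.txt", b"hello world")  # modified content: summarized again
print(len(llm_calls))
```

Keying the lookup on both path and hash is what lets a modified file bypass the stale cached summary while an untouched file never triggers a second API call.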
141+
142+
## License
143+
144+
This project is licensed under the MIT License.

backend/.env

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,9 @@
1+
# API and MODEL used for documents processing
2+
TEXT_API_END_POINT=https://api.groq.com/openai/v1
3+
TEXT_MODEL_NAME=llama3-70b-8192
4+
TEXT_API_KEYS=["gsk_xxx"]
5+
6+
# API and MODEL used for images processing
7+
IMAGE_API_END_POINT=http://localhost:11434/v1
8+
IMAGE_MODEL_NAME=moondream:latest
9+
IMAGE_API_KEYS=["ollama"] # Required but not used

backend/app/__init__.py

Whitespace-only changes.

backend/app/database.py

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,63 @@
import sqlite3


class SQLiteDB:
    """Thin wrapper around the SQLite cache of file summaries."""

    def __init__(self):
        # The database file is created relative to the current working directory.
        self.conn = sqlite3.connect('FileWizardAi.db')
        self.cursor = self.conn.cursor()
        create_table_query = "CREATE TABLE IF NOT EXISTS files_summary (file_path TEXT PRIMARY KEY,file_hash TEXT NOT NULL,summary TEXT)"
        self.cursor.execute(create_table_query)
        self.conn.commit()

    def select(self, table_name, where_clause=None):
        # table_name and where_clause are interpolated into the SQL text,
        # so they must come from trusted code, never from user input.
        sql = f"SELECT * FROM {table_name}"
        if where_clause:
            sql += f" WHERE {where_clause}"
        self.cursor.execute(sql)
        return self.cursor.fetchall()

    def is_file_exist(self, file_path, file_hash):
        # True only when both the path and the content hash match a cached row.
        self.cursor.execute("SELECT * FROM files_summary WHERE file_path = ? AND file_hash = ?", (file_path, file_hash))
        file = self.cursor.fetchone()
        return bool(file)

    def insert_file_summary(self, file_path, file_hash, summary):
        # Update the row if the path is already cached, otherwise insert it.
        c = self.conn.cursor()
        c.execute("SELECT * FROM files_summary WHERE file_path=?", (file_path,))
        row_exists = c.fetchone()

        if row_exists:
            c.execute("UPDATE files_summary SET file_hash=?, summary=? WHERE file_path=?",
                      (file_hash, summary, file_path))
        else:
            c.execute("INSERT INTO files_summary (file_path, file_hash, summary) VALUES (?, ?, ?)",
                      (file_path, file_hash, summary))
        self.conn.commit()

    def get_file_summary(self, file_path):
        # Returns None when the path has no cached summary.
        self.cursor.execute("SELECT summary FROM files_summary WHERE file_path = ?", (file_path,))
        result = self.cursor.fetchone()
        return result[0] if result else None

    def drop_table(self):
        self.cursor.execute("DROP TABLE IF EXISTS files_summary")
        self.conn.commit()

    def get_all_files(self):
        self.cursor.execute("SELECT file_path FROM files_summary")
        results = self.cursor.fetchall()
        files_path = [row[0] for row in results]
        return files_path

    def update_file(self, old_file_path, new_file_path, new_hash):
        # Re-key a cached row after a file has been moved or renamed.
        self.cursor.execute("UPDATE files_summary SET file_path = ?, file_hash = ? WHERE file_path = ?",
                            (new_file_path, new_hash, old_file_path))
        self.conn.commit()

    def delete_records(self, file_paths):
        # One "?" placeholder per path keeps the query parameterized.
        placeholders = ",".join("?" * len(file_paths))
        self.cursor.execute(f"DELETE FROM files_summary WHERE file_path IN ({placeholders})", file_paths)
        self.conn.commit()

    def close(self):
        self.conn.close()
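
`insert_file_summary` above checks for an existing row and then issues either an UPDATE or an INSERT. As an alternative sketch (not what this commit does), SQLite's `INSERT OR REPLACE` can express the same update-or-insert in a single statement, because `file_path` is the primary key. The function name `upsert_summary` is hypothetical, and the example uses an in-memory database rather than `FileWizardAi.db`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE IF NOT EXISTS files_summary "
    "(file_path TEXT PRIMARY KEY, file_hash TEXT NOT NULL, summary TEXT)"
)


def upsert_summary(file_path, file_hash, summary):
    """Insert the row, or replace the existing row with the same file_path."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO files_summary (file_path, file_hash, summary) "
            "VALUES (?, ?, ?)",
            (file_path, file_hash, summary),
        )


upsert_summary("a.txt", "hash1", "first summary")
upsert_summary("a.txt", "hash2", "second summary")
rows = conn.execute(
    "SELECT file_path, file_hash, summary FROM files_summary"
).fetchall()
print(rows)
```

`INSERT OR REPLACE` deletes the conflicting row and inserts a new one, so it suits this table only because every column is supplied on each write.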
