Process local folder #35

Merged
merged 4 commits into from
Dec 4, 2024
95 changes: 31 additions & 64 deletions .gitignore
@@ -1,77 +1,44 @@
# Python
# Byte-compiled files
*.pyc
*.pyo
*.pyd
__pycache__/
*.env
.venv/
env/
venv/
ENV/
*.egg-info/
*.egg
pip-log.txt

# Flask
instance/
*.sqlite3
*.db
*.bak
instance/config.py
instance/*.secret

# Environment Variables
.env
.env.*

# Logs
logs/
*.log
*.out
*.err
# Virtual environments
venv/
env/
.virtualenv/

# Pytest and Coverage
.pytest_cache/
.coverage
.tox/
nosetests.xml
coverage.xml
*.cover
.cache
# Distribution/build files
build/
dist/
*.egg-info/
.eggs/

# IDEs and Editors
# IDE settings
.vscode/
.idea/.env
.vercel

# ignore the .pyc files
*.pyc

dist/
# Node (if using npm/yarn for assets)
node_modules/
*.lock
.idea/
*.swp

# Static assets (if generated)
static/
dist/
build/
# Logs and debugging
*.log
*.trace

# Docker
*.pid
docker-compose.override.yml
# OS-specific files
.DS_Store
Thumbs.db

# Heroku
*.buildpacks
*.env.local
*.env.production
*.env.*.local
# Testing and coverage
htmlcov/
*.cover
.coverage
.cache/
pytest_cache/

# Miscellaneous
*.bak
*.tmp
*.log.*
Thumbs.db
# Jupyter Notebook checkpoints
.ipynb_checkpoints/

# Vercel
.vercel
# Custom settings
.env
*.sqlite3
.vercel
37 changes: 28 additions & 9 deletions README.md
@@ -5,25 +5,44 @@
## Getting Started
[Live Demo](https://code-graph.falkordb.com/)

## Running locally

### Run FalkorDB
Free cloud instance: https://app.falkordb.cloud/signup

Or by running locally with docker:
```bash
flask --app code_graph run --debug
docker run -p 6379:6379 -p 3000:3000 -it --rm falkordb/falkordb:latest
```

Process local git repository, ignoring specific folder(s)
### Config
Create your own `.env` file from the `.env.template` file

Start the server:
```bash
curl -X POST http://127.0.0.1:5000/process_local_repo -H "Content-Type: application/json" -d '{"repo": "/Users/roilipman/Dev/FalkorDB", "ignore": ["./.github", "./sbin", "./.git","./deps", "./bin", "./build"]}'
flask --app api/index.py run --debug
```

Process code coverage

### Creating a graph
Process a local source folder:
```bash
curl -X POST http://127.0.0.1:5000/process_code_coverage -H "Content-Type: application/json" -d '{"lcov": "/Users/roilipman/Dev/code_graph/code_graph/code_coverage/lcov/falkordb.lcov", "repo": "FalkorDB"}'
curl -X POST http://127.0.0.1:5000/analyze_folder -H "Content-Type: application/json" -d '{"path": "<FULL_PATH_TO_FOLDER>", "ignore": [<OPTIONAL_IGNORE_LIST>]}' -H "Authorization: <.ENV_SECRET_TOKEN>"
```

Process git information

For example:
```bash
curl -X POST http://127.0.0.1:5000/process_git_history -H "Content-Type: application/json" -d '{"repo": "/Users/roilipman/Dev/falkorDB"}'
curl -X POST http://127.0.0.1:5000/analyze_folder -H "Content-Type: application/json" -d '{"path": "/Users/roilipman/Dev/GraphRAG-SDK", "ignore": ["./.github", "./build"]}' -H "Authorization: OpenSesame"
```
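The same call can be scripted instead of typed as a `curl` command. The following is a minimal sketch using only the Python standard library; the helper name `build_analyze_request` is hypothetical, while the URL, token, and paths are copied from the example above. It assumes the server expects the raw token in the `Authorization` header, as the `curl` example suggests.

```python
import json
import urllib.request
from typing import List, Optional


def build_analyze_request(base_url: str, token: str, path: str,
                          ignore: Optional[List[str]] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for the /analyze_folder endpoint."""
    payload = {"path": path, "ignore": ignore if ignore is not None else []}
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/analyze_folder",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": token},
        method="POST",
    )


request = build_analyze_request(
    "http://127.0.0.1:5000", "OpenSesame",
    "/Users/roilipman/Dev/GraphRAG-SDK", ["./.github", "./build"],
)

# To actually send it (requires the Flask server to be running):
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```

Building the request separately from sending it keeps the payload easy to inspect before any network traffic happens.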
## Working with your graph
Once the source code analysis completes your FalkorDB DB will be populated with
a graph representation of your source code, the graph name should be the same as
the name of the folder you've requested to analyze, for the example above a graph named:
"GraphRAG-SDK".

At the moment only the Python and C languages are supported, we do intend to support additional languages.

At this point you can explore and query your source code using various tools
Here are several options:
1. FalkorDB built-in UI
2. One of FalkorDB's [clients](https://docs.falkordb.com/clients.html)
3. Use FalkorDB [GraphRAG-SDK](https://github.com/FalkorDB/GraphRAG-SDK) to connect an LLM for natural language exploration.
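
As a rough illustration of the second option, the sketch below counts nodes per label with the FalkorDB Python client. The helper names `count_nodes_by_label` and `open_graph` are hypothetical, the host and port are assumed from the docker command above, and the query is deliberately schema-agnostic because the node labels produced by the analyzer are not shown in this PR.

```python
from typing import Any, Dict


def count_nodes_by_label(graph: Any) -> Dict[str, int]:
    """Return {label: node count} for any object exposing `query(cypher)`
    whose result has a `result_set` list of rows."""
    result = graph.query("MATCH (n) RETURN labels(n)[0] AS label, count(n) AS total")
    return {row[0]: row[1] for row in result.result_set}


def open_graph(name: str, host: str = "localhost", port: int = 6379) -> Any:
    # Imported lazily so the helper above can be used without the client installed.
    from falkordb import FalkorDB
    return FalkorDB(host=host, port=port).select_graph(name)


# Example (requires a running FalkorDB and the `falkordb` package):
# print(count_nodes_by_label(open_graph("GraphRAG-SDK")))
```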
6 changes: 4 additions & 2 deletions api/analyzers/source_analyzer.py
@@ -135,7 +135,7 @@ def analyze_sources(self, ignore: List[str]) -> None:
# Second pass analysis of the source code
self.second_pass(ignore, executor)

def analyze(self, path: str, g: Graph, ignore: Optional[List[str]] = []) -> None:
def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[List[str]] = []) -> None:

⚠️ Potential issue

Fix mutable default argument

Using a mutable default argument in Python can lead to unexpected behavior: the default list `[]` is created once, when the function is defined, and that same object is shared by every call that omits the argument.

Replace with:

-    def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[List[str]] = []) -> None:
+    def analyze_local_folder(self, path: str, g: Graph, ignore: Optional[List[str]] = None) -> None:

And add at the beginning of the method:

        if ignore is None:
            ignore = []
🧰 Tools
🪛 Ruff (0.8.0)

138-138: Do not use mutable data structures for argument defaults

Replace with None; initialize within function

(B006)
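
To illustrate the pitfall flagged above, here is a minimal, self-contained sketch. The functions `collect_shared` and `collect_safe` are hypothetical and not part of this PR; they only show how a default list persists between calls and how the `None` sentinel avoids it. Whether `analyze_local_folder` ever mutates `ignore` is not visible in this diff, so the issue may be latent rather than active.

```python
from typing import List, Optional


def collect_shared(item: str, seen: List[str] = []) -> List[str]:
    # Anti-pattern: the default list is created once, at definition time,
    # so every call that omits `seen` appends to the same object.
    seen.append(item)
    return seen


def collect_safe(item: str, seen: Optional[List[str]] = None) -> List[str]:
    # Recommended pattern: use None as the sentinel and build a fresh list per call.
    if seen is None:
        seen = []
    seen.append(item)
    return seen


first_shared = collect_shared("a")
second_shared = collect_shared("b")   # same list object as first_shared

first_safe = collect_safe("a")
second_safe = collect_safe("b")       # independent list
```

After these calls, `first_shared` and `second_shared` refer to one list holding both items, while the two "safe" results stay separate.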

"""
Analyze path.

@@ -144,6 +144,8 @@ def analyze(self, path: str, g: Graph, ignore: Optional[List[str]] = []) -> None
ignore (List(str)): List of paths to skip
"""

logging.info(f"Analyzing local folder {path}")

# Save original working directory for later restore
original_dir = Path.cwd()

@@ -179,4 +181,4 @@ def analyze_local_repository(self, path: str, ignore: Optional[List[str]] = [])
self.graph.set_graph_commit(head.hexsha)

return self.graph

55 changes: 55 additions & 0 deletions api/index.py
@@ -1,6 +1,7 @@
import os
import datetime
from api import *
from pathlib import Path
Comment on lines 3 to +4

🛠️ Refactor suggestion

Replace star import with explicit imports

Star imports (from api import *) make it unclear which symbols are being imported and can lead to naming conflicts.

Replace with explicit imports:

-from api import *
+from api.analyzers.source_analyzer import SourceAnalyzer
+from api.graph import Graph
+from api.utils import graph_exists, get_repos, get_repo_info, ask

Committable suggestion skipped: line range outside the PR's diff.

🧰 Tools
🪛 Ruff (0.8.0)

3-3: from api import * used; unable to detect undefined names

(F403)
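
The naming-conflict risk can be shown with two standard-library modules that export overlapping names. This is only an illustrative sketch: it uses `math` and `cmath` rather than the `api` package, and it runs the star imports through `exec` into an isolated dictionary so the effect can be inspected without polluting the surrounding module.

```python
import cmath
import math

# Emulate two module-level star imports inside an isolated namespace.
namespace: dict = {}
exec("from math import *", namespace)
exec("from cmath import *", namespace)  # silently rebinds sqrt, exp, pi, ...

# The later import wins: `sqrt` now refers to cmath.sqrt, not math.sqrt.
sqrt_is_complex = namespace["sqrt"] is cmath.sqrt
root = namespace["sqrt"](-1)  # math.sqrt(-1) would raise ValueError
```

No warning is emitted when the second import replaces `sqrt`, which is why explicit imports make the origin of each name easier to audit.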

from typing import Optional
from functools import wraps
from falkordb import FalkorDB
@@ -309,3 +310,57 @@ def chat():
response = { 'status': 'success', 'response': answer }

return jsonify(response), 200

@app.route('/analyze_folder', methods=['POST'])
@token_required  # Apply token authentication decorator
def analyze_folder():
    """
    Endpoint to analyze local source code
    Expects 'path' and optionally an ignore list.

    Returns:
        JSON response with status and error message if applicable
        Status codes:
            200: Success
            400: Invalid input
            500: Internal server error
    """

    # Get JSON data from the request
    data = request.get_json()

    # Get query parameters
    path = data.get('path')
    ignore = data.get('ignore', [])

    # Validate input parameters
    if not path:
        logging.error("'path' is missing from the request.")
        return jsonify({"status": "'path' is required."}), 400

    # Validate path exists and is a directory
    if not os.path.isdir(path):
        logging.error(f"Path '{path}' does not exist or is not a directory")
        return jsonify({"status": "Invalid path: must be an existing directory"}), 400

    # Validate ignore list contains valid paths
    if not isinstance(ignore, list):
        logging.error("'ignore' must be a list of paths")
        return jsonify({"status": "'ignore' must be a list of paths"}), 400

    proj_name = Path(path).name

Comment on lines +351 to +352

🛠️ Refactor suggestion

Add validation for project name

The project name is derived from the path without validation, so a path such as `/` or `.` yields an empty name, and names containing special characters are passed to `Graph` unchecked.

Add validation:

-    proj_name = Path(path).name
+    proj_name = Path(path).name
+    if not proj_name or proj_name.startswith('.'):
+        logging.error(f"Invalid project name derived from path: {proj_name}")
+        return jsonify({"status": "Invalid path: cannot derive valid project name"}), 400
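
To see which inputs the suggested check would reject, the sketch below evaluates `.name` for a few sample paths. The paths and the helper `is_valid_project_name` are hypothetical, and `PurePosixPath` is used instead of `Path` only so the results do not depend on the operating system.

```python
from pathlib import PurePosixPath

# PurePosixPath keeps the results identical on every operating system.
samples = [
    "/Users/me/Dev/GraphRAG-SDK",
    "/Users/me/Dev/GraphRAG-SDK/",
    "/",
    ".",
    "/home/me/.config",
]
names = {p: PurePosixPath(p).name for p in samples}


def is_valid_project_name(name: str) -> bool:
    # Mirrors the check suggested above: reject empty and dot-prefixed names.
    return bool(name) and not name.startswith(".")


verdicts = {p: is_valid_project_name(n) for p, n in names.items()}
```

Note that `.` passes an `os.path.isdir` check yet produces an empty name, which is the case the validation is meant to catch. Rejecting every dot-prefixed folder is a stricter choice that would also refuse legitimately hidden project directories.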

    # Initialize the graph with the provided project name
    g = Graph(proj_name)

    # Analyze source code within given folder
    analyzer = SourceAnalyzer()
    analyzer.analyze_local_folder(path, g, ignore)

    # Return response
    response = {
        'status': 'success',
        'project': proj_name
    }
    return jsonify(response), 200
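
The validation steps in this endpoint could also be factored into a pure function that is testable without Flask. The following is a sketch under stated assumptions, not the PR's implementation: `validate_analyze_payload` is a hypothetical name, the error strings are copied from the endpoint, and the extra `isinstance(data, dict)` guard is an addition that covers a JSON body which is valid but not an object (for example an array), where `data.get` would otherwise raise.

```python
import os
import tempfile
from typing import Any, Optional, Tuple


def validate_analyze_payload(data: Any) -> Tuple[Optional[str], int]:
    """Return (error message, HTTP status); the message is None when the payload is valid."""
    if not isinstance(data, dict):
        return "Request body must be a JSON object", 400

    path = data.get("path")
    ignore = data.get("ignore", [])

    if not path:
        return "'path' is required.", 400
    if not isinstance(path, str) or not os.path.isdir(path):
        return "Invalid path: must be an existing directory", 400
    if not isinstance(ignore, list):
        return "'ignore' must be a list of paths", 400
    return None, 200


existing_dir = tempfile.gettempdir()
ok = validate_analyze_payload({"path": existing_dir, "ignore": ["./build"]})
missing = validate_analyze_payload({"ignore": []})
not_an_object = validate_analyze_payload(None)
bad_ignore = validate_analyze_payload({"path": existing_dir, "ignore": "./build"})
```

Keeping the checks in one function means the route handler only has to translate the returned message and status into a `jsonify` response.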
