
Commit e8b4442

Mustafa-Esoofally, dirkbrnd, and ysolanky authored
GitHub repo analyzer (#2582)
## Description

- **Summary of changes**: Describe the key changes in this PR and their purpose.
- **Related issues**: Mention if this PR fixes or is connected to any issues.
- **Motivation and context**: Explain the reason for the changes and the problem they solve.
- **Environment or dependencies**: Specify any changes in dependencies or environment configurations required for this update.
- **Impact on metrics**: (If applicable) Describe changes in any metrics or performance benchmarks.

Fixes # (issue)

---

## Type of change

Please check the options that are relevant:

- [ ] Bug fix (non-breaking change which fixes an issue)
- [x] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Model update (Addition or modification of models)
- [ ] Other (please describe):

---

## Checklist

- [x] Adherence to standards: Code complies with Agno’s style guidelines and best practices.
- [x] Formatting and validation: You have run `./scripts/format.sh` and `./scripts/validate.sh` to ensure code is formatted and linted.
- [x] Self-review completed: A thorough review has been performed by the contributor(s).
- [x] Documentation: Docstrings and comments have been added or updated for any complex logic.
- [x] Examples and guides: Relevant cookbook examples have been included or updated (if applicable).
- [x] Tested in a clean environment: Changes have been tested in a clean environment to confirm expected behavior.
- [ ] Tests (optional): Tests have been added or updated to cover any new or changed functionality.

---

## Additional Notes

Include any deployment notes, performance implications, security considerations, or other relevant information (e.g., screenshots or logs if applicable).

---------

Co-authored-by: Dirk Brand <dirkbrnd@gmail.com>
Co-authored-by: Yash Pratap Solanky <101447028+ysolanky@users.noreply.github.com>
Co-authored-by: ysolanky <yash@phidata.com>
1 parent cd3d0dc commit e8b4442

12 files changed: +785 -394 lines changed

Lines changed: 64 additions & 0 deletions
@@ -0,0 +1,64 @@
# GitHub Repository Analyzer

This application provides a chat-based interface for exploring and analyzing GitHub repositories using the Agno framework and OpenAI models. Users can select a repository and ask questions about its code, issues, pull requests, statistics, and more.

## Features

- **Chat Interface:** Interact with an AI agent knowledgeable about a selected GitHub repository.
- **Repository Selection:** Choose from a predefined list of popular open-source repositories, or add your own (requires code modification or environment setup).
- **Comprehensive Analysis:** Ask about:
  - Repository statistics (stars, forks, languages).
  - Open/closed issues and pull requests.
  - Detailed pull request information, including code changes (diff/patch analysis).
  - File contents and directory structures.
  - Code searching within the repository.
- **Powered by Agno & OpenAI:** Leverages the `agno` framework for agent creation and tool usage.

### 1. Create a virtual environment

```shell
python3 -m venv .venv
source .venv/bin/activate
```

### 2. Install dependencies

```shell
pip install -r cookbook/examples/apps/github_repo_analyzer/requirements.txt
```

### 3. Export API keys

Export your OpenAI API key and GitHub access token:

```shell
export OPENAI_API_KEY=***
export GITHUB_ACCESS_TOKEN=***
```

### 4. Run the app

```shell
streamlit run cookbook/examples/apps/github_repo_analyzer/app.py
```

Navigate to the URL provided by Streamlit (usually `http://localhost:8501`) in your web browser. Select a repository from the sidebar and start chatting!

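Before launching, it can help to confirm that both variables from step 3 are set in the current shell. The following is a minimal sketch and not part of the app; `missing_env_vars` is a hypothetical helper name, and the only assumption is that the app reads the two variables named above.

```python
import os
from typing import Iterable, List, Mapping, Optional

# Variable names taken from step 3 of this README.
REQUIRED_VARS = ("OPENAI_API_KEY", "GITHUB_ACCESS_TOKEN")


def missing_env_vars(
    names: Iterable[str] = REQUIRED_VARS,
    env: Optional[Mapping[str, str]] = None,
) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("Environment looks ready.")
```

Passing `env` explicitly keeps the check easy to test without touching the real environment.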
## Project Structure

The project uses a small, flat structure:

```
github_repo_analyzer/
├── app.py              # Streamlit application (entry point)
├── agents.py           # Agent initialization
├── utils.py            # UI helpers imported by app.py
├── requirements.txt    # Dependencies
├── README.md           # Documentation
└── output/             # Generated analysis reports
```

## Technologies Used

- [Agno](https://docs.agno.com) - AI agent framework for GitHub analysis
- [Streamlit](https://streamlit.io/) - Interactive web interface
- [PyGithub](https://pygithub.readthedocs.io/) - GitHub API access
Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
from textwrap import dedent

from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.tools.github import GithubTools


def get_github_agent(debug_mode: bool = True) -> Agent:
    """
    Build the GitHub repository analysis agent.

    The target repository is not fixed here; the agent determines it from
    the user's query and the conversation history.

    Args:
        debug_mode: Whether to enable debug mode for tool calls.
    """

    return Agent(
        model=OpenAIChat(id="gpt-4.1"),
        description=dedent("""
            You are an expert Code Reviewing Agent specializing in analyzing GitHub repositories,
            with a strong focus on detailed code reviews for Pull Requests.
            Use your tools to answer questions accurately and provide insightful analysis.
        """),
        # Plain string (no f-prefix needed): the prompt contains no placeholders.
        instructions=dedent("""\
        **Core Task:** Analyze GitHub repositories and answer user questions based on the available tools and conversation history.

        **Repository Context Management:**
        1. **Context Persistence:** Once a target repository (owner/repo) is identified (either initially or from a user query like 'analyze owner/repo'), **MAINTAIN THAT CONTEXT** for all subsequent questions in the current conversation unless the user clearly specifies a *different* repository.
        2. **Determining Context:** If no repository is specified in the *current* user query, **CAREFULLY REVIEW THE CONVERSATION HISTORY** to find the most recently established target repository. Use that repository context.
        3. **Accuracy:** When extracting a repository name (owner/repo) from the query or history, **BE EXTREMELY CAREFUL WITH SPELLING AND FORMATTING**. Double-check against the user's exact input.
        4. **Ambiguity:** If no repository context has been established in the conversation history and the current query doesn't specify one, **YOU MUST ASK THE USER** to clarify which repository (using owner/repo format) they are interested in before using tools that require a repository name.

        **How to Answer Questions:**
        * **Identify Key Information:** Understand the user's goal and the target repository (using the context rules above).
        * **Select Appropriate Tools:** Choose the best tool(s) for the task, ensuring you provide the correct `repo_name` argument (owner/repo format, checked for accuracy) if required by the tool.
            * Project Overview: `get_repository`, `get_file_content` (for README.md).
            * Libraries/Dependencies: `get_file_content` (for requirements.txt, pyproject.toml, etc.), `get_directory_content`, `search_code`.
            * PRs/Issues: Use relevant PR/issue tools.
            * List User Repos: `list_repositories` (no repo_name needed).
            * Search Repos: `search_repositories` (no repo_name needed).
        * **Execute Tools:** Run the selected tools.
        * **Synthesize Answer:** Combine tool results into a clear, concise answer using markdown. If a tool fails (e.g., 404 error because the repo name was incorrect), state that you couldn't find the specified repository and suggest checking the name.
        * **Cite Sources:** Mention specific files (e.g., "According to README.md...").

        **Specific Analysis Areas (Most require a specific repository):**
        * Issues: Listing, summarizing, searching.
        * Pull Requests (PRs): Listing, summarizing, searching, getting details/changes.
        * Code & Files: Searching code, getting file content, listing directory contents.
        * Repository Stats & Activity: Stars, contributors, recent activity.

        **Code Review Guidelines (Requires repository and PR):**
        * Fetch Changes: Use `get_pull_request_changes` or `get_pull_request_with_details`.
        * Analyze Patch: Evaluate based on functionality, best practices, style, clarity, efficiency.
        * Present Review: Structure clearly, cite lines/code, be constructive.
        """),
        tools=[
            GithubTools(
                get_repository=True,
                search_repositories=True,
                get_pull_request=True,
                get_pull_request_changes=True,
                list_branches=True,
                get_pull_request_count=True,
                get_pull_requests=True,
                get_pull_request_comments=True,
                get_pull_request_with_details=True,
                list_issues=True,
                get_issue=True,
                update_file=True,
                get_file_content=True,
                get_directory_content=True,
                search_code=True,
            ),
        ],
        markdown=True,
        debug_mode=debug_mode,
        # Include prior turns so the agent can recover the repository context.
        add_history_to_messages=True,
    )
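The instructions above ask the model to be careful with `owner/repo` spelling and formatting. A deterministic check in application code could complement that prompt-level rule. The sketch below is an assumption and not part of this commit: `parse_repo_name` is a hypothetical helper, and its character rules are a conservative approximation rather than GitHub's exact naming policy.

```python
import re
from typing import Optional, Tuple

# Conservative approximation (an assumption, not GitHub's official rule set):
# the owner is alphanumerics and hyphens; the repo also allows dots and underscores.
_REPO_RE = re.compile(r"^([A-Za-z0-9][A-Za-z0-9-]*)/([A-Za-z0-9._-]+)$")


def parse_repo_name(text: str) -> Optional[Tuple[str, str]]:
    """Split 'owner/repo' into (owner, repo), or return None if malformed."""
    match = _REPO_RE.match(text.strip())
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_repo_name("octocat/Hello-World"))
print(parse_repo_name("not a repo"))
```

Rejecting malformed names before any tool call would avoid a round trip to the GitHub API that is bound to fail.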
Lines changed: 153 additions & 0 deletions
@@ -0,0 +1,153 @@
from os import getenv

import nest_asyncio
import streamlit as st
from agents import get_github_agent
from agno.agent import Agent
from agno.utils.log import logger
from utils import (
    CUSTOM_CSS,
    about_widget,
    add_message,
    display_tool_calls,
    sidebar_widget,
)

nest_asyncio.apply()
st.set_page_config(
    page_title="GitHub Repo Analyzer",
    page_icon="👨‍💻",
    layout="wide",
)

# Load custom CSS with dark mode support
st.markdown(CUSTOM_CSS, unsafe_allow_html=True)


def main() -> None:
    ####################################################################
    # App header
    ####################################################################
    st.markdown(
        "<h1 class='main-header'>👨‍💻 GitHub Repo Analyzer</h1>", unsafe_allow_html=True
    )
    st.markdown("Analyze GitHub repositories")

    ####################################################################
    # Initialize Agent
    ####################################################################
    github_agent: Agent
    if (
        "github_agent" not in st.session_state
        or st.session_state["github_agent"] is None
    ):
        logger.info("---*--- Creating new Github agent ---*---")
        github_agent = get_github_agent()
        st.session_state["github_agent"] = github_agent
        st.session_state["messages"] = []
        st.session_state["github_token"] = getenv("GITHUB_ACCESS_TOKEN")
    else:
        github_agent = st.session_state["github_agent"]

    ####################################################################
    # Load Agent Session from the database
    ####################################################################
    try:
        st.session_state["github_agent_session_id"] = github_agent.load_session()
    except Exception:
        st.warning("Could not create Agent session, is the database running?")
        return

    ####################################################################
    # Load runs from memory (v2 Memory) only on initial load
    ####################################################################
    if github_agent.memory is not None and not st.session_state.get("messages"):
        session_id = st.session_state.get("github_agent_session_id")
        # Fetch stored runs for this session
        agent_runs = github_agent.memory.get_runs(session_id)
        if agent_runs:
            logger.debug("Loading run history")
            st.session_state["messages"] = []
            for run_response in agent_runs:
                # Iterate through stored messages in the run
                for msg in run_response.messages or []:
                    if msg.role in ["user", "assistant"] and msg.content is not None:
                        # Include any tool calls attached to this message
                        add_message(
                            msg.role, msg.content, getattr(msg, "tool_calls", None)
                        )
        else:
            logger.debug("No run history found")
            st.session_state["messages"] = []

    ####################################################################
    # Sidebar
    ####################################################################
    sidebar_widget()

    ####################################################################
    # Get user input
    ####################################################################
    if prompt := st.chat_input("👋 Ask me about GitHub repositories!"):
        add_message("user", prompt)

    ####################################################################
    # Display chat history
    ####################################################################
    for message in st.session_state["messages"]:
        if message["role"] in ["user", "assistant"]:
            _content = message["content"]
            if _content is not None:
                with st.chat_message(message["role"]):
                    # Display tool calls if they exist in the message
                    if "tool_calls" in message and message["tool_calls"]:
                        display_tool_calls(st.empty(), message["tool_calls"])
                    st.markdown(_content)

    ####################################################################
    # Generate response for user message
    ####################################################################
    last_message = (
        st.session_state["messages"][-1] if st.session_state["messages"] else None
    )
    if last_message and last_message.get("role") == "user":
        question = last_message["content"]
        with st.chat_message("assistant"):
            # Create container for tool calls
            tool_calls_container = st.empty()
            resp_container = st.empty()
            with st.spinner("🤔 Thinking..."):
                response = ""
                try:
                    # Run the agent and stream the response
                    run_response = github_agent.run(
                        question, stream=True, stream_intermediate_steps=True
                    )
                    for _resp_chunk in run_response:
                        # Display tool calls if available
                        if _resp_chunk.tools and len(_resp_chunk.tools) > 0:
                            display_tool_calls(tool_calls_container, _resp_chunk.tools)

                        # Display response if available and event is RunResponse
                        if (
                            _resp_chunk.event == "RunResponse"
                            and _resp_chunk.content is not None
                        ):
                            response += _resp_chunk.content
                            resp_container.markdown(response)

                    add_message("assistant", response, github_agent.run_response.tools)
                except Exception as e:
                    logger.exception(e)
                    error_message = f"Sorry, I encountered an error: {str(e)}"
                    add_message("assistant", error_message)
                    st.error(error_message)

    ####################################################################
    # About section
    ####################################################################
    about_widget()


if __name__ == "__main__":
    main()
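The streaming loop in `main()` appends only chunks whose event is `"RunResponse"` and whose content is not `None`. The sketch below isolates that filtering logic so it can be exercised without Streamlit or an API key. `FakeChunk` and `accumulate_response` are hypothetical stand-ins introduced for illustration, and `"OtherEvent"` is a placeholder event name; the real chunk objects come from `github_agent.run(...)`.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class FakeChunk:
    """Hypothetical stand-in exposing the two attributes the loop reads."""

    event: str
    content: Optional[str] = None


def accumulate_response(chunks: Iterable[FakeChunk]) -> str:
    """Concatenate content from "RunResponse" chunks and skip everything else."""
    response = ""
    for chunk in chunks:
        if chunk.event == "RunResponse" and chunk.content is not None:
            response += chunk.content
    return response


chunks = [
    FakeChunk("OtherEvent", "ignored"),  # placeholder for any non-response event
    FakeChunk("RunResponse", "The repo has "),
    FakeChunk("RunResponse", None),  # content-free chunk, skipped
    FakeChunk("RunResponse", "three open PRs."),
]
print(accumulate_response(chunks))
```

Keeping the accumulation in a pure function like this would make the event filter straightforward to unit test separately from the UI code.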
Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
#!/bin/bash

############################################################################
# Generate requirements.txt from requirements.in
############################################################################

echo "Generating requirements.txt"

# Resolve the directory containing this script so it can be run from anywhere
CURR_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

# Paths are quoted so the command still works if the directory contains spaces
UV_CUSTOM_COMPILE_COMMAND="./generate_requirements.sh" \
  uv pip compile "${CURR_DIR}/requirements.in" --no-cache --upgrade -o "${CURR_DIR}/requirements.txt"
Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
# Direct dependencies for the GitHub Repo Chat App
agno>=0.1.0
PyGithub>=2.1.1
python-dotenv>=1.0.0
matplotlib>=3.7.2
pandas>=2.0.3
streamlit>=1.24.0
openai>=1.67.0
