This script downloads all your GitHub repositories, analyzes their code using OpenAI, and automatically updates their topics (tags) on GitHub based on the code content.
- Authenticates with GitHub and OpenAI
- Clones all your (non-fork) repositories
- Analyzes code to suggest relevant topics using GPT-4
- Updates topics for each repo via the GitHub API (see the API sketch after this list)
- NEW: Optionally only process public repos or repos without tags (topics)
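
The topic update itself goes through GitHub's "replace all repository topics" REST endpoint. Below is a minimal sketch of that step (not the script's exact code; the `set_repo_topics` helper is an illustrative name, and `requests` is assumed to be available):

```python
# Minimal sketch of the topic-update step via the GitHub REST API.
# The token is the same GITHUB_TOKEN configured in the .env file below.
import requests

def set_repo_topics(owner: str, repo: str, topics: list[str], token: str) -> None:
    """Replace the repository's topics with the suggested list."""
    url = f"https://api.github.com/repos/{owner}/{repo}/topics"
    response = requests.put(
        url,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
        json={"names": topics},  # GitHub expects lowercase, hyphenated topic names
        timeout=30,
    )
    response.raise_for_status()

# Example (hypothetical repo name):
# set_repo_topics("your_github_username", "my-repo", ["cli", "automation"], token)
```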
Create a `.env` file in the project root with:

```
GITHUB_USERNAME=your_github_username
GITHUB_TOKEN=your_github_token
OPENAI_API_KEY=your_openai_api_key
```
- `GITHUB_TOKEN`: Needs the `repo` scope for private repos and topic editing.
- `OPENAI_API_KEY`: Needs access to GPT-4.
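
A minimal sketch of how these variables might be loaded (assuming `python-dotenv` is among the dependencies; the actual script may read its configuration differently):

```python
# Load credentials from .env into the environment (illustrative sketch).
import os
from dotenv import load_dotenv

load_dotenv()  # reads .env from the current directory

GITHUB_USERNAME = os.environ["GITHUB_USERNAME"]
GITHUB_TOKEN = os.environ["GITHUB_TOKEN"]
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
```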
Install the dependencies:

```bash
pip install -r requirements.txt
```
Run the script:

```bash
python auto_tag_github_repos.py [--only-public] [--only-untagged]
```
- `--only-public`: Only process public repositories (skip private repos)
- `--only-untagged`: Only process repositories that have no topics/tags set
- You can use both flags together to process only public, untagged repos, as shown below
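
For example, to process only public repositories that currently have no topics:

```bash
python auto_tag_github_repos.py --only-public --only-untagged
```

Both flags are plain boolean switches; a minimal `argparse`-based sketch of how they might be parsed (the actual script's parser may differ):

```python
# Illustrative CLI definition for the two filter flags.
import argparse

parser = argparse.ArgumentParser(description="Auto-tag GitHub repositories with AI-suggested topics")
parser.add_argument("--only-public", action="store_true",
                    help="only process public repositories")
parser.add_argument("--only-untagged", action="store_true",
                    help="only process repositories with no topics set")
args = parser.parse_args()
```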
- The script skips forked repositories by default.
- It only scans a sample of files per repo to stay within OpenAI prompt limits (see the sampling sketch after these notes).
- Make sure you have `git` installed and available in your PATH.
- Be very careful about your spend and pick the model deliberately! Each repo analysis sends code to OpenAI, and the API costs add up quickly. Monitor your usage and set spending limits as needed. I wasted money during testing by using an expensive model.
- You can check your OpenAI usage and billing at: OpenAI Billing Overview
- Never share your `.env` file or API keys.
- The script embeds your GitHub token in the clone URL for non-interactive cloning (see the sketch after these notes).
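
The sampling mentioned above can be pictured as follows; this is a hedged sketch, and the file extensions, file count, and per-file character cap are illustrative assumptions rather than the script's actual values:

```python
# Collect a small, truncated sample of source files from a cloned repo
# so the prompt sent to OpenAI stays within size limits (illustrative).
from pathlib import Path

SOURCE_EXTENSIONS = {".py", ".js", ".ts", ".go", ".rs", ".java", ".rb"}
MAX_FILES = 10             # only a handful of files per repository
MAX_CHARS_PER_FILE = 2000  # truncate long files before prompting

def sample_repo_code(repo_path: str) -> str:
    """Concatenate truncated snippets from a few source files in the repo."""
    snippets = []
    for path in sorted(Path(repo_path).rglob("*")):
        if len(snippets) >= MAX_FILES:
            break
        if path.is_file() and path.suffix in SOURCE_EXTENSIONS:
            text = path.read_text(errors="ignore")[:MAX_CHARS_PER_FILE]
            snippets.append(f"# File: {path.relative_to(repo_path)}\n{text}")
    return "\n\n".join(snippets)
```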
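
The token-embedded clone uses the standard HTTPS-with-credentials URL form; a sketch under that assumption (the helper name is illustrative):

```python
# Clone a repo non-interactively by embedding the token in the HTTPS URL.
import subprocess

def clone_repo(username: str, token: str, repo_name: str, dest: str) -> None:
    clone_url = f"https://{username}:{token}@github.com/{username}/{repo_name}.git"
    subprocess.run(["git", "clone", clone_url, dest], check=True)
```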
License: MIT