
diarm001/Github_Star_Analysis


Starred GitHub Repo Analysis

Ever wonder what patterns lie within your GitHub stars? This project automatically fetches your starred repositories and clusters them by topic, revealing your core interests and tech focus.

Turn a long list of repositories into a clear, high-level overview of your development habits.

✨ Features

Automated Data Collection: Fetches your complete list of starred repositories from the GitHub API, handling pagination automatically.

Intelligent Clustering: Uses machine learning to group repositories into distinct clusters based on their topics.

Insightful Output: Generates a clean text file outlining each cluster, its most common topics, and the repositories within it.

🚀 Getting Started

Prerequisites

You'll need Python 3 and a few libraries to run these scripts. You can install them with pip:

pip install requests pandas scikit-learn

You also need a GitHub Personal Access Token (PAT) with the public_repo scope.

Step 1: Gather Your Data

First, run the data collection script to fetch your starred repos. It will prompt you for your GitHub username and PAT.

python github_star_analysis.py

This will create a YOUR_USERNAME_starred_repos.json file in your directory.
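For readers curious how a paginated fetch like this can work, here is a minimal sketch. It is not the code in github_star_analysis.py; the function names `make_github_fetcher` and `fetch_all_starred` are hypothetical, and the sketch assumes the public `GET /users/{username}/starred` endpoint with its `page` and `per_page` query parameters.

```python
from typing import Callable, Dict, List

API_URL = "https://api.github.com/users/{username}/starred"


def make_github_fetcher(username: str, token: str, per_page: int = 100) -> Callable[[int], List[Dict]]:
    """Return a function that downloads one page of starred repos (hypothetical helper)."""

    def fetch_page(page: int) -> List[Dict]:
        # Imported lazily so the pagination loop below can be tried without network access.
        import requests

        response = requests.get(
            API_URL.format(username=username),
            headers={
                "Authorization": f"Bearer {token}",
                "Accept": "application/vnd.github+json",
            },
            params={"per_page": per_page, "page": page},
            timeout=30,
        )
        response.raise_for_status()
        return response.json()

    return fetch_page


def fetch_all_starred(fetch_page: Callable[[int], List[Dict]], per_page: int = 100) -> List[Dict]:
    """Request pages 1, 2, 3, ... until an empty or short page signals the end."""
    repos: List[Dict] = []
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        repos.extend(batch)
        if len(batch) < per_page:
            # A page with fewer items than requested is the last one.
            break
        page += 1
    return repos
```

Under these assumptions, `fetch_all_starred(make_github_fetcher(username, token))` returns one list of repository dictionaries, which could then be written to YOUR_USERNAME_starred_repos.json with `json.dump`. Passing the page fetcher in as an argument keeps the loop easy to test with canned data.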

Step 2: Analyze and Cluster

Next, run the analysis script. It will read your JSON file, perform the clustering, and output the results to a text file.

python github_star_cluster.py

The script will ask you how many clusters you'd like to create (e.g., 5-10). The final output will be saved to clustering_analysis_YOUR_USERNAME.txt.
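The README does not say which algorithm github_star_cluster.py uses, so the following is only a sketch of one common approach: TF-IDF vectors built from each repository's topics, grouped with scikit-learn's KMeans. The function name `cluster_by_topics` is hypothetical, and the sketch assumes each repository is a dictionary with a `topics` list, as in the GitHub API response.

```python
from typing import Dict, List

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def cluster_by_topics(repos: List[Dict], n_clusters: int, seed: int = 0) -> List[int]:
    """Return one cluster label per repository, in the same order as `repos`."""
    # Treat each repository's topic list as a small space-separated document.
    docs = [" ".join(repo.get("topics") or []) for repo in repos]

    # Split on spaces only, so hyphenated topics such as "machine-learning" stay whole.
    vectorizer = TfidfVectorizer(token_pattern=r"[^ ]+")
    matrix = vectorizer.fit_transform(docs)

    # A fixed random_state makes repeated runs give the same grouping.
    model = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed)
    return model.fit_predict(matrix).tolist()
```

In this sketch, `n_clusters` plays the role of the number the script asks for. KMeans raises an error when there are fewer repositories than clusters, and repositories without topics all map to the same empty vector, so filtering them out first may give cleaner groups.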

Example Output

Here's a glimpse of what the analysis looks like:

Clustering complete. Found 10 clusters:

--- Cluster 1 ---
Most common topics: rust, webassembly, kafka, cryptography, emscripten

Repositories in this cluster:

  • alexpasmantier/television
  • prefix-dev/pixi
  • gosub-io/gosub-engine
  ...
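A report in this shape can be built from a list of repositories and their cluster labels with the standard library alone. The sketch below is an illustration, not the project's own code: `format_report` is a hypothetical name, and it assumes each repository dictionary carries `full_name` and `topics` keys as in the GitHub API response.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def format_report(repos: List[Dict], labels: List[int], top_n: int = 5) -> str:
    """Build a text report listing each cluster's top topics and member repositories."""
    # Group repositories by their cluster label.
    clusters = defaultdict(list)
    for repo, label in zip(repos, labels):
        clusters[label].append(repo)

    lines = [f"Clustering complete. Found {len(clusters)} clusters:", ""]
    for number, label in enumerate(sorted(clusters), start=1):
        members = clusters[label]
        # Count how often each topic appears across the cluster's repositories.
        counts = Counter(topic for repo in members for topic in repo.get("topics") or [])
        top_topics = ", ".join(topic for topic, _ in counts.most_common(top_n))

        lines.append(f"--- Cluster {number} ---")
        lines.append(f"Most common topics: {top_topics}")
        lines.append("")
        lines.append("Repositories in this cluster:")
        lines.append("")
        lines.extend(f"  • {repo['full_name']}" for repo in members)
        lines.append("")
    return "\n".join(lines)
```

The returned string could then be written to clustering_analysis_YOUR_USERNAME.txt. Clusters are numbered from 1 in the order of their sorted labels.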

🧠 Insights

The clusters reveal your primary areas of interest. You can use this analysis to:

Identify Your Niche: Confirm your focus in areas like AI, data science, or web development.

Discover Connections: Find unexpected links between different projects you've starred.

Plan Your Learning: Use the clusters as a guide for what topics to explore next.

This is just the start: the raw JSON data holds plenty of material for deeper analysis!
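As one example of such further analysis, the sketch below counts the primary language of each starred repository. It assumes the JSON file holds a list of repository objects as returned by the GitHub API, each with a `language` field that may be null; the function names `load_repos` and `language_counts` are hypothetical.

```python
import json
from collections import Counter
from typing import Dict, List


def load_repos(path: str) -> List[Dict]:
    """Load the list of starred repositories saved in Step 1."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def language_counts(repos: List[Dict]) -> Counter:
    """Count repositories per primary language, grouping missing values as 'Unknown'."""
    return Counter(repo.get("language") or "Unknown" for repo in repos)
```

For instance, `language_counts(load_repos("YOUR_USERNAME_starred_repos.json")).most_common(10)` would list the ten languages that appear most often among your stars.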

About

Python scripts to extract and analyse your GitHub stars.
