GitHub Organization Repository Cloner - Automatically clone and audit all repositories from a GitHub organization with comprehensive standards compliance checking.
A Bash script that clones every accessible repository in a GitHub organization, sorts the clones into public and private directories, and audits each repository for standards compliance.
- Mass Repository Cloning: Clone all accessible repositories from any GitHub organization
- Smart Organization: Automatically separates public and private repositories into dedicated directories
- Automatic Updates: Existing repositories are automatically updated with git pull
- Authentication Aware: Works with or without GitHub authentication (limited to public repos when unauthenticated)
- Comprehensive Sanity Checks: Audit repositories for standard files and best practices with line-by-line output
- Flexible Configuration: Environment-based configuration for easy customization
- Robust Error Handling: Graceful handling of failed clones with HTTPS fallback
- Colored Output: Clear, colored terminal output for better visibility
- Install Prerequisites:
  # Install GitHub CLI
  brew install gh # macOS
  # or
  sudo apt install gh # Ubuntu/Debian
  # Install jq for JSON parsing
  brew install jq # macOS
  # or
  sudo apt install jq # Ubuntu/Debian
- Configure the Script:
  # Create configuration file
  cp config.env.example config.env
  # Edit with your organization name
  vim config.env
- Run the Script:
  # Clone all repositories
  ./gh_repo_cloner.sh
  # Or perform sanity checks
  ./gh_repo_cloner.sh --sanity-check
- GitHub CLI (gh) - For repository listing and authentication
- Git - For cloning repositories
- jq - For JSON parsing
- Bash 4.0+ - For script execution
Create a config.env file with the following variables:
# Required: Organization name
ORG="your-organization-name"
# Optional: Directory paths (defaults shown)
PUB_DIR="./pub"
PRIV_DIR="./priv"
# Optional: Colors for output (defaults shown)
RED='\033[0;31m'
GREEN='\033[0;32m'
YELLOW='\033[1;33m'
BLUE='\033[0;34m'
NC='\033[0m' # No Color
Variable | Description | Default | Required |
---|---|---|---|
ORG | GitHub organization name | - | Yes |
PUB_DIR | Directory for public repositories | ./pub | No |
PRIV_DIR | Directory for private repositories | ./priv | No |
RED, GREEN, etc. | Terminal colors | ANSI codes | No |
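The defaults in the table can be applied with ordinary parameter expansion. The following is a minimal sketch of how a loader might behave, not the script's actual code: the function name load_config and the error messages are hypothetical, and only ORG, PUB_DIR, and PRIV_DIR come from the configuration described above.

```bash
# Hypothetical helper: read config.env and apply the documented defaults.
load_config() {
  local config_file="${1:-./config.env}"

  if [ ! -f "$config_file" ]; then
    echo "[ERROR] Configuration file not found: $config_file" >&2
    return 1
  fi

  # Load the variables defined in the file into the current shell.
  . "$config_file"

  # ORG is the only required variable.
  if [ -z "${ORG:-}" ]; then
    echo "[ERROR] ORG must be set in $config_file" >&2
    return 1
  fi

  # Optional variables keep the value from the file or take the default.
  : "${PUB_DIR:=./pub}"
  : "${PRIV_DIR:=./priv}"
}
```

Calling load_config with no argument reads ./config.env; passing a path makes it easy to try an alternative configuration without touching the real one.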
# Show help
./gh_repo_cloner.sh --help
# Clone all repositories
./gh_repo_cloner.sh
# Perform sanity checks on repositories
./gh_repo_cloner.sh --sanity-check
Option | Description |
---|---|
-s, --sanity-check | Perform sanity checks on repositories for common files |
-h, --help | Show help message and exit |
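For readers who want to see how the two options could be handled, here is a small sketch of an argument parser. It is an illustration under assumptions: the function name parse_args, the MODE variable, and the exit code for unknown options are hypothetical and may not match the script.

```bash
# Hypothetical parser: sets MODE to "clone" (default), "sanity", or "help".
parse_args() {
  MODE="clone"
  while [ "$#" -gt 0 ]; do
    case "$1" in
      -s|--sanity-check) MODE="sanity" ;;
      -h|--help)         MODE="help" ;;
      *)
        echo "[ERROR] Unknown option: $1" >&2
        return 2
        ;;
    esac
    shift
  done
}
```

A caller would run parse_args "$@" and then branch on MODE.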
The script can audit repositories for compliance with common standards and best practices:
Category | Files/Directories |
---|---|
License | LICENSE , LICENSE.txt , LICENSE.md , COPYING , COPYRIGHT (with content validation) |
Documentation | README.md , README.txt , README |
Changelog | CHANGELOG.md , HISTORY.md , RELEASES.md |
Contributing | CONTRIBUTING.md , CONTRIBUTING.txt |
Security | SECURITY.md , SECURITY.txt |
Code of Conduct | CODE_OF_CONDUCT.md |
Git Configuration | .gitignore |
Editor Configuration | .editorconfig |
Documentation Directory | docs/ , documentation/ |
GitHub Templates | .github/ISSUE_TEMPLATE/ , .github/PULL_REQUEST_TEMPLATE.md |
The script automatically detects various CI/CD configurations:
- GitHub Actions - .github/workflows/
- GitLab CI - .gitlab-ci.yml
- Travis CI - .travis.yml
- Jenkins - Jenkinsfile
- CircleCI - .circleci/
- Azure Pipelines - azure-pipelines.yml
- Buildkite - .buildkite/
- Bitbucket Pipelines - bitbucket-pipelines.yml
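Detection of this kind comes down to testing for a handful of well-known paths. The sketch below shows one way to do it; the function name has_ci_config is hypothetical, and the script may check the paths differently.

```bash
# Hypothetical helper: succeeds if any of the listed CI/CD configurations exists.
has_ci_config() {
  local repo_dir="$1"
  [ -d "$repo_dir/.github/workflows" ] ||
    [ -f "$repo_dir/.gitlab-ci.yml" ] ||
    [ -f "$repo_dir/.travis.yml" ] ||
    [ -f "$repo_dir/Jenkinsfile" ] ||
    [ -d "$repo_dir/.circleci" ] ||
    [ -f "$repo_dir/azure-pipelines.yml" ] ||
    [ -d "$repo_dir/.buildkite" ] ||
    [ -f "$repo_dir/bitbucket-pipelines.yml" ]
}
```

For example, has_ci_config ./pub/awesome-project returns success when the repository contains at least one of these files or directories.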
The script goes beyond just checking for the presence of a LICENSE file - it also validates that the license has been properly filled out. It detects common template placeholders that indicate an incomplete license:
Template Placeholders Detected:
- <year>, [year], YYYY - Year placeholders
- <name of author>, <author>, <owner> - Author placeholders
- <name of copyright owner>, <copyright holders> - Copyright placeholders
- COPYRIGHT_HOLDER, AUTHOR_NAME, YOUR_NAME, YOUR NAME - Common template variables
LICENSE File Variants Detected:
- Standard names: LICENSE, LICENSE.txt, LICENSE.md, LICENSE.rst
- Case variations: license, License
- Alternative names: COPYING, COPYRIGHT (common in some projects)
- All checked with proper file type validation (not directories or symlinks)
LICENSE Status Indicators:
- ✓ LICENSE - File present and properly filled out
- ⚠ LICENSE (contains template placeholders) - File present but needs customization
- ✗ LICENSE - File missing entirely
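The three status indicators map naturally onto a function that first looks for a license file and then searches it for placeholders. What follows is a sketch under assumptions rather than the script's implementation: the function name license_status and its "ok" / "template" / "missing" output are hypothetical, and case-sensitive fixed-string matching is a choice made here for simplicity.

```bash
# Hypothetical helper: prints "ok", "template", or "missing" for a repository.
license_status() {
  local repo_dir="$1" name file
  for name in LICENSE LICENSE.txt LICENSE.md LICENSE.rst license License COPYING COPYRIGHT; do
    file="$repo_dir/$name"
    # Accept regular files only; skip directories and symlinks.
    if [ -f "$file" ] && [ ! -L "$file" ]; then
      # -F treats each pattern as a literal string, so "[year]" is not a bracket expression.
      if grep -qF \
        -e '<year>' -e '[year]' -e 'YYYY' \
        -e '<name of author>' -e '<author>' -e '<owner>' \
        -e '<name of copyright owner>' -e '<copyright holders>' \
        -e 'COPYRIGHT_HOLDER' -e 'AUTHOR_NAME' -e 'YOUR_NAME' -e 'YOUR NAME' \
        "$file"; then
        echo "template"
      else
        echo "ok"
      fi
      return 0
    fi
  done
  echo "missing"
}
```

A caller could translate the three results into the ✓, ⚠, and ✗ lines shown in the sanity check output.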
[INFO] GitHub Organization Repository Cloner
[INFO] ======================================
[SUCCESS] Authenticated with GitHub
[INFO] Authenticated as: username
[INFO] Organization: awesome-org
[INFO] Found 25 repositories
[INFO] Repository awesome-project already exists, updating...
[SUCCESS] ✓ Updated awesome-project in ./pub/
[SUCCESS] ✓ Cloned new-secret-sauce to ./priv/
[ERROR] ✗ Failed to update modified-repo (may have local changes or connection issues)
[SUCCESS] Cloning completed!
[INFO] Summary:
[INFO] Public repositories cloned: 12
[INFO] Private repositories cloned: 8
[WARNING] Failed to clone: 5
[INFO] Checking public repositories in ./pub:
awesome-project:
✓ LICENSE
✓ CHANGELOG
✓ CONTRIBUTING
✓ README
✓ GITIGNORE
✓ SECURITY
✓ CODE_OF_CONDUCT
✓ EDITORCONFIG
✓ DOCS
✓ ISSUE_TEMPLATES
✓ PR_TEMPLATE
✓ CI/CD
legacy-tool:
⚠ LICENSE (contains template placeholders)
✗ CHANGELOG
✗ CONTRIBUTING
✓ README
✓ GITIGNORE
✗ SECURITY
✗ CODE_OF_CONDUCT
✗ EDITORCONFIG
✗ DOCS
✗ ISSUE_TEMPLATES
✗ PR_TEMPLATE
✓ CI/CD
[INFO] Sanity Check Summary:
[INFO] =====================
[INFO] Total repositories checked: 25
[SUCCESS] Repositories with all files: 8
[WARNING] Repositories missing files: 17
[INFO] Legend:
[INFO] ✓ = File/directory present and complete
[INFO] ✗ = File/directory missing
[INFO] ⚠ = LICENSE present but contains template placeholders
# Login with GitHub CLI
gh auth login
# Check authentication status
gh auth status
Authentication | Public Repos | Private Repos | Rate Limits |
---|---|---|---|
Authenticated | Full access | Access based on permissions | 5,000/hour |
Not authenticated | Read-only access | No access | 60/hour |
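A script can tell the two rows of this table apart by checking the exit status of gh auth status, which is non-zero when no account is logged in. The sketch below assumes that approach; the function names is_authenticated and report_auth are hypothetical, and the warning text is illustrative rather than the script's actual message.

```bash
# Hypothetical helper: succeeds when the GitHub CLI has a logged-in account.
is_authenticated() {
  gh auth status >/dev/null 2>&1
}

# Hypothetical helper: prints a status line in the style of the script's output.
report_auth() {
  if is_authenticated; then
    echo "[SUCCESS] Authenticated with GitHub"
  else
    echo "[WARNING] Not authenticated: only public repositories will be listed"
    echo "[INFO] Run 'gh auth login' to access private repositories and raise the rate limit"
  fi
}
```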
The script includes robust error handling:
- SSH to HTTPS Fallback: Automatically retries failed SSH clones using HTTPS
- Existing Repository Updates: Automatically pulls latest changes for existing repositories
- Permission Validation: Clear error messages for access issues
- Rate Limit Awareness: Warns about API rate limits for unauthenticated users
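The SSH to HTTPS fallback can be pictured as two clone attempts in sequence. This is a minimal sketch assuming standard github.com URLs; the function name clone_with_fallback and the warning text are hypothetical, and the script may build its URLs from the GitHub CLI output instead.

```bash
# Hypothetical helper: try SSH first, then retry the same repository over HTTPS.
clone_with_fallback() {
  local org="$1" repo="$2" dest="$3"
  local ssh_url="git@github.com:${org}/${repo}.git"
  local https_url="https://github.com/${org}/${repo}.git"

  if git clone --quiet "$ssh_url" "$dest" 2>/dev/null; then
    return 0
  fi

  echo "[WARNING] SSH clone failed for ${repo}, retrying over HTTPS" >&2
  # The function's exit status is that of the HTTPS attempt.
  git clone --quiet "$https_url" "$dest"
}
```

For example, clone_with_fallback awesome-org awesome-project ./pub/awesome-project would end up using HTTPS on a machine with no SSH key configured.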
After running the script, your directory structure will look like:
project-root/
├── gh_repo_cloner.sh
├── config.env
├── pub/
│ ├── public-repo-1/
│ ├── public-repo-2/
│ └── ...
└── priv/
├── private-repo-1/
├── private-repo-2/
└── ...
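The split between pub/ and priv/ follows from each repository's visibility. The sketch below shows one way to compute the destination paths from a repository listing. It is an assumption-laden illustration: the function names are hypothetical, the limit of 1000 is arbitrary, and the script may request different JSON fields from gh repo list.

```bash
# Hypothetical helper: print "name<TAB>isPrivate" for each repository in an organization.
list_repos() {
  gh repo list "$1" --limit 1000 --json name,isPrivate |
    jq -r '.[] | [.name, .isPrivate] | @tsv'
}

# Hypothetical helper: choose the base directory from the visibility flag.
target_dir_for() {
  if [ "$1" = "true" ]; then
    printf '%s\n' "${PRIV_DIR:-./priv}"
  else
    printf '%s\n' "${PUB_DIR:-./pub}"
  fi
}

# Hypothetical helper: read "name<TAB>isPrivate" lines and print each clone destination.
plan_clones() {
  local name is_private
  while IFS="$(printf '\t')" read -r name is_private; do
    printf '%s/%s\n' "$(target_dir_for "$is_private")" "$name"
  done
}
```

Running list_repos "$ORG" | plan_clones prints one destination path per repository without cloning anything, which can be a convenient way to preview the layout.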
# Test configuration loading
./gh_repo_cloner.sh --help
# Test authentication check
gh auth status
# Dry run sanity checks
./gh_repo_cloner.sh --sanity-check
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Regular Audits: Run sanity checks monthly to ensure repository standards
- Standardize Templates: Use the script to identify repos missing issue/PR templates
- Security Compliance: Ensure all repositories have SECURITY.md files
- Documentation: Verify all projects have proper README.md and docs/ directories
- Batch Updates: Use the script to identify repositories needing standardization
- Regular Sync: Run the script regularly to keep local copies up to date
- Clean Working Directory: Ensure local repositories have no uncommitted changes before running updates
- Onboarding: Include sanity check results in new developer onboarding
- Compliance: Track organization-wide compliance with repository standards
- Large Organizations: For organizations with 1000+ repositories, consider running in smaller batches
- Private Repository Access: Requires appropriate GitHub permissions
- Storage Space: Cloning many repositories requires significant disk space
- Network Usage: Initial cloning can consume significant bandwidth
"Organization not found"
- Verify the organization name in config.env
- Check if the organization exists and is accessible
"No repositories found"
- Organization may have only private repositories (authenticate with gh auth login)
- Organization name may be incorrect
"Permission denied"
- SSH key not configured properly
- Use gh auth login for authentication
- Check repository access permissions
"Rate limit exceeded"
- Authenticate with GitHub CLI: gh auth login
- Wait for rate limit reset (shown in error message)
"Failed to update repository"
- Repository may have uncommitted local changes
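- Check for local changes or merge conflicts: cd repo_directory && git status
- Reset local changes if safe: git reset --hard origin/main (this discards uncommitted work; substitute the repository's default branch if it is not main)
- May indicate network connectivity issues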
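Failed updates caused by local changes can be avoided by checking the working tree before pulling. The following sketch illustrates the idea and is not taken from the script: the function names are hypothetical, and the --ff-only flag is an added assumption that makes the pull refuse to create merge commits.

```bash
# Hypothetical helper: succeeds when the repository has no uncommitted or untracked changes.
is_clean_worktree() {
  [ -z "$(git -C "$1" status --porcelain 2>/dev/null)" ]
}

# Hypothetical helper: pull only when the working tree is clean.
update_repo() {
  local repo_dir="$1"
  if ! is_clean_worktree "$repo_dir"; then
    echo "[WARNING] Skipping $repo_dir: uncommitted local changes" >&2
    return 1
  fi
  git -C "$repo_dir" pull --ff-only --quiet
}
```

With this guard, a repository with local edits is skipped with a warning and left exactly as it was.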
"LICENSE shows warning (⚠) symbol"
- LICENSE file contains template placeholders like <year> or <name of author>
- Edit the LICENSE file to replace placeholders with actual values
- Common placeholders: <year> → actual year, <name of author> → your name/organization
"LICENSE shows missing (✗) but file exists"
- LICENSE file may have an unexpected name or extension
- Supported names: LICENSE, LICENSE.txt, LICENSE.md, license, License, COPYING, COPYRIGHT
- Check file permissions (must be readable)
- Verify file is not a directory or symlink
This project is licensed under the MIT License - see the LICENSE file for details.
- Issues: Report bugs and request features via GitHub Issues
- Discussions: Join conversations in GitHub Discussions
- Documentation: Check this README and inline script comments
Made with care for better repository management