docs-kb is a CLI tool that builds a knowledge base from open-source documentation using MindsDB Knowledge Bases and Ollama, and serves it to AI agents through an MCP (Model Context Protocol) server.
- Intelligent Document Discovery: Automatically finds and processes Markdown files (.md, .mdx) from GitHub repositories
- Local AI-Powered Search: Uses Ollama with nomic-embed-text embeddings and gemma2 reranking for privacy-focused AI
- Interactive Chat Interface: Query your documentation with natural language
- Smart Sync: Efficiently syncs repository changes by comparing file SHAs
- Repository Management: Track multiple repositories with detailed metadata
- Local Database: SQLite-based storage for repository metadata and file tracking
- MCP Server: Model Context Protocol server for integration with AI assistants like Claude Desktop
- GitHub Integration: Create GitHub clients directly in MindsDB for enhanced repository access
Install and run Ollama locally:
# Install required models
ollama pull nomic-embed-text
ollama pull gemma2
# Ensure Ollama is running on http://localhost:11434
ollama serve
Install and run MindsDB (see the MindsDB docs).
For private repositories or to avoid rate limits:
export GITHUB_TOKEN=your_github_token
# Clone the repository
git clone <your-repo-url>
cd docs-kb
# Install with uv
uv pip install -e .
# Or install with pip
pip install -e .
# Ingest documentation from a GitHub repository
docs-kb ingest owner/repository-name
# Ingest from a specific branch
docs-kb ingest owner/repository-name --branch develop
# Create GitHub client in MindsDB during ingestion
docs-kb ingest owner/repository-name --mindsdb-github-client
# Start interactive chat with your documentation
docs-kb query
# View all ingested repositories
docs-kb list
# Sync, delete, or manage repositories
docs-kb manage
# Start MCP server for AI assistant integration
docs-kb start-mcp-server
# Start on custom host and port
docs-kb start-mcp-server --host 0.0.0.0 --port 8080
Ingest files from a GitHub repository into the knowledge base.
docs-kb ingest REPO_NAME [OPTIONS]
Arguments:
REPO_NAME GitHub repository name (e.g., 'owner/repo') [required]
Options:
-b, --branch TEXT Branch to ingest from [default: main]
-m, --mindsdb-github-client Create GitHub client in MindsDB server
--help Show this message and exit
Examples:
# Ingest from main branch
docs-kb ingest microsoft/vscode
# Ingest from specific branch
docs-kb ingest facebook/react --branch canary
# Ingest with GitHub client creation in MindsDB
docs-kb ingest microsoft/vscode --mindsdb-github-client
# Ingest with GitHub token for private repos
GITHUB_TOKEN=your_token docs-kb ingest private-org/private-repo
Interactive chat interface to query your documentation.
docs-kb query [OPTIONS]
Options:
--help Show this message and exit
Usage:
- Select a repository from the list
- Ask questions in natural language
- Type 'exit', 'quit', or press Ctrl+C to end the session
Example Session:
🗣️ You: How do I set up authentication?
🤖 Bot: Based on the documentation, here's how to set up authentication...
🗣️ You: What are the available configuration options?
🤖 Bot: The available configuration options include...
Display all ingested repositories with their metadata.
docs-kb list [OPTIONS]
Options:
--help Show this message and exit
Output includes:
- Repository name and branch
- Knowledge base name
- Number of files
- Creation date
- Last sync date
Manage existing repositories with options to sync or delete.
docs-kb manage [OPTIONS]
Options:
--help Show this message and exit
Available Actions:
- Sync: Update repository with latest changes from GitHub
- Delete: Remove repository and its knowledge base permanently
Start the Model Context Protocol (MCP) server for AI assistant integration.
docs-kb start-mcp-server [OPTIONS]
Options:
-h, --host TEXT Host to bind to [default: localhost]
-p, --port INTEGER Port to bind to [default: 8000]
--help Show this message and exit
Examples:
# Start server on default localhost:8000
docs-kb start-mcp-server
# Start server on custom host and port
docs-kb start-mcp-server --host 0.0.0.0 --port 8080
# Start server accessible from other machines
docs-kb start-mcp-server --host 0.0.0.0
Show version and project information.
docs-kb version [OPTIONS]
Options:
--help Show this message and exit
The docs-kb MCP server gives AI assistants such as Claude Desktop tools for querying your documentation knowledge base and reading files from GitHub repositories.
Get a comprehensive usage guide for the MCP server.
List all ingested repositories in the knowledge base.
Returns:
- Repository ID, name, branch
- Knowledge base name
- Creation and ingestion timestamps
- File count
Query a repository's knowledge base with natural language.
Parameters:
- repo_name: Repository name (e.g., 'owner/repo')
- branch: Repository branch
- query: Natural language search query
- limit: Maximum results to return
Returns:
- Search results with content and metadata
- Relevance scores and source information
- Total results count
Browse GitHub repository file structure with filtering.
Parameters:
- repo_name: Repository name
- branch: Repository branch (default: 'main')
- file_extensions: File types to include (e.g., ['.md', '.py'])
- path_filter: Path prefix filter (e.g., 'docs/')
Returns:
- Filtered list of files
- Total file count
- Applied filters summary
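The two filters described above can be sketched in a few lines of Python. This is only an illustration of extension and path-prefix filtering, not the tool's actual code: the function name `filter_paths` and the sample file list are hypothetical, and only the parameter names `file_extensions` and `path_filter` come from the tool description.

```python
from typing import Iterable, Optional, Sequence


def filter_paths(
    paths: Iterable[str],
    file_extensions: Optional[Sequence[str]] = None,
    path_filter: Optional[str] = None,
) -> list[str]:
    """Keep paths that match the extension list and the path prefix."""
    # Normalize extensions to lowercase so '.MD' and '.md' are treated alike.
    extensions = tuple(ext.lower() for ext in file_extensions or ())
    selected = []
    for path in paths:
        # Skip files outside the requested directory prefix, if one was given.
        if path_filter and not path.startswith(path_filter):
            continue
        # Skip files whose extension is not in the requested list, if one was given.
        if extensions and not path.lower().endswith(extensions):
            continue
        selected.append(path)
    return selected


# Hypothetical repository listing used only for demonstration.
files = ["README.md", "docs/intro.md", "docs/api.mdx", "src/app.py"]
print(filter_paths(files, file_extensions=[".md", ".mdx"], path_filter="docs/"))
# -> ['docs/intro.md', 'docs/api.mdx']
```

Leaving both filters unset returns every path unchanged, which matches the idea that the filters are optional.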
Retrieve content of a specific file from GitHub.
Parameters:
- repo_name: Repository name
- file_path: Path to the file
- branch: Repository branch
Returns:
- File content and metadata
- File information (size, SHA, etc.)
- Encoding details
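For context on the "encoding details": GitHub's REST contents endpoint returns file bodies base64-encoded, with an `encoding` field alongside `content`. Whether docs-kb reads files through that endpoint is an assumption here; the sketch below only shows how such a payload could be decoded, and `decode_github_content` and the sample payload are hypothetical.

```python
import base64


def decode_github_content(payload: dict) -> str:
    """Decode the text of a file from a GitHub contents-API style payload."""
    encoding = payload.get("encoding")
    if encoding != "base64":
        raise ValueError(f"unsupported encoding: {encoding!r}")
    # GitHub wraps the base64 text across lines; b64decode ignores the newlines.
    raw = base64.b64decode(payload["content"])
    return raw.decode("utf-8")


# Hand-built sample payload (not real API output), for demonstration only.
sample = {
    "path": "docs/intro.md",
    "encoding": "base64",
    "content": base64.b64encode(b"# Intro\n").decode("ascii"),
}
print(decode_github_content(sample))
```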
Load multiple files from a GitHub repository concurrently.
Parameters:
- repo_name: Repository name
- file_paths: List of file paths to load
- branch: Repository branch
- max_concurrent: Maximum concurrent requests
Returns:
- Successfully loaded files
- Failed file paths
- Loading statistics
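One common way to cap concurrent requests is an `asyncio.Semaphore`. The sketch below shows that pattern with a stand-in fetch function so it runs without network access. It is not taken from the docs-kb source: `load_files`, `fake_fetch`, and the shape of the returned dictionary are hypothetical, and only the `max_concurrent` idea comes from the tool description.

```python
import asyncio
from typing import Awaitable, Callable


async def load_files(
    file_paths: list[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrent: int = 5,
) -> dict:
    """Fetch many files at once, never running more than max_concurrent fetches."""
    semaphore = asyncio.Semaphore(max_concurrent)
    loaded: dict[str, str] = {}
    failed: list[str] = []

    async def load_one(path: str) -> None:
        # The semaphore blocks here once max_concurrent fetches are in flight.
        async with semaphore:
            try:
                loaded[path] = await fetch(path)
            except Exception:
                # Record the failure and keep going with the other files.
                failed.append(path)

    await asyncio.gather(*(load_one(path) for path in file_paths))
    return {"loaded": loaded, "failed": failed, "total": len(file_paths)}


async def fake_fetch(path: str) -> str:
    """Stand-in for a real GitHub request; fails for one hypothetical path."""
    await asyncio.sleep(0)
    if path == "missing.md":
        raise FileNotFoundError(path)
    return f"content of {path}"


result = asyncio.run(
    load_files(["a.md", "missing.md", "b.md"], fake_fetch, max_concurrent=2)
)
print(result["failed"])
# -> ['missing.md']
```

Collecting failures instead of raising lets one bad path be reported without discarding the files that did load.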
- Start the MCP Server:
  docs-kb start-mcp-server
- Configure MCP Client:
  - Use the MCP client in your AI assistant
  - Connect to the server at http://localhost:8000 (or custom host/port)
- Use with AI Assistants:
  - The server provides tools for querying ingested documentation
  - Browse and access any GitHub repository files
  - Perform natural language searches across documentation
The sync command intelligently compares your local knowledge base with the latest GitHub repository state:
- Discovery: Fetches current file list from GitHub
- Comparison: Compares file SHAs to detect changes
- Classification: Categorizes files as new, modified, deleted, or unchanged
- Selective Update: Only processes files that have actually changed
- Knowledge Base Update: Updates embeddings for changed content
- Metadata Update: Updates local database with new file information
Change Types:
- New Files: Added to knowledge base
- Modified Files: Content updated in knowledge base
- Deleted Files: Removed from knowledge base
- Unchanged Files: No action needed
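The comparison and classification steps can be pictured as a diff between two path-to-SHA maps. The following is a minimal sketch under that assumption. The function name `classify_changes`, the map shapes, and the sample SHAs are hypothetical and do not reflect how docs-kb stores its data.

```python
def classify_changes(
    local: dict[str, str], remote: dict[str, str]
) -> dict[str, list[str]]:
    """Compare path -> SHA maps and bucket each path by change type."""
    changes: dict[str, list[str]] = {
        "new": [],
        "modified": [],
        "deleted": [],
        "unchanged": [],
    }
    for path, sha in remote.items():
        if path not in local:
            changes["new"].append(path)
        elif local[path] != sha:
            changes["modified"].append(path)
        else:
            changes["unchanged"].append(path)
    # Anything tracked locally but no longer on GitHub has been deleted.
    changes["deleted"] = [path for path in local if path not in remote]
    return changes


# Hypothetical SHAs for demonstration.
local_state = {"docs/a.md": "111", "docs/b.md": "222", "docs/old.md": "333"}
remote_state = {"docs/a.md": "111", "docs/b.md": "999", "docs/new.md": "444"}
print(classify_changes(local_state, remote_state))
```

Only the `new` and `modified` buckets would need re-embedding, which is what makes a SHA-based sync cheaper than a full re-ingest.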
- Location: ~/.docs-kb/docs_kb.db (SQLite)
- Contents: Repository metadata, file tracking, sync history
- Schema: Defined in src/docs_kb/core/models.py
- Platform: MindsDB
- Embeddings: Ollama nomic-embed-text model
- Reranking: Ollama gemma2 model
- Content: Document chunks with metadata (repository, branch, path, SHA, etc.)
- MindsDB GitHub Clients: Optional GitHub database connections in MindsDB
- Purpose: Enhanced repository access and integration
- Creation: Use the --mindsdb-github-client flag during ingestion
- src/docs_kb/core/file_loader.py: GitHub file discovery and loading
- src/docs_kb/core/mindsdb_client.py: MindsDB knowledge base operations
- src/docs_kb/core/models.py: Database models and operations
- src/docs_kb/cli.py: Command-line interface
- src/docs_kb/commands/ingest.py: Repository ingestion logic
- src/docs_kb/commands/query.py: Interactive chat interface
- src/docs_kb/commands/sync.py: Smart synchronization
- src/docs_kb/commands/manage.py: Repository management
- src/docs_kb/commands/list.py: Repository listing
- src/docs_kb/commands/start_mcp.py: MCP server startup
- src/docs_kb/mcp_server/server.py: FastMCP server implementation
- Protocol: Model Context Protocol for AI assistant integration
- Tools: 6 comprehensive tools for documentation and repository access
# GitHub token for private repositories or higher rate limits
export GITHUB_TOKEN=your_github_personal_access_token
Ensure these models are available:
ollama list
# Should show:
# nomic-embed-text:latest
# gemma2:latest
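The same check can be scripted. The sketch below takes a list of installed model names (for example, the first column of `ollama list`) and reports which required models are missing. The helper `missing_models` and the sample list are hypothetical; only the two model names come from this document.

```python
REQUIRED_MODELS = ("nomic-embed-text", "gemma2")


def missing_models(
    installed: list[str], required: tuple[str, ...] = REQUIRED_MODELS
) -> list[str]:
    """Return required models that do not appear among the installed names."""
    # Strip the tag so 'gemma2:latest' matches 'gemma2'.
    base_names = {name.split(":", 1)[0] for name in installed}
    return [model for model in required if model not in base_names]


# Hypothetical installed list for demonstration.
print(missing_models(["nomic-embed-text:latest", "llama3:8b"]))
# -> ['gemma2']
```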
When using the --mindsdb-github-client flag, docs-kb creates a GitHub database connection in MindsDB:
CREATE DATABASE github_client
WITH ENGINE = 'github',
PARAMETERS = {
  "repository": "owner/repo",
  "branch": "main",
  "api_key": "your_github_token"
};
This enables MindsDB MCP for advanced repository analysis and querying when combined with docs-kb-mcp.
- Python 3.13+
- Ollama with nomic-embed-text and gemma2 models
- MindsDB (local or cloud)
- Optional: GitHub Personal Access Token
- For MCP: Compatible AI assistant (Claude Desktop, etc.)
MIT
For issues and questions:
- Check the troubleshooting section above
- Search existing issues
- Create a new issue with detailed information
Built with ❤️ using MindsDB, Ollama, FastMCP, and Python