Local AI URL Reader Assistant - Ollama

A Python-based documentation assistant that uses local LLMs to crawl websites, process content, and provide intelligent Q&A capabilities with source citations.

Overview

URL Reader Assistant is a powerful documentation analysis tool that combines web crawling capabilities with local Large Language Models (LLMs) through Ollama. The system efficiently processes web content and provides an interactive question-answering interface using Retrieval-Augmented Generation (RAG) technology.

The assistant specializes in crawling documentation websites, processing their content through local language models, and creating an intelligent knowledge base that users can query. By leveraging local LLM processing, it offers both privacy and cost-effectiveness while maintaining high-quality responses with source citations.

Core Features

Content Processing

  • Multi-threaded web crawling for efficient content gathering (see the sketch after this list)
  • Intelligent URL filtering and domain-specific content extraction
  • Automated content chunking and optimization
  • Vector database storage for efficient retrieval
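
As a rough illustration of how the crawling features above fit together, the following sketch combines a thread pool, same-domain URL filtering, and BeautifulSoup text extraction. It is a minimal approximation, not the project's actual code; the function names and defaults are assumptions.

# Minimal crawler sketch: multi-threaded fetching with same-domain filtering.
# crawl_site and its defaults are illustrative, not the project's actual API.
from concurrent.futures import ThreadPoolExecutor
from urllib.parse import urljoin, urlparse

import requests
from bs4 import BeautifulSoup

def fetch(url):
    """Download one page; return its text and the links it contains."""
    try:
        resp = requests.get(url, timeout=10)
        resp.raise_for_status()
    except requests.RequestException:
        return url, "", []
    soup = BeautifulSoup(resp.text, "html.parser")
    links = [urljoin(url, a["href"]) for a in soup.find_all("a", href=True)]
    return url, soup.get_text(" ", strip=True), links

def crawl_site(start_url, max_pages=50, workers=8):
    """Breadth-first crawl that never leaves the starting domain."""
    domain = urlparse(start_url).netloc
    seen, frontier, pages = {start_url}, [start_url], {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        while frontier and len(pages) < max_pages:
            batch, frontier = frontier[:workers], frontier[workers:]
            for url, text, links in pool.map(fetch, batch):
                if text:
                    pages[url] = text
                for link in links:
                    # Domain filter: follow only links on the original site.
                    if urlparse(link).netloc == domain and link not in seen:
                        seen.add(link)
                        frontier.append(link)
    return pages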

AI Capabilities

  • Local LLM processing using Ollama (see the sketch after this list)
  • Context-aware query processing
  • Source-cited responses
  • Conversation memory management
  • Interactive Q&A interface
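
Query answering follows the standard RAG pattern: embed the question, retrieve the closest chunks from the vector store, prepend recent conversation turns, and have the local model answer with its sources listed. Below is a minimal sketch using the langchain-ollama and Chroma packages from requirements.txt; the prompt wording, collection layout, and the "source" metadata key are assumptions, not the project's exact code.

# Minimal RAG sketch: retrieve chunks from Chroma, answer with a local
# Ollama model, and cite each chunk's source URL.
from collections import deque

from langchain_community.vectorstores import Chroma
from langchain_ollama import OllamaEmbeddings, OllamaLLM

embeddings = OllamaEmbeddings(model="llama3.2")
store = Chroma(persist_directory="./chroma_db", embedding_function=embeddings)
llm = OllamaLLM(model="llama3.2")
memory = deque(maxlen=5)  # mirrors the --memory-size default

def answer(question: str) -> str:
    docs = store.similarity_search(question, k=4)
    context = "\n\n".join(d.page_content for d in docs)
    history = "\n".join(f"Q: {q}\nA: {a}" for q, a in memory)
    prompt = (
        "Answer using only the context below and cite your sources.\n\n"
        f"Context:\n{context}\n\nHistory:\n{history}\n\nQuestion: {question}"
    )
    reply = llm.invoke(prompt)
    # "source" is the metadata key LangChain loaders commonly use; assumed here.
    sources = {d.metadata.get("source", "unknown") for d in docs}
    memory.append((question, reply))
    return f"{reply}\n\nSources:\n" + "\n".join(f"- {s}" for s in sources)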

System Features

  • Database inspection and management tools
  • Configurable crawling parameters
  • Command history navigation
  • Automatic cleanup and session management

Installation

Prerequisites

  1. Python 3.8 or higher
  2. Ollama installed and running locally
  3. Git for repository cloning
  4. Virtual environment (recommended)

Setup Process

# Clone the repository
git clone https://github.com/AIAfterDark/AI-URL-Read.git
cd AI-URL-Read

# Create and activate virtual environment
python -m venv venv
source venv/bin/activate  # On Windows use: venv\Scripts\activate

# Install required packages
pip install -r requirements.txt

# Ensure Ollama is running and pull required model
ollama pull llama3.2
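
Before running the assistant, you can verify that Ollama is reachable and the model was pulled by querying its local REST API (it listens on port 11434 by default):

# Sanity check: list the models Ollama has available locally.
import requests

resp = requests.get("http://localhost:11434/api/tags", timeout=5)
resp.raise_for_status()
models = [m["name"] for m in resp.json().get("models", [])]
print("Available models:", models)
if not any(name.startswith("llama3.2") for name in models):
    print("Model missing - run: ollama pull llama3.2")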

Required Dependencies

If the repository does not already include one, create a requirements.txt file containing:

langchain
langchain-community
langchain-ollama
beautifulsoup4
requests
chromadb
colorama

Usage

Basic Operation

The most straightforward way to use the URL Reader Assistant is:

python url-read.py https://example.com

Advanced Configuration

For more control over the processing:

python url-read.py https://example.com \
    --model llama3.2 \
    --max-pages 5 \
    --verbose \
    --save-db \
    --memory-size 5

Command Line Arguments

Argument         Description                        Default
url              Target URL to analyze (required)   None
--model          Ollama model name                  llama3.2
--max-pages      Maximum pages to crawl             50
--verbose        Enable detailed logging            False
--save-db        Save database snapshot             False
--memory-size    Recent interactions to remember    5
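
These flags map onto a straightforward argparse definition. The sketch below is a plausible reconstruction based on the table above, not the script's actual source:

# Sketch of an argparse setup matching the documented flags and defaults.
import argparse

parser = argparse.ArgumentParser(description="Local AI URL Reader Assistant")
parser.add_argument("url", help="Target URL to analyze")
parser.add_argument("--model", default="llama3.2", help="Ollama model name")
parser.add_argument("--max-pages", type=int, default=50, help="Maximum pages to crawl")
parser.add_argument("--verbose", action="store_true", help="Enable detailed logging")
parser.add_argument("--save-db", action="store_true", help="Save database snapshot")
parser.add_argument("--memory-size", type=int, default=5, help="Recent interactions to remember")
args = parser.parse_args()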

Interactive Commands

During the Q&A session, the following commands are available:

  • /quit - Exit the application
  • /db info - Display database information
  • /db inspect <id> - Inspect specific document chunk
  • /db save [filename] - Save database snapshot

Use arrow keys (↑↓) for command history navigation.
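
Importing Python's readline module is what typically provides arrow-key history, and the /db commands correspond closely to ChromaDB's client API. The loop below is a sketch of how such a session might be wired up; the collection name "docs" and the dispatch details are assumptions.

# Interactive loop sketch: readline enables arrow-key history on import;
# /db commands are served by a ChromaDB persistent client.
import readline  # noqa: F401 -- side effect: input() gains history keys

import chromadb

client = chromadb.PersistentClient(path="./chroma_db")
collection = client.get_or_create_collection("docs")  # assumed name

while True:
    line = input("Question: ").strip()
    if line == "/quit":
        break
    elif line == "/db info":
        print(f"Total documents: {collection.count()}")
        print("Database path: ./chroma_db")
    elif line.startswith("/db inspect "):
        chunk_id = line.split(maxsplit=2)[2]
        print(collection.get(ids=[chunk_id]))
    else:
        print(answer(line))  # answer() as sketched under AI Capabilities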

Example Session

$ python url-read.py https://docs.example.com

Database Information:
Total documents: 25
Database path: ./chroma_db

Article Overview:
[Generated content overview]

Documentation processed! Enter your questions (/quit to exit)

Question: What are the main features?

Answer:
[AI-generated response]

Sources:
- Documentation Home
  docs.example.com/home
- Features Page
  docs.example.com/features

Best Practices

Content Processing

  1. Start with a small number of pages for initial testing
  2. Enable verbose mode when debugging issues
  3. Use database snapshots for important content
  4. Verify source citations in responses

Model Selection

  1. Choose appropriate models based on content type
  2. Consider memory requirements for larger sites
  3. Balance between speed and accuracy needs

Query Optimization

  1. Ask specific, focused questions
  2. Utilize conversation context for follow-ups
  3. Review source citations for verification

Known Limitations

  1. Currently supports HTML content only
  2. Single domain processing per session
  3. Requires active Ollama installation
  4. Memory usage scales with content size

Troubleshooting

Common Issues

  1. Database Connection Errors

    • Verify ChromaDB installation
    • Check directory permissions
    • Ensure sufficient disk space
  2. Ollama Connection Issues

    • Confirm Ollama is running
    • Verify model availability
    • Check network connectivity
  3. Memory Problems

    • Reduce max pages parameter
    • Adjust chunk sizes
    • Increase available system memory
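
A short diagnostic script can tell these failure modes apart; the paths and model defaults below match the ones used earlier in this README:

# Diagnostics for the common issues above: database access and Ollama.
import os
import requests

# 1. Database: can ChromaDB open (or create) its persistence directory?
try:
    import chromadb
    chromadb.PersistentClient(path="./chroma_db")
    print("ChromaDB OK; path writable:", os.access("./chroma_db", os.W_OK))
except Exception as exc:
    print("ChromaDB problem:", exc)

# 2. Ollama: is the server running and is the model available?
try:
    tags = requests.get("http://localhost:11434/api/tags", timeout=5).json()
    print("Ollama models:", [m["name"] for m in tags.get("models", [])])
except requests.ConnectionError:
    print("Ollama not reachable; make sure the Ollama service is running")

# 3. Memory: if crawls exhaust RAM, re-run with a smaller --max-pages.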

Contributing

We welcome contributions to the URL Reader Assistant project. Please follow these steps:

  1. Fork the repository
  2. Create a feature branch
  3. Commit your changes
  4. Push to your branch
  5. Create a Pull Request

For detailed contribution guidelines, see CONTRIBUTING.md.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments

  • LangChain team for the foundational framework
  • Ollama project for local LLM capabilities
  • ChromaDB for vector storage solutions

Contact

For issues and feature requests, please use the GitHub issue tracker.
