shcherbak-ai
diff --git a/.github/workflows/docs.yml b/.github/workflows/docs.yml
Lines changed: 3 additions & 3 deletions
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 27 additions & 4 deletions
diff --git a/NOTICE b/NOTICE
Lines changed: 1 addition & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 34 additions & 32 deletions
diff --git a/contextgem/internal/base/llms.py b/contextgem/internal/base/llms.py
Lines changed: 4 additions & 4 deletions
@@ -40,15 +40,15 @@ jobs:
       - name: Build documentation
         run: |
           cd docs
-          uv run sphinx-build -b html source _build/html -v -E -W
+          uv run sphinx-build -b dirhtml source _build/dirhtml -v -E -W
       
       - name: Create .nojekyll file
-        run: touch docs/_build/html/.nojekyll
+        run: touch docs/_build/dirhtml/.nojekyll
 
       - name: Upload artifact
         uses: actions/upload-pages-artifact@v3
         with:
-          path: ./docs/_build/html
+          path: ./docs/_build/dirhtml
 
   deploy:
     environment:
 
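The switch from the `html` builder to `dirhtml` in the workflow above changes where each page lands in the build directory, which is what makes the extension-free URLs possible. A minimal sketch of that mapping follows. The helper names `output_path` and `public_url` are hypothetical and not part of this change; they only mirror the documented behaviour of the two Sphinx builders.

```python
def output_path(docname: str, builder: str = "dirhtml") -> str:
    """Return the output file, relative to the build dir, for a Sphinx docname.

    Illustrative helper only; Sphinx computes this internally.
    """
    if builder == "html":
        # html builder: one flat file per document
        return f"{docname}.html"
    # dirhtml builder: index documents keep their name,
    # every other document gets its own directory with an index.html
    if docname == "index" or docname.endswith("/index"):
        return f"{docname}.html"
    return f"{docname}/index.html"


def public_url(docname: str, base: str = "https://contextgem.dev") -> str:
    """Return the pretty URL a web server exposes for a dirhtml page."""
    path = output_path(docname, "dirhtml")
    # Servers resolve a directory URL to its index.html, so drop the file name
    return f"{base}/{path[: -len('index.html')]}"
```

For example, `output_path("quickstart", "html")` gives `quickstart.html`, while `public_url("llms/llm_config")` gives `https://contextgem.dev/llms/llm_config/`.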
@@ -484,19 +484,42 @@ The log output will show detailed information about test execution.
 
 ### 🏗️ Building the Documentation
 
-Navigate to the `docs/` directory and run:
+Navigate to the `docs/` directory and choose your preferred build method:
+
+#### For Live Development (Recommended)
+
+Use `sphinx-autobuild` for live reloading during development:
+
+```bash
+# Live rebuild with auto-refresh on file changes
+make livehtml
+# Or on Windows: ./make.bat livehtml
+```
+
+This starts a development server on `http://localhost:9000` with:
+- Automatic rebuilds when files change
+- Browser auto-refresh
+- Pretty URLs without `.html` extensions
+
+#### For Static Builds
+
+For one-time builds or CI-style building:
 
 ```bash
 # Build with verbose output, ignore cache, and treat warnings as errors 
 # (recommended for structural changes)
-uv run sphinx-build -b html source build/html -v -E -W
+uv run sphinx-build -b dirhtml source build/dirhtml -v -E -W
 ```
 
-The `-E` flag ensures Sphinx completely rebuilds the environment, which is especially important after structural changes like modifying toctree directives or removing files.
+The `-E` flag ensures Sphinx completely rebuilds the environment, which is especially important after structural changes like modifying toctree directives or removing files. The `dirhtml` format creates pretty URLs without `.html` extensions, consistent with the live development server.
 
 ### 👀 Viewing the Documentation
 
-After building, open `build/html/index.html` in your web browser to view the documentation.
+**With Live Development:**
+The documentation automatically opens at `http://localhost:9000` when using `make livehtml`.
+
+**With Static Builds:**
+After building, open `build/dirhtml/index.html` in your web browser to view the documentation.
 
 ### 🌐 Live Documentation
 
 
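When a static `dirhtml` build is opened directly from disk, links that point at directories may show a file listing instead of the page, because only a web server maps a directory URL to its `index.html`. One option is to serve the build directory locally. The sketch below uses only the Python standard library. The function name `serve_docs` and port 8000 are assumptions chosen for illustration and are not part of the project's tooling.

```python
import functools
import http.server
import threading


def serve_docs(directory: str, port: int = 8000) -> http.server.ThreadingHTTPServer:
    """Serve a built docs directory on localhost in a background thread."""
    # Bind the handler to the build directory instead of the current directory
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    # Daemon thread so the server does not keep the interpreter alive
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `server = serve_docs("build/dirhtml")` from the `docs/` directory would make the site available at `http://127.0.0.1:8000/`; `server.shutdown()` stops it.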
@@ -54,6 +54,7 @@ Development Dependencies:
 - python-dotenv: Environment variable management
 - ruff: Fast Python linter and formatter
 - sphinx: Documentation generator
+- sphinx-autobuild: Live-reloading docs builder for Sphinx 
 - sphinx-autodoc-typehints: Type annotation support for Sphinx
 - sphinx-book-theme: Book-like theme for Sphinx
 - sphinx-copybutton: Adds copy button to code blocks in Sphinx docs
 
@@ -25,7 +25,7 @@ Most popular LLM frameworks for extracting structured data from documents requir
 
 ContextGem addresses this challenge by providing a flexible, intuitive framework that extracts structured data and insights from documents with minimal effort. The complex, most time-consuming parts are handled with **powerful abstractions**, eliminating boilerplate code and reducing development overhead.
 
-📖 Read more on the project [motivation](https://contextgem.dev/motivation.html) in the documentation.
+📖 Read more on the project [motivation](https://contextgem.dev/motivation/) in the documentation.
 
 
 ## ⭐ Key features
@@ -151,15 +151,15 @@ ContextGem addresses this challenge by providing a flexible, intuitive framework
 🟡 - partially supported - requires additional setup<br>
 ◯ - not supported - requires custom logic
 
-\* See [descriptions](https://contextgem.dev/motivation.html#the-contextgem-solution) of ContextGem abstractions and [comparisons](https://contextgem.dev/vs_other_frameworks.html) of specific implementation examples using ContextGem and other popular open-source LLM frameworks.
+\* See [descriptions](https://contextgem.dev/motivation/#the-contextgem-solution) of ContextGem abstractions and [comparisons](https://contextgem.dev/vs_other_frameworks/) of specific implementation examples using ContextGem and other popular open-source LLM frameworks.
 
 ## 💡 What you can build
 
 With **minimal code**, you can:
 
 - **Extract structured data** from documents (text, images)
-- **Identify and analyze key aspects** (topics, themes, categories) within documents ([learn more](https://contextgem.dev/aspects/aspects.html))
-- **Extract specific concepts** (entities, facts, conclusions, assessments) from documents ([learn more](https://contextgem.dev/concepts/supported_concepts.html))
+- **Identify and analyze key aspects** (topics, themes, categories) within documents ([learn more](https://contextgem.dev/aspects/aspects/))
+- **Extract specific concepts** (entities, facts, conclusions, assessments) from documents ([learn more](https://contextgem.dev/concepts/supported_concepts/))
 - **Build complex extraction workflows** through a simple, intuitive API
 - **Create multi-level extraction pipelines** (aspects containing concepts, hierarchical aspects)
 
@@ -263,7 +263,7 @@ for item in anomalies_concept.extracted_items:
 </thead>
 <tbody>
 <tr>
-<td>Create a Document that contains text and/or visual content representing your document (contract, invoice, report, CV, etc.), from which an LLM extracts information (aspects and/or concepts). <a href="https://contextgem.dev/documents/document_config.html">Learn more</a></td>
+<td>Create a Document that contains text and/or visual content representing your document (contract, invoice, report, CV, etc.), from which an LLM extracts information (aspects and/or concepts). <a href="https://contextgem.dev/documents/document_config/">Learn more</a></td>
 </tr>
 </tbody>
 </table>
@@ -283,8 +283,8 @@ document = Document(raw_text="Non-Disclosure Agreement...")
 </thead>
 <tbody>
 <tr>
-<td>Define Aspects to extract text segments from the document (sections, topics, themes). You can organize content hierarchically and combine with concepts for comprehensive analysis. <a href="https://contextgem.dev/aspects/aspects.html">Learn more</a></td>
-<td>Define Concepts to extract specific data points with intelligent inference: entities, insights, structured objects, classifications, numerical calculations, dates, ratings, and assessments. <a href="https://contextgem.dev/concepts/supported_concepts.html">Learn more</a></td>
+<td>Define Aspects to extract text segments from the document (sections, topics, themes). You can organize content hierarchically and combine with concepts for comprehensive analysis. <a href="https://contextgem.dev/aspects/aspects/">Learn more</a></td>
+<td>Define Concepts to extract specific data points with intelligent inference: entities, insights, structured objects, classifications, numerical calculations, dates, ratings, and assessments. <a href="https://contextgem.dev/concepts/supported_concepts/">Learn more</a></td>
 </tr>
 </tbody>
 </table>
@@ -313,7 +313,7 @@ document.add_concepts([concept])
 </thead>
 <tbody>
 <tr>
-<td>Create a reusable collection of predefined aspects and concepts that enables consistent extraction across multiple documents. <a href="https://contextgem.dev/pipelines/extraction_pipelines.html">Learn more</a></td>
+<td>Create a reusable collection of predefined aspects and concepts that enables consistent extraction across multiple documents. <a href="https://contextgem.dev/pipelines/extraction_pipelines/">Learn more</a></td>
 </tr>
 </tbody>
 </table>
@@ -329,8 +329,8 @@ document.add_concepts([concept])
 </thead>
 <tbody>
 <tr>
-<td>Configure a cloud or local LLM that will extract aspects and/or concepts from the document. DocumentLLM supports fallback models and role-based task routing for optimal performance. <a href="https://contextgem.dev/llms/llm_extraction_methods.html">Learn more</a></td>
-<td>Configure a group of LLMs with unique roles for complex extraction workflows. You can route different aspects and/or concepts to specialized LLMs (e.g., simple extraction vs. reasoning tasks). <a href="https://contextgem.dev/llms/llm_config.html#llm-groups">Learn more</a></td>
+<td>Configure a cloud or local LLM that will extract aspects and/or concepts from the document. DocumentLLM supports fallback models and role-based task routing for optimal performance. <a href="https://contextgem.dev/llms/llm_extraction_methods/">Learn more</a></td>
+<td>Configure a group of LLMs with unique roles for complex extraction workflows. You can route different aspects and/or concepts to specialized LLMs (e.g., simple extraction vs. reasoning tasks). <a href="https://contextgem.dev/llms/llm_config/#llm-groups">Learn more</a></td>
 </tr>
 </tbody>
 </table>
@@ -345,22 +345,22 @@ document = llm.extract_all(document)
 # print(document.concepts[0].extracted_items)
 ```
 
-📖 Learn more about ContextGem's [core components](https://contextgem.dev/how_it_works.html) and their practical examples in the documentation.
+📖 Learn more about ContextGem's [core components](https://contextgem.dev/how_it_works/) and their practical examples in the documentation.
 
 ## 📚 Usage Examples
 
 🌟 **Basic usage:**
-- [Aspect Extraction from Document](https://contextgem.dev/quickstart.html#aspect-extraction-from-document)
-- [Extracting Aspect with Sub-Aspects](https://contextgem.dev/quickstart.html#extracting-aspect-with-sub-aspects)
-- [Concept Extraction from Aspect](https://contextgem.dev/quickstart.html#concept-extraction-from-aspect)
-- [Concept Extraction from Document (text)](https://contextgem.dev/quickstart.html#concept-extraction-from-document-text)
-- [Concept Extraction from Document (vision)](https://contextgem.dev/quickstart.html#concept-extraction-from-document-vision)
-- [LLM chat interface](https://contextgem.dev/quickstart.html#lightweight-llm-chat-interface)
+- [Aspect Extraction from Document](https://contextgem.dev/quickstart/#aspect-extraction-from-document)
+- [Extracting Aspect with Sub-Aspects](https://contextgem.dev/quickstart/#extracting-aspect-with-sub-aspects)
+- [Concept Extraction from Aspect](https://contextgem.dev/quickstart/#concept-extraction-from-aspect)
+- [Concept Extraction from Document (text)](https://contextgem.dev/quickstart/#concept-extraction-from-document-text)
+- [Concept Extraction from Document (vision)](https://contextgem.dev/quickstart/#concept-extraction-from-document-vision)
+- [LLM chat interface](https://contextgem.dev/quickstart/#lightweight-llm-chat-interface)
 
 🚀 **Advanced usage:**
-- [Extracting Aspects Containing Concepts](https://contextgem.dev/advanced_usage.html#extracting-aspects-with-concepts)
-- [Extracting Aspects and Concepts from a Document](https://contextgem.dev/advanced_usage.html#extracting-aspects-and-concepts-from-a-document)
-- [Using a Multi-LLM Pipeline to Extract Data from Several Documents](https://contextgem.dev/advanced_usage.html#using-a-multi-llm-pipeline-to-extract-data-from-several-documents)
+- [Extracting Aspects Containing Concepts](https://contextgem.dev/advanced_usage/#extracting-aspects-with-concepts)
+- [Extracting Aspects and Concepts from a Document](https://contextgem.dev/advanced_usage/#extracting-aspects-and-concepts-from-a-document)
+- [Using a Multi-LLM Pipeline to Extract Data from Several Documents](https://contextgem.dev/advanced_usage/#using-a-multi-llm-pipeline-to-extract-data-from-several-documents)
 
 
 ## 🔄 Document converters
@@ -405,14 +405,14 @@ docx_text = converter.convert_to_text_format(
 
 ```
 
-📖 Learn more about [DOCX converter features](https://contextgem.dev/converters/docx.html) in the documentation.
+📖 Learn more about [DOCX converter features](https://contextgem.dev/converters/docx/) in the documentation.
 
 
 ## 🎯 Focused document analysis
 
 ContextGem leverages LLMs' long context windows to deliver superior extraction accuracy from individual documents. Unlike RAG approaches that often [struggle with complex concepts and nuanced insights](https://www.linkedin.com/pulse/raging-contracts-pitfalls-rag-contract-review-shcherbak-ai-ptg3f), ContextGem capitalizes on continuously expanding context capacity, evolving LLM capabilities, and decreasing costs. This focused approach enables direct information extraction from complete documents, eliminating retrieval inconsistencies while optimizing for in-depth single-document analysis. While this delivers higher accuracy for individual documents, ContextGem does not currently support cross-document querying or corpus-wide retrieval - for these use cases, modern RAG frameworks (e.g., LlamaIndex, Haystack) remain more appropriate.
 
-📖 Read more on [how ContextGem works](https://contextgem.dev/how_it_works.html) in the documentation.
+📖 Read more on [how ContextGem works](https://contextgem.dev/how_it_works/) in the documentation.
 
 ## 🤖 Supported LLMs
 
@@ -422,20 +422,20 @@ ContextGem supports both cloud-based and local LLMs through [LiteLLM](https://gi
 - **Model Architectures**: Works with both reasoning/CoT-capable (e.g. gpt-5) and non-reasoning models (e.g. gpt-4.1)
 - **Simple API**: Unified interface for all LLMs with easy provider switching
 
-> **💡 Model Selection Note:** For reliable structured extraction, we recommend using models with performance equivalent to or exceeding `gpt-4o-mini`. Smaller models (such as 8B parameter models) may struggle with ContextGem's detailed extraction instructions. If you encounter issues with smaller models, see our [troubleshooting guide](https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting.html) for potential solutions.
+> **💡 Model Selection Note:** For reliable structured extraction, we recommend using models with performance equivalent to or exceeding `gpt-4o-mini`. Smaller models (such as 8B parameter models) may struggle with ContextGem's detailed extraction instructions. If you encounter issues with smaller models, see our [troubleshooting guide](https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting/) for potential solutions.
 
-📖 Learn more about [supported LLM providers and models](https://contextgem.dev/llms/supported_llms.html), how to [configure LLMs](https://contextgem.dev/llms/llm_config.html), and [LLM extraction methods](https://contextgem.dev/llms/llm_extraction_methods.html) in the documentation.
+📖 Learn more about [supported LLM providers and models](https://contextgem.dev/llms/supported_llms/), how to [configure LLMs](https://contextgem.dev/llms/llm_config/), and [LLM extraction methods](https://contextgem.dev/llms/llm_extraction_methods/) in the documentation.
 
 ## ⚡ Optimizations
 
 ContextGem documentation offers guidance on optimization strategies to maximize performance, minimize costs, and enhance extraction accuracy:
 
-- [Optimizing for Accuracy](https://contextgem.dev/optimizations/optimization_accuracy.html)
-- [Optimizing for Speed](https://contextgem.dev/optimizations/optimization_speed.html)
-- [Optimizing for Cost](https://contextgem.dev/optimizations/optimization_cost.html)
-- [Dealing with Long Documents](https://contextgem.dev/optimizations/optimization_long_docs.html)
-- [Choosing the Right LLM(s)](https://contextgem.dev/optimizations/optimization_choosing_llm.html)
-- [Troubleshooting Issues with Small Models](https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting.html)
+- [Optimizing for Accuracy](https://contextgem.dev/optimizations/optimization_accuracy/)
+- [Optimizing for Speed](https://contextgem.dev/optimizations/optimization_speed/)
+- [Optimizing for Cost](https://contextgem.dev/optimizations/optimization_cost/)
+- [Dealing with Long Documents](https://contextgem.dev/optimizations/optimization_long_docs/)
+- [Choosing the Right LLM(s)](https://contextgem.dev/optimizations/optimization_choosing_llm/)
+- [Troubleshooting Issues with Small Models](https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting/)
 
 
 ## 💾 Serializing results
@@ -446,14 +446,16 @@ ContextGem allows you to save and load Document objects, pipelines, and LLM conf
 - Transfer extraction results between systems
 - Persist pipeline and LLM configurations for later reuse
 
-📖 Learn more about [serialization options](https://contextgem.dev/serialization.html) in the documentation.
+📖 Learn more about [serialization options](https://contextgem.dev/serialization/) in the documentation.
 
 
 ## 📚 Documentation
 
 📖 **Full documentation:** [contextgem.dev](https://contextgem.dev)
 
-📄 **Raw documentation for LLMs:** Available at [`docs/docs-raw-for-llm.txt`](https://github.com/shcherbak-ai/contextgem/blob/main/docs/docs-raw-for-llm.txt) - automatically generated, optimized for LLM ingestion.
+> **⚠️ Official Documentation Notice:** [https://contextgem.dev/](https://contextgem.dev/) is the only official source of ContextGem documentation. Please be aware of unauthorized copies or mirrors that may contain outdated or incorrect information.
+
+📄 **Raw documentation for LLMs:** Available at [`docs/source/llms.txt`](https://github.com/shcherbak-ai/contextgem/blob/main/docs/source/llms.txt) - automatically generated, optimized for LLM ingestion.
 
 🤖 **AI-powered code exploration:** [DeepWiki](https://deepwiki.com/shcherbak-ai/contextgem) provides visual architecture maps and natural language Q&A for the codebase.
 
 
@@ -3186,7 +3186,7 @@ def _post_init(self, __context: Any):
             logger.info(
                 "Using local model provider. If you experience issues like JSON validation errors "
                 "with smaller models, see our troubleshooting guide: "
-                "https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting.html"
+                "https://contextgem.dev/optimizations/optimization_small_llm_troubleshooting/"
             )
 
         # Recommend `ollama_chat` prefix for better responses for Ollama models (text-only processing)
@@ -4008,7 +4008,7 @@ def _validate_document_llm_post(self) -> Self:
                 f"while the model is reasoning-capable. If you intend to route reasoning tasks "
                 f"to this model, consider using a `reasoner_*` role to match aspect/concept `llm_role` "
                 f"and keep pipeline roles consistent. See "
-                f"https://contextgem.dev/optimizations/optimization_choosing_llm.html",
+                f"https://contextgem.dev/optimizations/optimization_choosing_llm/",
                 stacklevel=2,
             )
 
@@ -4086,7 +4086,7 @@ def _validate_input_tokens(self, messages: list[dict[str, str]]) -> None:
                     f"(for text) or `max_images_to_analyze_per_call` (for images) to process the "
                     f"document in smaller chunks. "
                     f"See the optimization guide for long documents: "
-                    f"https://contextgem.dev/optimizations/optimization_long_docs.html"
+                    f"https://contextgem.dev/optimizations/optimization_long_docs/"
                 )
 
             logger.debug(
@@ -4153,7 +4153,7 @@ def _validate_output_tokens(self) -> None:
                     f"(for text) or `max_images_to_analyze_per_call` (for images) to process the "
                     f"document in smaller chunks. "
                     f"See the optimization guide for long documents: "
-                    f"https://contextgem.dev/optimizations/optimization_long_docs.html"
+                    f"https://contextgem.dev/optimizations/optimization_long_docs/"
                 )
 
             logger.debug(
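
The link updates in this diff all follow one pattern: a `contextgem.dev` URL ending in `.html` becomes a directory-style URL, with any `#fragment` preserved. A minimal sketch of that rewrite is shown below. The regex and the function name `to_pretty_url` are hypothetical; nothing in the diff indicates how the links were actually updated.

```python
import re

# Matches documentation links such as https://contextgem.dev/llms/llm_config.html
DOCS_LINK = re.compile(r"(https://contextgem\.dev/)([\w/-]+)\.html\b")


def to_pretty_url(text: str) -> str:
    """Rewrite `.html` documentation links in text to dirhtml-style URLs."""

    def _swap(match: re.Match) -> str:
        base, docname = match.group(1), match.group(2)
        if docname == "index":
            # The root index page is served at the site root
            return base
        if docname.endswith("/index"):
            # A nested index page is served at its directory
            docname = docname[: -len("/index")]
        return f"{base}{docname}/"

    return DOCS_LINK.sub(_swap, text)
```

Fragments and surrounding Markdown are left untouched, so `https://contextgem.dev/motivation.html#the-contextgem-solution` becomes `https://contextgem.dev/motivation/#the-contextgem-solution`, and links to other domains are not modified.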