shcherbak-ai
diff --git a/.gitignore b/.gitignore
Lines changed: 2 additions & 0 deletions
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
Lines changed: 16 additions & 0 deletions
diff --git a/CITATION.cff b/CITATION.cff
Lines changed: 1 addition & 1 deletion
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
Lines changed: 18 additions & 3 deletions
diff --git a/NOTICE b/NOTICE
Lines changed: 2 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 84 additions & 8 deletions
diff --git a/contextgem/__init__.py b/contextgem/__init__.py
Lines changed: 1 addition & 1 deletion
@@ -7,8 +7,10 @@ env
 venv
 .venv
 .coverage
+.cz.msg
 
 notebooks
+!dev/notebooks
 docs/build
 dist
 .DS_Store
@@ -1,4 +1,11 @@
 repos:
+
+  # Commitizen hook for conventional commits
+  - repo: https://github.com/commitizen-tools/commitizen
+    rev: v4.5.1
+    hooks:
+      - id: commitizen
+        stages: [commit-msg]
 
   # Custom local hooks
   - repo: local
@@ -58,3 +65,12 @@ repos:
         pass_filenames: false
         always_run: true
         stages: [pre-commit]
+
+      # Generate example notebooks
+      - id: generate-notebooks
+        name: Generate example notebooks
+        entry: python dev/generate_notebooks.py
+        language: system
+        pass_filenames: false
+        always_run: true
+        stages: [pre-commit]
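The `generate-notebooks` hook above runs `dev/generate_notebooks.py`, whose contents are not part of this diff. As a rough illustration only, a generator of this kind might pull fenced Python samples out of Markdown and wrap each one in a notebook. The sketch below uses only the standard library and hand-builds the nbformat 4 JSON structure; the function names (`extract_python_blocks`, `build_notebook`, `notebook_to_json`) are hypothetical and are not taken from the project.

```python
import json
import re

# Build the fence marker programmatically so this snippet can itself live
# inside a fenced block without confusing Markdown renderers.
FENCE = "`" * 3
PYTHON_BLOCK_RE = re.compile(
    re.escape(FENCE) + r"python\n(.*?)" + re.escape(FENCE), re.DOTALL
)


def extract_python_blocks(markdown: str) -> list[str]:
    """Return the body of every fenced Python block in a Markdown string."""
    return [body.strip() for body in PYTHON_BLOCK_RE.findall(markdown)]


def build_notebook(code: str, title: str) -> dict:
    """Wrap one code sample in a minimal nbformat 4 notebook structure."""
    return {
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": f"# {title}"},
            {
                "cell_type": "code",
                "metadata": {},
                "execution_count": None,
                "outputs": [],
                "source": code,
            },
        ],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


def notebook_to_json(notebook: dict) -> str:
    """Serialize the notebook dict to the JSON text stored in an .ipynb file."""
    return json.dumps(notebook, indent=1, ensure_ascii=False)
```

Writing the string returned by `notebook_to_json` to a path such as `dev/notebooks/readme/quickstart_aspect.ipynb` would produce a file that the Colab badges in the README could open. The actual script may well rely on the `nbformat` package added to the development dependencies instead of raw JSON.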
@@ -5,5 +5,5 @@ authors:
     given-names: Sergii
     email: sergii@shcherbak.ai
 title: "ContextGem: Easier and faster way to build LLM extraction workflows through powerful abstractions"
-date-released: 2024-04-02
+date-released: 2025-04-02
 url: "https://github.com/shcherbak-ai/contextgem"
@@ -53,7 +53,11 @@ To sign the agreement:
 
 3. **Install pre-commit hooks**:
     ```bash
+    # Install pre-commit hooks
     pre-commit install
+
+    # Install commit-msg hooks (for commitizen)
+    pre-commit install --hook-type commit-msg
     ```
 
 
@@ -102,12 +106,23 @@ To sign the agreement:
 
    Please note that we use pytest-vcr to record and replay LLM API interactions. Your changes may require re-recording VCR cassettes for the tests. See [VCR Cassette Management](#vcr-cassette-management) section below for details.
 
-4. **Commit your changes** with a descriptive commit message:
+4. **Commit your changes** using the Conventional Commits format:
 
-   For example:
+   We use the [Conventional Commits](https://www.conventionalcommits.org/) format for commit messages, which enables automatic changelog generation and semantic versioning. Instead of running `git commit` directly, please use commitizen:
 
    ```bash
-   git commit -m "Add feature: description of your changes"
+   poetry run cz commit
+   ```
+
+   This will guide you through an interactive prompt to create a properly formatted commit message with:
+   - Type of change (feat, fix, docs, style, refactor, etc.)
+   - Optional scope (e.g., api, cli, docs)
+   - Short description
+   - Optional longer description and breaking change notes
+
+   Example of a resulting commit message:
+   ```
+   docs(readme): update installation instructions
    ```
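For readers unfamiliar with the header shape that the commit-msg hook enforces, the following is a minimal sketch of a Conventional Commits header check. It is not how commitizen validates messages internally; the regular expression, the `ALLOWED_TYPES` tuple, and the `parse_commit_header` helper are all assumptions made for illustration.

```python
import re

# Types named in the contributing guide plus a few commonly used ones.
# The exact set a project accepts is configurable, so treat this as an assumption.
ALLOWED_TYPES = (
    "feat", "fix", "docs", "style", "refactor",
    "perf", "test", "build", "ci", "chore", "revert",
)

# type, optional "(scope)", optional "!" for breaking changes, then ": description"
HEADER_RE = re.compile(
    r"^(?P<type>[a-z]+)"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r"(?P<breaking>!)?"
    r": (?P<description>\S.*)$"
)


def parse_commit_header(message: str) -> dict:
    """Parse the first line of a commit message or raise ValueError."""
    lines = message.splitlines()
    header = lines[0] if lines else ""
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError(f"not a Conventional Commits header: {header!r}")
    if match["type"] not in ALLOWED_TYPES:
        raise ValueError(f"unknown commit type: {match['type']!r}")
    return {
        "type": match["type"],
        "scope": match["scope"],
        "breaking": match["breaking"] is not None,
        "description": match["description"],
    }
```

Under these assumptions, `parse_commit_header("docs(readme): update installation instructions")` yields type `docs` and scope `readme`, while the old-style message `Add feature: description of your changes` is rejected because it does not start with a lowercase type.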
 
 
 
@@ -35,8 +35,10 @@ Core Dependencies:
 
 Development Dependencies:
 - black: Code formatting
+- commitizen: Conventional commit tool and release management
 - coverage: Test coverage measurement
 - isort: Sorting imports
+- nbformat: Notebook format utilities
 - pip-tools: Dependency management
 - pre-commit: Pre-commit hooks
 - pytest: Testing framework
 
@@ -28,9 +28,8 @@ ContextGem addresses this challenge by providing a flexible, intuitive framework
 Read more on the project [motivation](https://contextgem.dev/motivation.html) in the documentation.
 
 
-## 💡 What can you do with ContextGem?
+## 💡 With ContextGem, you can:
 
-With ContextGem, you can:
 - **Extract structured data** from documents (text, images) with minimal code
 - **Identify and analyze key aspects** (topics, themes, categories) within documents
 - **Extract specific concepts** (entities, facts, conclusions, assessments) from documents
@@ -173,15 +172,74 @@ pip install -U contextgem
 
 ## 🚀 Quick start
 
+### Aspect extraction
+
+An aspect is a defined area or topic within a document (or within another aspect). Each aspect reflects a specific subject or theme.
+
+```python
+# Quick Start Example - Extracting payment terms from a document
+
+import os
+
+from contextgem import Aspect, Document, DocumentLLM
+
+# Sample document text (shortened for brevity)
+doc = Document(
+    raw_text=(
+        "SERVICE AGREEMENT\n"
+        "SERVICES. Provider agrees to provide the following services to Client: "
+        "Cloud-based data analytics platform access and maintenance...\n"
+        "PAYMENT. Client agrees to pay $5,000 per month for the services. "
+        "Payment is due on the 1st of each month. Late payments will incur a 2% fee per month...\n"
+        "CONFIDENTIALITY. Both parties agree to keep all proprietary information confidential "
+        "for a period of 5 years following termination of this Agreement..."
+    ),
+)
+
+# Define the aspects to extract
+doc.aspects = [
+    Aspect(
+        name="Payment Terms",
+        description="Payment terms and conditions in the contract",
+        # see the docs for more configuration options, e.g. sub-aspects, concepts, etc.
+    ),
+    # Add more aspects as needed
+]
+# Or use `doc.add_aspects([...])`
+
+# Define an LLM for extracting information from the document
+llm = DocumentLLM(
+    model="openai/gpt-4o-mini",  # or any other LLM from e.g. Anthropic, etc.
+    api_key=os.environ.get(
+        "CONTEXTGEM_OPENAI_API_KEY"
+    ),  # your API key for the LLM provider, e.g. OpenAI, Anthropic, etc.
+    # see the docs for more configuration options
+)
+
+# Extract information from the document
+doc = llm.extract_all(doc)  # or use async version `await llm.extract_all_async(doc)`
+
+# Access extracted information in the document object
+for item in doc.aspects[0].extracted_items:
+    print(f"• {item.value}")
+# or `doc.get_aspect_by_name("Payment Terms").extracted_items`
+
+```
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shcherbak-ai/contextgem/blob/main/dev/notebooks/readme/quickstart_aspect.ipynb)
+
+
+### Concept extraction
+
+A concept is a unit of information or an entity, derived from an aspect or the broader document context.
+
 ```python
 # Quick Start Example - Extracting anomalies from a document, with source references and justifications
 
 import os
 
 from contextgem import Document, DocumentLLM, StringConcept
 
-# Example document instance
-# Document content is shortened for brevity
+# Sample document text (shortened for brevity)
 doc = Document(
     raw_text=(
         "Consultancy Agreement\n"
@@ -203,13 +261,14 @@ doc.concepts = [
         reference_depth="sentences",
         add_justifications=True,
         justification_depth="brief",
+        # see the docs for more configuration options
     )
     # add more concepts to the document, if needed
     # see the docs for available concepts: StringConcept, JsonObjectConcept, etc.
 ]
-# Or use doc.add_concepts([...])
+# Or use `doc.add_concepts([...])`
 
-# Create an LLM for extracting data and insights from the document
+# Define an LLM for extracting information from the document
 llm = DocumentLLM(
     model="openai/gpt-4o-mini",  # or any other LLM from e.g. Anthropic, etc.
     api_key=os.environ.get(
@@ -219,15 +278,18 @@ llm = DocumentLLM(
 )
 
 # Extract information from the document
-doc = llm.extract_all(doc)  # or use async version llm.extract_all_async(doc)
+doc = llm.extract_all(doc)  # or use async version `await llm.extract_all_async(doc)`
 
 # Access extracted information in the document object
 print(
     doc.concepts[0].extracted_items
 )  # extracted items with references & justifications
-# or doc.get_concept_by_name("Anomalies").extracted_items
+# or `doc.get_concept_by_name("Anomalies").extracted_items`
 
 ```
+[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/shcherbak-ai/contextgem/blob/main/dev/notebooks/readme/quickstart_concept.ipynb)
+
+---
 
 See more examples in the documentation:
 
@@ -305,6 +367,20 @@ This project is automatically scanned for security vulnerabilities using [CodeQL
 See [SECURITY](https://github.com/shcherbak-ai/contextgem/blob/main/SECURITY.md) file for details.
 
 
+## 🙏 Acknowledgements
+
+ContextGem relies on these excellent open-source packages:
+
+- [pydantic](https://github.com/pydantic/pydantic): The gold standard for data validation
+- [Jinja2](https://github.com/pallets/jinja): Fast, expressive template engine that powers our dynamic prompt rendering
+- [litellm](https://github.com/BerriAI/litellm): Unified interface to multiple LLM providers with seamless provider switching
+- [wtpsplit](https://github.com/segment-any-text/wtpsplit): State-of-the-art text segmentation tool
+- [loguru](https://github.com/Delgan/loguru): Simple yet powerful logging that enhances debugging and observability
+- [python-ulid](https://github.com/mdomke/python-ulid): Efficient ULID generation
+- [PyTorch](https://github.com/pytorch/pytorch): Industry-standard machine learning framework
+- [aiolimiter](https://github.com/mjpieters/aiolimiter): Powerful rate limiting for async operations
+
+
 ## 📄 License & Contact
 
 This project is licensed under the Apache 2.0 License - see the [LICENSE](https://github.com/shcherbak-ai/contextgem/blob/main/LICENSE) and [NOTICE](https://github.com/shcherbak-ai/contextgem/blob/main/NOTICE) files for details.
 
@@ -20,7 +20,7 @@
 ContextGem - Easier and faster way to build LLM extraction workflows through powerful abstractions
 """
 
-__version__ = "0.1.1"
+__version__ = "0.1.1.post1"
 __author__ = "Shcherbak AI AS"
 
 from contextgem.public import (