HappyMatrix ECO Assistant

A GenAI-powered tool for analyzing Engineering Change Orders.

Author: Olga Seymour

Date: May 2025

GitHub: https://github.com/AI-Data-Space/happymatrix-eco-assistant

Project Overview

The ECO Assistant demonstrates how Generative AI can assist engineers and product teams in understanding and organizing Engineering Change Orders (ECOs). It uses Google's Gemini LLMs combined with Retrieval-Augmented Generation (RAG) to extract, analyze, and communicate information from unstructured ECO documents.

Why This Matters

Engineering Change Orders are critical documents in product development that are often:

Time-consuming to analyze manually
Inconsistent in format and structure
Difficult to integrate into downstream systems

This assistant demonstrates how GenAI can transform unstructured ECO documents into actionable insights and structured data, saving engineering teams valuable time and improving decision-making.

Features

Natural Language Q&A: Query documents using plain language
Structured Data Extraction: Convert unstructured text into JSON
Semantic Search: Find relevant ECO content based on meaning
Stakeholder Communication: Auto-generate email summaries
Few-Shot Learning: Improve extraction with examples
Batch Processing: Process multiple ECOs at once

Technologies

Google Gemini 1.5 Flash - Large Language Model
Retrieval Augmented Generation (RAG) - Core technique for document Q&A
LangChain - Orchestration framework
ChromaDB - Vector database for semantic search
Python - Implementation language

Technical Architecture

The ECO Assistant uses a multi-stage pipeline:

Document Processing: ECO documents are loaded, tagged with their identifiers, and split into chunks
Vector Embedding: Text chunks are converted to vector embeddings using Gemini's embedding model
Semantic Search: ChromaDB enables retrieval of the most relevant document chunks for each query
Context-Enhanced Generation: Retrieved context is sent to Gemini along with the query and few-shot examples
Format Control: Outputs are processed into either natural language or structured JSON based on user preference
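The first three stages (chunking, embedding, retrieval) can be illustrated with a small, dependency-free sketch. This is not the project's code: the bag-of-words `embed` function stands in for Gemini's embedding model, a plain Python list stands in for ChromaDB, and the names `split_into_chunks`, `embed`, `cosine`, and `retrieve` are hypothetical. The second document is invented for the example.

```python
import math
import re
from collections import Counter


def split_into_chunks(eco_id, text, chunk_size=200, overlap=40):
    """Split text into overlapping character windows, each tagged with its ECO id."""
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():
            chunks.append({"eco_id": eco_id, "text": piece})
    return chunks


def embed(text):
    """Toy embedding: a word-count vector (stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, chunks, k=2):
    """Return the k chunks most similar to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c["text"])), reverse=True)[:k]


docs = {
    "ECO-100002": "Replaced lithium-polymer battery with solid-state battery in the MatrixSync X100.",
    # Hypothetical second document, present only so retrieval has something to rank.
    "ECO-EXAMPLE": "Updated firmware to adjust the display brightness schedule.",
}
chunks = [c for eco_id, text in docs.items() for c in split_into_chunks(eco_id, text)]
top = retrieve("Which battery was replaced?", chunks, k=1)
print(top[0]["eco_id"])  # prints: ECO-100002
```

Tagging each chunk with its ECO identifier is what lets a later stage cite or filter by document, whichever vector store is used.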

This Retrieval-Augmented Generation (RAG) approach grounds responses in the actual ECO document content, which improves accuracy while still drawing on Gemini's language capabilities.
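As a rough illustration of context-enhanced generation, the sketch below assembles a prompt from retrieved chunks, optional few-shot examples, and the user's question. The function name `build_prompt`, the instruction wording, and the example pair are assumptions made for illustration; the prompts actually used by the project may differ.

```python
def build_prompt(question, context_chunks, few_shot_examples=()):
    """Assemble a grounded prompt: instructions, optional examples, context, question."""
    parts = [
        "Answer using only the ECO context below. "
        "If the context does not contain the answer, say so."
    ]
    for example in few_shot_examples:
        parts.append(
            f"Example question: {example['question']}\n"
            f"Example answer: {example['answer']}"
        )
    # Prefix each chunk with its ECO id so the model can attribute its answer.
    context = "\n\n".join(f"[{c['eco_id']}] {c['text']}" for c in context_chunks)
    parts.append(f"Context:\n{context}")
    parts.append(f"Question: {question}")
    return "\n\n".join(parts)


retrieved = [{
    "eco_id": "ECO-100002",
    "text": "Replaced lithium-polymer battery with solid-state battery in the MatrixSync X100.",
}]
# Illustrative few-shot pair only; not taken from the project.
examples = [{"question": "What is the ECO number?", "answer": "ECO-100002"}]
prompt = build_prompt("What change was made in ECO-100002?", retrieved, examples)
print(prompt)
```

Placing the question last and telling the model to rely only on the supplied context is a common way to keep answers tied to the retrieved text.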

Project Structure

happymatrix-eco-assistant/
├── eco_assistant/             # Main package
│   ├── __init__.py            # Package initialization
│   ├── assistant.py           # ECOAssistant class
│   ├── utils.py               # Helper functions
│   └── config.py              # Configuration settings
├── examples/                  # Example scripts
│   └── basic_demo.py          # Simple demo
├── notebooks/                 # Jupyter notebooks
│   └── ECO-assistant.ipynb    # Original development notebook
├── SYNT_DOCS/                 # Synthetic ECO documents             
├── .env.example               # Template for API key
├── .gitignore                 
├── main.py                    # Main demo script
├── requirements.txt           # Dependencies
├── setup.py                   # Package installation
└── README.md                  # Project documentation

📁 Script Overview

File	Description
`main.py`	🔹 Primary demo script — runs a quick demonstration of the ECO Assistant's core capabilities, including document loading, simple Q&A, structured JSON output, and stakeholder email generation.
`eco_assistant/__init__.py`	Package initialization — defines package version, imports, and author information.
`eco_assistant/assistant.py`	ECOAssistant class — core implementation containing all functionality for analyzing Engineering Change Orders using RAG and Gemini.
`eco_assistant/utils.py`	Helper functions — utilities for document loading, vector database management, and API retry logic.
`eco_assistant/config.py`	Configuration settings — default parameters for models, chunking, and retrieval options.
`examples/basic_demo.py`	🔸 Lightweight demonstration script — a simpler version of the main demo focused on basic Q&A and document loading, with more explicit path handling.
`notebooks/ECO-assistant.ipynb`	🧠 Original development notebook — shows the exploratory and step-by-step creation of the ECO Assistant with detailed explanations and output examples.
`setup.py`	Package installation — configures package metadata and dependencies for installation.
`requirements.txt`	Dependencies — lists all required Python packages needed to run the assistant.
`.env.example`	API key template — shows how to structure the API key environment variables.
`README.md`	Project documentation — provides overview, installation instructions, and usage examples.

Getting Started

Prerequisites

Python 3.10+
Google Gemini API key (get one at Google AI Studio)

Installation

Clone this repository:

git clone https://github.com/AI-Data-Space/happymatrix-eco-assistant.git
cd happymatrix-eco-assistant

Create a virtual environment:

python -m venv .venv
source .venv/bin/activate  # On Windows: .venv\Scripts\activate

Install the package and dependencies:

pip install -r requirements.txt
pip install -e .

Create a .env file with your Google Gemini API key:
```
GOOGLE_API_KEY=your_api_key_here
```
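One possible way to read the key at runtime is sketched below using only the standard library. The helper names `read_env_file` and `get_api_key` are hypothetical; many projects use the `python-dotenv` package for this instead, and this sketch does not claim to mirror how the repository loads its key.

```python
import os
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def get_api_key(path=".env", environ=None):
    """Prefer the process environment, then fall back to the .env file."""
    environ = os.environ if environ is None else environ
    key = environ.get("GOOGLE_API_KEY") or read_env_file(path).get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError("GOOGLE_API_KEY is not set; add it to .env or the environment.")
    return key
```

The returned value can then be passed as `ECOAssistant(api_key=get_api_key())`, which keeps the key out of source files.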

API Key Security

⚠️ IMPORTANT: Never commit your .env file with your actual API key to version control. The .env file is listed in .gitignore and should remain private to your local development environment.

Running the Demo

Run the main demo script:

python main.py

This will demonstrate core functionality:

Loading and analyzing ECO documents
Natural language Q&A
Structured data extraction
Stakeholder email generation

For a lighter-weight demo focused on basic Q&A and document loading:

python examples/basic_demo.py

Quick Start

from eco_assistant import ECOAssistant

# Initialize with your API key
assistant = ECOAssistant(api_key="your-api-key")

# Load documents
assistant.load_documents("path/to/docs")
assistant.create_vector_store()

# Ask a question
result = assistant.query("What change was made in ECO-100002?")
print(result["result"])

Example Outputs

Here are examples of what you can expect when running the assistant:

Natural Language Q&A

Q: What change was made in ECO-100002 and why?
A: In ECO-100002, the lithium-polymer battery in the MatrixSync X100 was 
   replaced with a solid-state battery. This was done to improve battery 
   safety, increase product lifespan, and align with new supplier standards.

Structured JSON Output

{
  "ECO Number": "ECO-100002",
  "Title": "Battery Type Replacement – Lithium Polymer to Solid-State",
  "Description of Change": "Replaced lithium-polymer battery with solid-state battery in the MatrixSync X100.",
  "Reason for Change": "Improve battery safety, increase product lifespan, and align with new supplier standards.",
  "Affected Parts": [
    "BAT-000011 | Battery – Li-Po | Rev A → Obsolete",
    "BAT-000014 | Battery – Solid-State | New Part",
    "BOM-000122 | MatrixSync X100 BOM | Updated battery component"
  ],
  "Effective Date": "2025-05-05"
}
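Model replies are not guaranteed to be clean JSON, so downstream code usually validates them before use. The sketch below is one hedged way to do that: it strips an optional Markdown code fence, parses the JSON, checks for the keys shown in the sample above, and splits an "Affected Parts" entry into fields. The names `parse_eco_json` and `split_affected_part`, and the assumption that all six keys are required, are illustrative and not taken from the project.

```python
import json

FENCE = "`" * 3  # Markdown code-fence marker

# Assumed required keys, copied from the sample output above.
EXPECTED_FIELDS = ["ECO Number", "Title", "Description of Change",
                   "Reason for Change", "Affected Parts", "Effective Date"]


def parse_eco_json(raw_text):
    """Parse model output into a dict, tolerating a surrounding Markdown code fence."""
    text = raw_text.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (it may carry a language tag) and the closing fence.
        text = text.split("\n", 1)[1] if "\n" in text else ""
        text = text.rsplit(FENCE, 1)[0]
    record = json.loads(text)
    missing = [field for field in EXPECTED_FIELDS if field not in record]
    if missing:
        raise ValueError(f"Missing fields: {missing}")
    return record


def split_affected_part(entry):
    """Split 'PART-ID | description | status' into a dict."""
    part_id, description, status = [piece.strip() for piece in entry.split("|", 2)]
    return {"part_id": part_id, "description": description, "status": status}


sample = {
    "ECO Number": "ECO-100002",
    "Title": "Battery Type Replacement – Lithium Polymer to Solid-State",
    "Description of Change": "Replaced lithium-polymer battery with solid-state battery in the MatrixSync X100.",
    "Reason for Change": "Improve battery safety, increase product lifespan, and align with new supplier standards.",
    "Affected Parts": ["BAT-000014 | Battery – Solid-State | New Part"],
    "Effective Date": "2025-05-05",
}
raw = f"{FENCE}json\n{json.dumps(sample)}\n{FENCE}"  # simulate a fenced model reply
record = parse_eco_json(raw)
parts = [split_affected_part(p) for p in record["Affected Parts"]]
print(parts[0]["part_id"])  # prints: BAT-000014
```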

⚠️ API Rate Limits

This project uses the Google Gemini API, which has usage limits on the free tier. The code includes built-in handling for rate limits, including:

Automatic retries with exponential backoff
User-friendly error messages
Strategic delays between API calls
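For readers unfamiliar with the pattern, retrying with exponential backoff can be sketched as follows. This is a generic illustration, not the project's retry helper: `RateLimitError` is a hypothetical stand-in for whatever exception the API client raises when a limit is hit, and the function name and default delays are assumptions.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical stand-in for the error an API client raises on a rate limit."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call func(), retrying on RateLimitError with exponentially growing, jittered delays."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise  # out of retries: surface the error to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            # Add up to 10% random jitter so parallel clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay * 0.1))
```

The `sleep` parameter exists so the function can be exercised without real waiting, for example by passing `sleep=lambda seconds: None` in a test.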

If you encounter persistent rate limit errors when running examples:

Wait a few minutes before trying again
Run fewer operations in succession
Run only a subset of ECOs in batch processing examples
Consider a paid API tier for higher limits

These limits are common when working with hosted AI APIs, and handling them reflects real-world API integration practice. The code is designed to handle them gracefully while still showing the full capability of the assistant.

Note: If you plan to use this code in production, implementing a caching layer for embeddings and responses would significantly reduce API calls and improve performance.
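A caching layer of the kind mentioned in the note could start as simply as the sketch below, which memoizes embeddings by a hash of the input text. The class name `EmbeddingCache` and the `fake_embed` placeholder are hypothetical; a production version would likely persist the cache to disk or a database rather than hold it in memory.

```python
import hashlib


class EmbeddingCache:
    """In-memory cache keyed by a hash of the text, wrapping any embedding function."""

    def __init__(self, embed_fn):
        self._embed_fn = embed_fn
        self._store = {}
        self.hits = 0
        self.misses = 0

    def get(self, text):
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[key] = self._embed_fn(text)  # only call the API on a miss
        return self._store[key]


def fake_embed(text):
    """Placeholder for a real (rate-limited) embedding API call."""
    return [float(len(text))]


cache = EmbeddingCache(fake_embed)
cache.get("Replaced lithium-polymer battery")
cache.get("Replaced lithium-polymer battery")  # served from cache, no second call
print(cache.hits, cache.misses)  # prints: 1 1
```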

Jupyter Notebook

This project evolved from a Jupyter notebook where I explored and developed the core concepts. The notebook contains:

Detailed exploration of the RAG implementation
Step-by-step development of the ECO Assistant
Visualizations and output examples
Comprehensive documentation of the approach

I've included the original notebook in this repository to show my development process and provide additional context for how the project was created. The packaged Python code in this repository is a refined, production-ready implementation of the concepts developed in the notebook.

To explore the development process:

jupyter notebook notebooks/ECO-assistant.ipynb

About the Data

All ECO documents included in this project are synthetic and were created solely for educational purposes. The documents are located in the SYNT_DOCS folder and represent fictional engineering change orders for the "MatrixSync X100" fitness tracker - a product that doesn't exist.

These synthetic documents demonstrate common patterns found in engineering change management but do not reflect any real products, companies, or proprietary information. They showcase various types of engineering changes including:

Hardware modifications
Component replacements
Material changes
Firmware updates

The synthetic nature of these documents makes this project suitable for educational use without concerns about intellectual property or confidential information.

Acknowledgements

This project builds upon foundational concepts and patterns learned from the Google Generative AI with Gemini API course on Kaggle. Specifically, the following implementation patterns were adapted from the course labs:

Basic Gemini API integration and configuration
RAG implementation framework using ChromaDB vector storage
Few-shot prompting techniques for consistent extraction
Structured output generation using LangChain

I've extended these concepts to create a specialized application for Engineering Change Order (ECO) processing, adding:

Domain-specific prompting for ECO document understanding
Comprehensive error handling and rate limit management
Structured data extraction pipeline for ECO metadata
Email generation for stakeholder communications
Batch processing capabilities for multiple ECOs

The course provided an excellent learning foundation that was then applied to this specialized domain to solve real-world engineering documentation challenges. All ECO documents are fictional and created for demonstration purposes only.

Licensing

Code

All source code in this repository is licensed under the MIT License.

Dataset

The dataset located in the SYNT_DOCS/ folder is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).

The dataset may be used under the following terms:

Attribution — You must give appropriate credit, provide a link to the license, and indicate if changes were made.
NonCommercial — You may not use the dataset for commercial purposes.

This license applies only to the contents of the SYNT_DOCS/ directory.
