This project is a Retrieval-Augmented Generation (RAG) assistant for HubSpot documentation. It uses OpenAI to generate embeddings, Pinecone to store and search relevant document chunks, and FastAPI to serve a query endpoint.
- Embeds HubSpot documentation from llms-full.txt
- Generates OpenAI embeddings and stores them in Pinecone
- Provides a FastAPI endpoint to query the documents using semantic search
- Optional: Streamlit interface for easy testing
hubspot-rag-app/
├── llms-full.txt # Raw documentation source
├── embed.py # Script to chunk and embed data into Pinecone
├── query.py # Command-line interface for querying the RAG system
├── main.py # FastAPI backend for querying and answering
├── streamlit_app.py # Optional Streamlit UI for local testing
├── .env # API keys and config
└── requirements.txt # Dependencies
- Clone the repository:
git clone <repository-url>
cd hubspot-rag-assistant
- Create a .env file with your API keys:
OPENAI_API_KEY=your-openai-api-key
PINECONE_API_KEY=your-pinecone-api-key
PINECONE_ENV=your-pinecone-environment
- Install dependencies:
pip install -r requirements.txt
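Before running any of the scripts, it can help to confirm that the three keys from the .env example are actually visible to Python. The following is a minimal sketch, not code from this repository: `missing_keys` is a hypothetical helper, and it assumes the values have already been loaded into the process environment (how the project loads .env is not stated here).

```python
import os

# Variable names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "PINECONE_ENV")


def missing_keys(env=None):
    """Return the required keys that are unset or empty in `env`.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key)]
```

Calling `missing_keys()` with no arguments returns an empty list when everything is configured, or the names that still need a value.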
To generate and store embeddings from the HubSpot documentation:
python embed.py
This script:
- Loads llms-full.txt
- Splits it into chunks
- Generates OpenAI embeddings
- Stores them in Pinecone with IDs
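The steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the contents of embed.py: the chunk size, the overlap, the `chunk-N` ID scheme, and the helper names `chunk_text` and `build_records` are all hypothetical. The embedding call is passed in as a plain function so the sketch runs without any API keys.

```python
def chunk_text(text, chunk_size=1000, overlap=200):
    """Split `text` into character windows that overlap by `overlap` characters.

    The sizes are assumed defaults; the real script may split differently
    (for example by tokens or by headings).
    """
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        piece = text[start:start + chunk_size]
        if piece.strip():  # skip windows that are only whitespace
            chunks.append(piece)
        if start + chunk_size >= len(text):
            break  # this window already reached the end of the text
    return chunks


def build_records(chunks, embed, id_prefix="chunk"):
    """Pair each chunk with an ID and its embedding vector.

    `embed` is any callable mapping a string to a list of floats; in the
    real pipeline it would wrap the OpenAI embeddings call (assumption).
    """
    return [
        {"id": f"{id_prefix}-{i}", "values": embed(chunk), "metadata": {"text": chunk}}
        for i, chunk in enumerate(chunks)
    ]
```

Keeping the source text in the metadata is one way to let the query side return readable snippets alongside the answer; whether embed.py does this is not confirmed here.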
To start the FastAPI server:
uvicorn main:app --reload
The API will be available at http://localhost:8000
In a new terminal:
streamlit run streamlit_app.py
The Streamlit interface will be available at http://localhost:8501
🧪 Test with Postman
Send a POST request to http://localhost:8000/ask with a JSON body:
{
  "question": "What are developer test accounts in HubSpot?"
}
Example response:
{
  "question": "What are developer test accounts in HubSpot?",
  "answer": "Developer test accounts are free HubSpot environments that allow you to test apps and integrations...",
  "sources": [
    "Developer test accounts will expire after 90 days if no API calls...",
    "You can create up to 10 test accounts per developer account..."
  ]
}
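The same request can be sent from Python instead of Postman. The sketch below uses only the standard library; the helper names `build_ask_request` and `ask` are hypothetical, and the 30-second timeout is an arbitrary choice. The URL and the `question` field come from the example above.

```python
import json
import urllib.request

API_URL = "http://localhost:8000/ask"


def build_ask_request(question, url=API_URL):
    """Build a POST request carrying the question as a JSON body."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(question, url=API_URL, timeout=30):
    """Send the question to a running server and return the decoded JSON reply."""
    request = build_ask_request(question, url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `ask("What are developer test accounts in HubSpot?")` should return a dictionary with the `question`, `answer`, and `sources` keys shown in the response example.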
To test locally with a simple web interface:
streamlit run streamlit_app.py
You'll get a local UI where you can enter questions and see the answers and source snippets.
- Add auth and usage limits
- Deploy to cloud (e.g., Render/Fly.io)
- Enable document updates/re-indexing
- HubSpot Developer Docs: https://developers.hubspot.com/docs
- Built with OpenAI, Pinecone, FastAPI
For quick testing or integration with other tools, you can use the command-line interface:
python3 query.py
The CLI will:
- Prompt you to enter a question
- Show the most relevant documentation chunks
- Optionally generate a GPT-4 answer based on the chunks
Example usage:
Ask a question about HubSpot development: What are developer test accounts?
🔍 Top Chunks:
[Shows relevant documentation chunks]
🧠 Use GPT-4 to generate answer from these chunks? (y/n): y
✅ Answer:
[Shows generated answer]
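To make the two CLI stages concrete, here is a small local sketch of ranking chunks by similarity and then assembling a prompt from the best ones. It is an illustration only: in this project the similarity search is handled by Pinecone, and the function names, the prompt wording, and the default `k=3` are assumptions rather than what query.py does.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec, indexed, k=3):
    """Return the texts of the `k` chunks most similar to the query.

    `indexed` is a list of (text, vector) pairs standing in for the
    vector store.
    """
    ranked = sorted(
        indexed,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, chunks):
    """Assemble a prompt that asks the model to answer from the given chunks only."""
    context = "\n\n".join(f"[{i}] {chunk}" for i, chunk in enumerate(chunks, start=1))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

Numbering the chunks in the prompt makes it easier to trace a generated answer back to the snippets it was based on.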
- If you see "command not found" errors:
  - Verify all dependencies are installed (pip install -r requirements.txt)