A monorepo containing the Aptos documentation chatbot and related packages. This project combines modern LLMs with Aptos-specific knowledge to provide accurate and contextual responses through RAG (Retrieval-Augmented Generation).
- RAG implementation using LangChain and OpenAI embeddings
- FastAPI backend with streaming support
- Modular React components with Tailwind CSS
- Vector storage using FAISS
- Real-time chat interface with message history
- Enhanced semantic search with topic-based chunking
- Monorepo structure with shared TypeScript configurations
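For a feel of how the retrieval pieces fit together, here is a minimal sketch of loading a FAISS index and running a similarity search with LangChain and OpenAI embeddings. This is illustrative, not the project's actual code: the index path `data/faiss_index` is an assumption, and the import paths vary across LangChain versions.

```python
# Minimal retrieval sketch; index path is assumed, imports vary by LangChain version.
from langchain_openai import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

embeddings = OpenAIEmbeddings()

# Load a FAISS index previously built from the Aptos docs (path is hypothetical).
store = FAISS.load_local(
    "data/faiss_index", embeddings, allow_dangerous_deserialization=True
)

# Embed the query and return the k most similar chunks.
docs = store.similarity_search("How do I create an Aptos account?", k=4)
for doc in docs:
    print(doc.metadata.get("source"), doc.page_content[:80])
```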
```
.
├── packages/                        # Shared packages
│   ├── chatbot-core/                # Core chatbot functionality
│   ├── chatbot-react/               # React hooks and context
│   ├── chatbot-ui-base/             # Base UI components
│   ├── chatbot-ui-tailwind/         # Tailwind styled components
│   └── tsconfig/                    # Shared TypeScript configs
├── apps/                            # Applications
│   └── demo/                        # Demo application
├── app/                             # Backend application
│   ├── main.py                      # FastAPI application
│   ├── models.py                    # Pydantic models
│   ├── rag_providers/               # RAG providers
│   ├── routes/                      # API routes
│   └── utils/                       # Utility functions
├── scripts/                         # Utility scripts
│   ├── preprocess_topic_chunks.py   # Topic preprocessing
│   └── run_preprocessing.sh         # Preprocessing script
├── data/                            # RAG data storage
├── requirements.txt                 # Python dependencies
└── pnpm-workspace.yaml              # PNPM workspace config
```
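To show how `app/main.py` and `app/routes/` relate, here is a hedged sketch of a streaming chat endpoint. The route path, request model, and token generator are hypothetical placeholders, not the actual implementation:

```python
# Hypothetical streaming chat route; names and paths are illustrative only.
from fastapi import FastAPI
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

app = FastAPI()

class ChatRequest(BaseModel):  # assumed shape of the request body
    message: str

async def generate_tokens(message: str):
    # In the real app this would call the RAG provider and the LLM;
    # here we just yield placeholder chunks.
    for token in ["Hello", " from", " the", " sketch"]:
        yield token

@app.post("/chat")
async def chat(req: ChatRequest):
    # Stream tokens back to the client as they are produced.
    return StreamingResponse(generate_tokens(req.message), media_type="text/plain")
```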
- Create a virtual environment:
```bash
python -m venv venv
source venv/bin/activate  # On Windows: .\venv\Scripts\activate
```
- Install dependencies:
```bash
pip install -r requirements.txt
```
- Create a `.env` file with your API keys:

```env
OPENAI_API_KEY=your_openai_api_key
CHAT_TEST_MODE=false
DEFAULT_RAG_PROVIDER=topic
```
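These variables are read at startup; as a minimal sketch (assuming the app loads them via python-dotenv, which is one common convention, not necessarily this project's):

```python
# Load .env values into the environment (assumes python-dotenv is installed).
import os
from dotenv import load_dotenv

load_dotenv()

openai_api_key = os.getenv("OPENAI_API_KEY")
test_mode = os.getenv("CHAT_TEST_MODE", "false").lower() == "true"
rag_provider = os.getenv("DEFAULT_RAG_PROVIDER", "topic")
```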
- Install dependencies:

```bash
pnpm install
```

- Build packages:

```bash
pnpm build
```

- Start the development server:

```bash
pnpm dev
```
Available workspace scripts:

- `pnpm build` - Build all packages
- `pnpm dev` - Start development servers
- `pnpm lint` - Lint all packages
- `pnpm format` - Format code using Prettier
- `pnpm format:check` - Check code formatting
The project uses Prettier for code formatting with the following configuration:
```json
{
  "semi": true,
  "trailingComma": "all",
  "singleQuote": true,
  "printWidth": 100,
  "tabWidth": 2,
  "useTabs": false,
  "bracketSpacing": true,
  "arrowParens": "avoid"
}
```
The Retrieval-Augmented Generation (RAG) system follows this process (see the code sketch after the list):
- Query Processing: User query is received and processed
- Provider Selection: The appropriate RAG provider is selected (default: topic-based)
- Vector Search: Query is converted to an embedding and used to search the vector store
- Topic-Based Enhancement: Retrieved chunks are enhanced with related documents based on topic similarity
- Context Formatting: Retrieved chunks are formatted into a context string
- Prompt Creation: Context is inserted into a system prompt template
- LLM Generation: The prompt is sent to the LLM for response generation
- Response Streaming: The generated response is streamed back to the user
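Put together, the flow above maps onto code roughly like this. It is a condensed sketch under assumed names (`store` is a loaded FAISS vector store, the prompt template and model name are placeholders), not the project's implementation; it uses the OpenAI Python client's standard streaming interface:

```python
# Condensed RAG flow sketch; `store`, the prompt template, and the model
# name are assumptions for illustration only.
from openai import OpenAI

client = OpenAI()
SYSTEM_PROMPT_TEMPLATE = "Answer using only this context:\n{context}"

def answer(query: str, store):
    # Steps 1-4: vector search retrieves chunks similar to the query
    # (topic-based enhancement would expand this set with related documents).
    chunks = store.similarity_search(query, k=4)

    # Step 5: context formatting joins retrieved chunks into one string.
    context = "\n\n".join(doc.page_content for doc in chunks)

    # Step 6: prompt creation inserts the context into the system prompt.
    system_prompt = SYSTEM_PROMPT_TEMPLATE.format(context=context)

    # Steps 7-8: LLM generation with streaming.
    stream = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": query},
        ],
        stream=True,
    )
    for event in stream:
        delta = event.choices[0].delta.content
        if delta:
            yield delta
```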
The application uses topic-based RAG by default for improved context retrieval. Before first use, ensure the enhanced chunks are generated:
```bash
./scripts/run_preprocessing.sh
```
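The script produces the enhanced chunks the topic provider reads at query time. As a rough, hypothetical illustration of the idea (the real `preprocess_topic_chunks.py` may work quite differently), preprocessing attaches topic metadata so related chunks can be pulled in together during retrieval:

```python
# Hypothetical sketch of topic-based chunk enhancement, for illustration only.
import json
from collections import defaultdict

def enhance_chunks(chunks):
    # Group chunk ids by topic label so each chunk can reference its siblings.
    by_topic = defaultdict(list)
    for chunk in chunks:
        by_topic[chunk["topic"]].append(chunk["id"])

    for chunk in chunks:
        siblings = [i for i in by_topic[chunk["topic"]] if i != chunk["id"]]
        chunk["related_ids"] = siblings  # used to expand search results later
    return chunks

chunks = [
    {"id": 1, "topic": "accounts", "text": "Creating an Aptos account..."},
    {"id": 2, "topic": "accounts", "text": "Account key rotation..."},
    {"id": 3, "topic": "move", "text": "Move modules..."},
]
print(json.dumps(enhance_chunks(chunks), indent=2))
```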
This project is licensed under the terms of the license found in the LICENSE file.