Join Decoding ML for proven content on designing, coding, and deploying production-grade AI systems with software engineering and MLOps best practices to help you ship AI applications. Every week, straight to your inbox.
This guide will help you set up and run the Second Brain Online Module, which contains the code for Module 6: the RAG agentic app and LLMOps layer.
Note that this module is entirely independent of the offline ML pipelines, so it comes with its own set of dependencies and requirements. However, before running it, you must first go through the steps from the Offline ML Pipelines part of the course, which populate the vector database and other required resources.
- 📋 Prerequisites
- 🎯 Getting Started
- 📁 Project Structure
- 🏗️ Set Up Your Local Infrastructure
- ⚡️ Running the Code for Each Module
- 🔧 Utility Commands
We depend on the same prerequisites as the offline ML pipelines. If modules 1 to 5 are working, you are good to go. Just make sure to fill in the `.env` file with the correct credentials.
For all the modules, you'll need the following tools installed locally:
| Tool | Version | Purpose | Installation Link |
|---|---|---|---|
| Python | 3.11 | Programming language runtime | Download |
| uv | ≥ 0.4.30 | Python package installer and virtual environment manager | Download |
| GNU Make | ≥ 3.81 | Build automation tool | Download |
| Git | ≥ 2.44.0 | Version control | Download |
| Docker | ≥ 27.4.0 | Containerization platform | Download |
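If you want to double-check that your local tools meet the version requirements above, a quick sanity check (assuming the tools are already on your `PATH`) looks like this:

```bash
python --version   # expect 3.11.x
uv --version       # expect >= 0.4.30
make --version     # expect GNU Make >= 3.81
git --version      # expect >= 2.44.0
docker --version   # expect >= 27.4.0
```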
📌 Windows users also need to install WSL
We will be using Unix commands across the course, so if you are using Windows, you will need to install WSL, which will install a Linux kernel on your Windows machine and allow you to use the Unix commands from our course (this is the recommended way to write software on Windows).
Also, the course requires access to these cloud services. The authentication to these services is done by adding the corresponding environment variables to the `.env` file:
| Service | Purpose | Cost | Environment Variable | Setup Guide | Starting with Module |
|---|---|---|---|---|---|
| OpenAI API | LLM API | Pay-per-use | `OPENAI_API_KEY` | Quick Start Guide | Module 2 |
| Hugging Face | MLOps | Free tier | `HUGGINGFACE_ACCESS_TOKEN` | Quick Start Guide | Module 3 |
| Comet | Experiment tracking | Free tier | `COMET_API_KEY` | Quick Start Guide | Module 4 |
| Opik | LLM evaluation and prompt monitoring | Free tier (hosted on Comet - same API key) | `COMET_API_KEY` | Quick Start Guide | Module 6 |
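For reference, the cloud-service section of a filled-in `.env` might look like the sketch below. The variable names come from the table above; the values are dummy placeholders you must replace with your own credentials:

```bash
# .env (dummy values - replace with your own credentials)
OPENAI_API_KEY=sk-...your-openai-key...
HUGGINGFACE_ACCESS_TOKEN=hf_...your-huggingface-token...
COMET_API_KEY=...your-comet-api-key...   # also used by Opik (hosted on Comet)
```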
When working locally, the infrastructure is set up using Docker. Thus, you can use the default values found in the config.py file for all the infrastructure-related environment variables.
However, if you want to deploy the code, you'll need to set up the following services with their corresponding environment variables:
| Service | Purpose | Cost | Required Credentials | Setup Guide |
|---|---|---|---|---|
| MongoDB | Document database (with vector search) | Free tier | `MONGODB_URI` | 1. Create a free MongoDB Atlas account 2. Create a Cluster 3. Add a Database User 4. Configure a Network Connection |
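For reference, MongoDB Atlas connection strings usually follow the `mongodb+srv` scheme. The exact value is shown in the Atlas UI after you create the cluster and database user; the snippet below only illustrates the shape with placeholder values:

```bash
# .env (placeholder values - copy the real connection string from the MongoDB Atlas UI)
MONGODB_URI=mongodb+srv://<username>:<password>@<cluster-name>.mongodb.net/?retryWrites=true&w=majority
```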
Start by cloning the repository and navigating to the project directory:
```bash
git clone https://github.com/decodingml/second-brain-ai-assistant-course.git
cd second-brain-ai-assistant-course
```
First, deactivate any active virtual environment and move to the `second-brain-online` directory:

```bash
deactivate
cd apps/second-brain-online
```
To install the dependencies and activate the virtual environment, run the following commands:
```bash
uv venv .venv-online
. ./.venv-online/bin/activate
uv pip install -e .
```
Note
The online application uses a different set of dependencies than the offline ML pipelines.
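To confirm that the environment is active and the package was installed, a quick sanity check is sketched below; it assumes the import name follows the `src/` layout shown in the Project Structure section (`second_brain_online`):

```bash
# The active interpreter should point inside .venv-online
which python

# The package should be importable after the editable install (`uv pip install -e .`)
python -c "import second_brain_online; print('second_brain_online is installed')"
```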
Before running any command, you have to set up your environment:
- Create your environment file:

  ```bash
  cp .env.example .env
  ```

- Open `.env` and configure the required credentials following the inline comments and the recommendations from the Cloud Services section.
At Decoding ML we teach how to build production ML systems. Thus, instead of splitting the code into separate modules, the course follows the structure of a real-world Python project:
```text
.
├── configs/                   # ZenML configuration files
├── src/second_brain_online/   # Main package directory
│   ├── application/           # Application layer
│   ├── config.py              # Configuration settings
│   └── opik_utils.py          # Opik utility functions
├── tools/                     # Entrypoint scripts that use the Python package
├── .env.example               # Environment variables template
├── .python-version            # Python version specification
├── Makefile                   # Project commands
└── pyproject.toml             # Project dependencies
```
We use Docker to set up the local infrastructure (MongoDB).
Warning
Before running the command below, ensure you do not have any processes running on port `27017` (MongoDB).
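One way to check whether the port is already taken (assuming `lsof` is available, e.g., on Linux/macOS or WSL) is:

```bash
# Lists any process listening on MongoDB's default port; no output means the port is free
lsof -i :27017
```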
To start the Docker infrastructure, run:
```bash
make local-infrastructure-up
```
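To verify that the MongoDB container actually came up, you can list the running containers that publish port `27017` (the container name may vary depending on the Docker setup):

```bash
# Show running containers that publish MongoDB's default port
docker ps --filter "publish=27017"
```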
To stop the Docker infrastructure, run:
```bash
make local-infrastructure-down
```
Note
To visualize the raw and RAG data from MongoDB, we recommend using MongoDB Compass or Mongo's official IDE plugin (e.g., MongoDB for VS Code). To connect to the running MongoDB instance, use the `MONGODB_URI` value from the `.env` file or the one found inside the config.py file.
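If you prefer the terminal over Compass, a minimal sketch for connecting with the `mongosh` shell is shown below; it assumes `mongosh` is installed and that the `MONGODB_URI` entry in `.env` is a plain `KEY=value` line without quotes or spaces:

```bash
# Load the connection string from .env and connect with mongosh
export $(grep '^MONGODB_URI=' .env)
mongosh "$MONGODB_URI"
```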
NOTE: To run these modules, you must first go through the steps from the Offline ML Pipelines part of the course.
To simulate the course modules, we split the CLI commands and offline ML pipelines you must run per module so you know exactly where you are in the course.
Quickly test the agent from the CLI with a predefined query:
```bash
make run_agent_query RETRIEVER_CONFIG=configs/compute_rag_vector_index_openai_parent.yaml
```
You should see something like this:
```text
Vector databases and vector indices are related concepts in the field of data storage and retrieval, particularly in contexts where high-dimensional vector representations of data are used, such as in machine learning and AI. Here are the key differences:

1. **Vector Databases**:
   - A vector database ...
```
Important
Make sure the retriever config is exactly the same one used in Module 5 during the RAG feature pipeline to populate the vector database. If they don't match, the retriever will run with different settings than the ones used to build the vector index, resulting in errors or unexpected results. Here is a quick reminder of when to use which config:
- Parent Retrieval with OpenAI models: `configs/compute_rag_vector_index_openai_parent.yaml`
- Simple Contextual Retrieval with OpenAI models: `configs/compute_rag_vector_index_openai_contextual_simple.yaml`
- Simple Contextual Retrieval with Hugging Face models: `configs/compute_rag_vector_index_huggingface_contextual_simple.yaml`
- Full-fledged Contextual Retrieval with OpenAI models: `configs/compute_rag_vector_index_openai_contextual.yaml`
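For example, if Module 5 populated the vector database with the simple contextual retrieval pipeline using OpenAI models, the matching query command would be:

```bash
make run_agent_query RETRIEVER_CONFIG=configs/compute_rag_vector_index_openai_contextual_simple.yaml
```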
You can also spin up a Gradio UI to test the agent with custom queries, similar to any other chatbot:
```bash
make run_agent_app RETRIEVER_CONFIG=configs/compute_rag_vector_index_openai_parent.yaml
```
You should see something like this:
(Demo video: `second_brain_ai_assistant_example.mp4`)
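The exact local address is printed in the terminal when the app starts; assuming the app uses Gradio's default settings, it is typically served at `http://localhost:7860`:

```bash
# Open the Gradio UI in your browser (assumes Gradio's default port;
# the actual URL is printed in the terminal on startup)
open http://localhost:7860   # macOS; use xdg-open on Linux
```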
Evaluate the agent with our predefined evaluation queries (found under `tools/evaluate_app.py`):

```bash
make evaluate_agent RETRIEVER_CONFIG=configs/compute_rag_vector_index_openai_parent.yaml
```
After running the evaluation, open Opik to see the evaluation results.
For running the evaluation, plus playing around with the agent (~20 queries), the costs and running time are:
- OpenAI running costs: ~$0.50
- Hugging Face dedicated endpoints running costs (optional - you can use only the OpenAI models for summarization): ~$1 (the deployment costs $1/hour)
- Evaluation running time: ~15 minutes (the AI Assistant runs in real time)
```bash
# Check code formatting without applying changes
make format-check

# Auto-fix formatting issues
make format-fix

# Check for linting issues without applying changes
make lint-check

# Auto-fix linting issues
make lint-fix
```
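A common pattern is to auto-fix first and then re-run the checks before committing, for example:

```bash
# Auto-fix, then verify everything passes
make format-fix && make lint-fix && make format-check && make lint-check
```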