This is a sample application written in Python that demonstrates how use LangChain to build applications with LLM integration. By using the SQL chain capabilities of LangChain in conjunction with an OpenAI LLM, the application can query a PostgreSQL-compatible YugabyteDB database from natural language.
- Install Python3
- Install Docker
Download the application and provide settings specific to your deployment:
-
Clone the repository.
git clone https://github.com/YugabyteDB-Samples/yugabytedb-langchain-openai-shoe-store-search.git
-
Install the application dependencies.
Dependencies can be installed in a virtual environment, or globally on your machine.
-
Option 1 (recommended): Install Dependencies from requirements.txt in virtual environment
python3 -m venv yb-langchain-env source yb-langchain-env/bin/activate pip install -r requirements.txt # NOTE: Users with M1 Mac machines should use requirements-m1.txt instead: # pip install -r requirements-m1.txt
-
Option 2: Install Dependencies Globally
pip install langchain pip install psycopg2 # NOTE: Users with M1 Mac machines should install the psycopg2 binary instead: # pip install psycopg2-binary pip install langchain_openai pip install langchain_experimental pip install flask pip install python-dotenv
-
-
Create an OpenAI API Key and store it's value in a secure location. This will be used to connect the application to the LLM to generate SQL queries and an appropriate response from the database.
-
Configure the application environment variables in
{project_directory/.env}
.
YugabyteDB is a PostgreSQL-compatible distributed database.
Start a 3-node YugabyteDB cluster in Docker (or feel free to use another deployment option):
# NOTE: if the ~/yb_docker_data already exists on your machine, delete and re-create it
mkdir ~/yb_docker_data
docker network create custom-network
docker run -d --name yugabytedb-node1 --net custom-network \
-p 15433:15433 -p 7001:7000 -p 9001:9000 -p 5433:5433 \
-v ~/yb_docker_data/node1:/home/yugabyte/yb_data --restart unless-stopped \
yugabytedb/yugabyte:2.20.1.0-b97 \
bin/yugabyted start \
--base_dir=/home/yugabyte/yb_data --background=false
docker run -d --name yugabytedb-node2 --net custom-network \
-p 15434:15433 -p 7002:7000 -p 9002:9000 -p 5434:5433 \
-v ~/yb_docker_data/node2:/home/yugabyte/yb_data --restart unless-stopped \
yugabytedb/yugabyte:2.20.1.0-b97 \
bin/yugabyted start --join=yugabytedb-node1 \
--base_dir=/home/yugabyte/yb_data --background=false
docker run -d --name yugabytedb-node3 --net custom-network \
-p 15435:15433 -p 7003:7000 -p 9003:9000 -p 5435:5433 \
-v ~/yb_docker_data/node3:/home/yugabyte/yb_data --restart unless-stopped \
yugabytedb/yugabyte:2.20.1.0-b97 \
bin/yugabyted start --join=yugabytedb-node1 \
--base_dir=/home/yugabyte/yb_data --background=false
The database connectivity settings are provided in the {project_dir}/.env
file and do not need to be changed if you started the cluster with the preceding command.
Navigate to the YugabyteDB UI to confirm that the database is up and running, at http://127.0.0.1:15433.
This application requires an e-commerce database with a product catalog and inventory information. This schema includes products
, users
, purchases
and product_inventory
. It also creates a read-only user role to prevent any destructive actions while querying the database directly from LangChain.
The pg_trgm
PostgreSQL extension is installed to execute similarity searches on alphanumeric text.
-
Copy the schema to the first node's Docker container.
docker cp {project_dir}/sql/schema.sql yugabytedb-node1:/home
-
Copy the seed data file to the Docker container.
docker cp {project_dir}/sql/generated_data.sql yugabytedb-node1:/home
-
Execute the SQL files against the database.
docker exec -it yugabytedb-node1 bin/ysqlsh -h yugabytedb-node1 -c '\i /home/schema.sql' docker exec -it yugabytedb-node1 bin/ysqlsh -h yugabytedb-node1 -c '\i /home/generated_data.sql'
The Flask server for this application exposes a REST endpoint which returns values from the database.
- Start the server.
python app.py
* Running on http://127.0.0.1:8080
- Send a POST request to the
/queries
endpoint with a relevant prompt. For instance:
# What purchases have been made by user1?
# What colors do the Intelligent Racer come in?
# How many narrow shoes come in pink?
# Find me shoes that are in stock and available in size 15.
curl -X POST http://localhost:8080/queries -H "Content-Type: application/json" -d '{"user_prompt":"Find me shoes that are in stock and available in size 15."}'
[
{
"color": [
"yellow",
"blue"
],
"description": null,
"name": "Efficient Jogger 1",
"price": "110.08",
"quantity": 22,
"width": [
"narrow",
"wide"
]
},
{
"color": [
"blue",
"yellow"
],
"description": null,
"name": "Efficient Jogger 8",
"price": "143.63",
"quantity": 85,
"width": [
"wide",
"narrow"
]
},
...
]