Click the play button in the top right corner of the GIF to view the demo.
This project is a hybrid search Flask application that integrates traditional full-text (lexical/BM25) search with semantic (neural sparse embeddings) search. It also provides autocomplete functionality for an improved search experience.
- The project is an adaptation/extension of the Elastic Search app in elasticsearch-labs.
- The autocomplete query is a match_bool_prefix across multiple fields (with a disjunction maximum). It was adapted from the search service implementation in the Amazon retail-demo-store project.
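As a rough illustration of the autocomplete approach described above, the sketch below builds a dis_max query with one match_bool_prefix clause per field. The helper name `build_autocomplete_query` and the field names `name` and `summary` are hypothetical; the project's actual fields, boosts, and tie breaker may differ.

```python
def build_autocomplete_query(text, fields, tie_breaker=0.0, size=5):
    """Build a dis_max query with one match_bool_prefix clause per field.

    Field names are supplied by the caller; nothing here is taken from
    the project's real index mapping.
    """
    clauses = [{"match_bool_prefix": {field: {"query": text}}} for field in fields]
    return {
        "size": size,
        "query": {"dis_max": {"queries": clauses, "tie_breaker": tie_breaker}},
    }


# Hypothetical fields, used only for illustration.
body = build_autocomplete_query("hybrid sea", ["name", "summary"])
```

With dis_max, a document is scored by its best-matching field rather than by the sum over all fields, so a strong prefix match in one field is not diluted by weak matches elsewhere.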
The primary adjustments to the Elastic Search app are:
- OpenSearch is used as the search engine instead of Elasticsearch
- A neural sparse encoder, specifically opensearch-neural-sparse-encoding-v2-distill, replaces Elasticsearch's Elastic Learned Sparse EncodeR (ELSER) model.
- Autocomplete has been added.
- The search box persists across the pages.
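To make the lexical plus neural sparse combination concrete, here is a minimal sketch of an OpenSearch hybrid query body that pairs a BM25 multi_match clause with a neural_sparse clause. The helper name, the field names (`name`, `summary`, `sparse_embedding`), and the model ID placeholder are assumptions for illustration and are not taken from the project's code.

```python
def build_hybrid_query(text, model_id, text_fields, sparse_field, size=10):
    """Combine a lexical (BM25) clause and a neural sparse clause in one hybrid query."""
    # Lexical side: ordinary full-text matching across the given text fields.
    lexical = {"multi_match": {"query": text, "fields": list(text_fields)}}
    # Semantic side: the deployed sparse encoder expands the query text into weighted tokens.
    semantic = {
        "neural_sparse": {
            sparse_field: {"query_text": text, "model_id": model_id}
        }
    }
    return {"size": size, "query": {"hybrid": {"queries": [lexical, semantic]}}}


# Placeholder values; the real model ID is assigned when the model is registered.
body = build_hybrid_query(
    "wireless headphones",
    model_id="<model-id>",
    text_fields=["name", "summary"],
    sparse_field="sparse_embedding",
)
```

A hybrid query is normally sent together with a search pipeline that contains a normalization processor, so that the two sub-query scores are brought onto a comparable scale before they are combined.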
- OpenSearch
- Docker
- Flask
- Start your Docker app or ensure it's running.
- Clone the project and navigate to the project's root directory.
  git clone https://github.com/MustaphaU/opensearch-hybrid-search.git && cd opensearch-hybrid-search
- Set the OPENSEARCH_INITIAL_ADMIN_PASSWORD environment variable to a strong password.
  - Replace {yourStrongPassword123!} with your intended password and run the resulting command in your terminal.
  - A valid password must contain a mix of uppercase and lowercase alphanumeric characters and a special character. For example, Myadminp@ss12321
  export OPENSEARCH_INITIAL_ADMIN_PASSWORD={yourStrongPassword123!}
- Run Docker Compose to start OpenSearch in Docker (in detached mode).
  docker compose up -d
  This command launches the services defined in docker-compose.yaml in detached mode. It will:
  - Pull the latest opensearch and opensearch-dashboards images.
  - Start three containers:
    - Two OpenSearch cluster nodes: opensearch-node1 and opensearch-node2
    - One dashboard instance: opensearch-dashboards
  - Automatically use the OPENSEARCH_INITIAL_ADMIN_PASSWORD from your environment for secure setup.
  Once the containers are running, OpenSearch and its dashboard will be available for use.
- After the containers have started, access the OpenSearch Dashboards UI by navigating to the following URL in your browser:
  http://localhost:5002/
  When prompted, log in with:
  - Username: admin
  - Password: The value you set for OPENSEARCH_INITIAL_ADMIN_PASSWORD
- Create a .env file in the project's root directory and add your OPENSEARCH_INITIAL_ADMIN_PASSWORD:
  opensearch-hybrid-search/.env
  OPENSEARCH_INITIAL_ADMIN_PASSWORD={yourStrongPassword123!}
- Create and activate a conda environment (the -y flag auto-accepts all prompts):
  conda create -y -n opensearch_env python=3.12
  conda activate opensearch_env
- Install all requirements.
  pip install -r requirements.txt
- Run the command below to:
  - Update cluster settings for model management
  - Register and deploy the models
  - Create the ingest and hybrid search pipelines
  - Create the index and ingest the data
  flask update-cluster-settings && flask deploy-models && flask create-pipelines && flask reindex
- Start the search app.
  flask run
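For readers curious what the pipeline step in the setup above might create, the sketch below builds two request bodies: an ingest pipeline with a sparse_encoding processor and a search pipeline with a normalization-processor. This is an assumption about the general shape of such pipelines in OpenSearch, not a copy of what `flask create-pipelines` actually sends. The helper names, field names, weights, and model ID placeholder are hypothetical.

```python
def build_ingest_pipeline(model_id, source_field, sparse_field):
    """Ingest pipeline body: encode source_field into sparse_field at index time."""
    return {
        "description": "Sparse encoding ingest pipeline (illustrative)",
        "processors": [
            {
                "sparse_encoding": {
                    "model_id": model_id,
                    "field_map": {source_field: sparse_field},
                }
            }
        ],
    }


def build_search_pipeline(lexical_weight=0.5):
    """Search pipeline body: normalize sub-query scores, then take a weighted mean."""
    if not 0.0 <= lexical_weight <= 1.0:
        raise ValueError("lexical_weight must be between 0 and 1")
    return {
        "description": "Hybrid search score normalization (illustrative)",
        "phase_results_processors": [
            {
                "normalization-processor": {
                    "normalization": {"technique": "min_max"},
                    "combination": {
                        "technique": "arithmetic_mean",
                        # Weights follow the order of the sub-queries in the hybrid query.
                        "parameters": {"weights": [lexical_weight, 1.0 - lexical_weight]},
                    },
                }
            }
        ],
    }


# Hypothetical field names and a placeholder model ID.
ingest_body = build_ingest_pipeline("<model-id>", "summary", "sparse_embedding")
search_body = build_search_pipeline()
```

The weights control how much the lexical and semantic scores each contribute to the final ranking; equal weights are used here only as a neutral starting point.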