Add 06-service-recommender #44

Open
wants to merge 18 commits into
base: main
2 changes: 1 addition & 1 deletion .gitignore
Original file line number Diff line number Diff line change
@@ -1,2 +1,2 @@
.DS_STORE
*.gguf
*.gguf
14 changes: 14 additions & 0 deletions 06-service-recommender/.env-template
@@ -0,0 +1,14 @@
# Copy this to a .env file and populate with secrets
# .env is used by Docker compose to set environment variables

OPENAI_API_KEY='...'

AWS_DEFAULT_REGION=us-east-1
AWS_ACCESS_KEY_ID='...'
AWS_SECRET_ACCESS_KEY='...'

# Optional: Enable authentication for Phoenix.
# A long string value that is used to sign JWTs for your deployment.
# It should be a good mix of characters and numbers and should be kept secret.
# https://arize.com/docs/phoenix/self-hosting/features/authentication
# PHOENIX_SECRET=20250613navalabs0000SomeLongAlphanumericString
1 change: 1 addition & 0 deletions 06-service-recommender/.gitignore
@@ -0,0 +1 @@
.env
94 changes: 94 additions & 0 deletions 06-service-recommender/README.md
@@ -0,0 +1,94 @@

## Architecture

* When a user submits a question in the Streamlit UI (`frontend` subfolder), the frontend calls the API (`backend` subfolder) provided by the Hayhooks service.
* Haystack pipelines are registered with the Hayhooks service. Each pipeline is associated with an API endpoint by name, e.g., the pipeline defined in `backend/pipelines/first` can be triggered via a `POST` request to `http://localhost:1416/first/run`.
* During Haystack pipeline execution, traces are sent to the Phoenix observability tool (running as `phoenix` Docker compose service).
* Phoenix persists those traces to a DB (running as `db` Docker compose service).
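
The name-to-endpoint convention described above can be sketched in Python. This is a minimal illustration using only the standard library, not code from this PR: the helper names `build_run_request` and `run_pipeline` are hypothetical, and the `{"question": ...}` payload is assumed to match what the `first` pipeline accepts (it follows the `curl` example in the Details section).

```python
import json
import urllib.request

HAYHOOKS_URL = "http://localhost:1416"  # port taken from this README


def build_run_request(pipeline_name: str, question: str,
                      base_url: str = HAYHOOKS_URL) -> urllib.request.Request:
    """Build (but do not send) a POST request for <base_url>/<pipeline_name>/run."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{pipeline_name}/run",
        data=body,
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


def run_pipeline(pipeline_name: str, question: str) -> dict:
    """Send the request and decode the JSON reply; needs a running Hayhooks service."""
    with urllib.request.urlopen(build_run_request(pipeline_name, question)) as resp:
        return json.load(resp)
```

Keeping request construction separate from sending makes the URL convention easy to check without starting any containers.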

### Details

* The Streamlit UI is at http://localhost:8501
  * This can be replaced with a different frontend implementation.
* The Phoenix UI is at http://localhost:6006
  * Phoenix receives tracing data over OTLP HTTP at `http://localhost:6006/v1/traces` and over OTLP gRPC at `http://localhost:4317`.
  * A Haystack pipeline can query Phoenix for prompt templates.
  * Non-engineers can conduct prompt engineering in the Phoenix UI and have the changes immediately reflected in pipelines that use the prompt template.
* The Hayhooks service listens on port 1416 -- API docs are at http://localhost:1416/docs
  * New pipelines under a folder can be registered dynamically using `hayhooks pipeline deploy-files -n my_pipeline SOME_PIPELINES_FOLDER`
  * Pipelines can be tested using `curl -X 'POST' 'http://localhost:1416/first/run' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "question": "Who lives in Paris?" }'` or using the `hayhooks` command line.
  * The Hayhooks service can be replaced or supplemented with a separate API implementation that [calls the Hayhooks service](https://docs.haystack.deepset.ai/docs/hayhooks#running-programmatically) or runs Haystack pipelines directly.
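
To confirm that the services listed above are reachable, a small TCP check can help. The following is a sketch under stated assumptions, not part of this PR: the port numbers come from this README, while the service labels and the helper names `is_port_open` and `check_services` are made up for illustration. A successful TCP connection only shows that something is listening, not that the service is healthy.

```python
import socket

# Ports as listed in this README; the labels are informal, not Compose service names.
SERVICE_PORTS = {
    "streamlit-ui": 8501,
    "phoenix-ui-and-otlp-http": 6006,
    "phoenix-otlp-grpc": 4317,
    "hayhooks-api": 1416,
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host: str = "localhost") -> dict:
    """Map each service label to whether its port accepts connections."""
    return {name: is_port_open(host, port) for name, port in SERVICE_PORTS.items()}


if __name__ == "__main__":
    for name, reachable in check_services().items():
        print(f"{name}: {'up' if reachable else 'down'}")
```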

## Running

### Create `.env` with API keys

Copy `.env-template` to a `.env` file and populate with your secrets. These secrets (e.g., `OPENAI_API_KEY`) are referenced in the `compose.yaml` file and are used by Haystack pipelines to call LLMs.
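
Before starting the containers, it can be useful to check that no placeholder values remain in `.env`. The sketch below is a simplified standard-library stand-in, not the project's loader (the backend declares `python-dotenv` as a dependency, which would be the realistic choice). The function names are hypothetical, and treating all three keys as required is an assumption: which secrets are needed depends on the LLM provider a pipeline uses.

```python
def parse_env_lines(text: str) -> dict:
    """Parse simple KEY=value lines, skipping blanks and '#' comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, as used in .env-template.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def missing_secrets(values: dict,
                    required=("OPENAI_API_KEY", "AWS_ACCESS_KEY_ID",
                              "AWS_SECRET_ACCESS_KEY")) -> list:
    """Return required keys that are absent, empty, or still the '...' placeholder."""
    return [key for key in required if values.get(key, "") in ("", "...")]


if __name__ == "__main__":
    with open(".env", encoding="utf-8") as handle:
        print(missing_secrets(parse_env_lines(handle.read())))
```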

### Docker Compose

```
cd backend

# Add a prompt template to Phoenix for Haystack pipeline to use
uv run src/bootstrap.py

cd ..
```

Build and start all necessary Docker containers: `docker compose up --build`

### Running outside of Docker containers

The frontend and backend can be run outside of Docker containers if desired.

Install Python and `uv`.

Note that the `requirements.txt` files are only used for building Docker images. Each one is generated from the dependencies declared in `pyproject.toml` by running `uv pip compile pyproject.toml -o requirements.txt`.
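
Because `requirements.txt` is generated, it can drift from `pyproject.toml` if it is not recompiled after a dependency is added. The following sketch shows one way to detect that, assuming the compiled file lists one `name==version` pin per line. The function names are hypothetical and this check is not part of the PR; name normalization follows PEP 503.

```python
import re


def normalize(name: str) -> str:
    """Normalize a package name as in PEP 503 (lowercase, runs of -_. become '-')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def requirement_name(spec: str) -> str:
    """Extract the package name from a line such as 'hayhooks>=0.8.0'; '' if none."""
    match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", spec)
    return normalize(match.group(1)) if match else ""


def missing_from_requirements(declared: list, requirements_text: str) -> list:
    """Return declared specs whose package name has no entry in requirements_text."""
    compiled = set()
    for line in requirements_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # skip blanks and '# via ...' annotations
        name = requirement_name(stripped)
        if name:
            compiled.add(name)
    return [spec for spec in declared if requirement_name(spec) not in compiled]
```

An empty result means every declared dependency has a matching pin; it does not verify that the pinned versions satisfy the declared constraints.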

Tip: During development, open the `frontend` and `backend` subfolders as different VSCode projects. For each project, have VSCode use the Python interpreter in the respective `.venv` subfolder.

#### Run Streamlit frontend
```
cd frontend
uv sync
uv run streamlit run src/main.py
```

Optionally, run `source .venv/bin/activate` to avoid having to type `uv run` or `uvx`.

#### Run API backend
```
# Export all variables in .env
set -o allexport
source .env

cd backend

# Download Python dependencies
uv sync

# Add a prompt template to Phoenix for Haystack pipeline to use
uv run src/bootstrap.py

# Start Hayhooks service
uvx hayhooks run --additional-python-path .
```

During pipeline development, test a Haystack pipeline before deploying it to Hayhooks.
For example: `uv run src/haystack_rag.py`

## To enable Phoenix authentication

Based on [documentation](https://arize.com/docs/phoenix/self-hosting/features/authentication), set `PHOENIX_SECRET` in `.env` and modify `compose.yaml` as follows.
* Add these environment variables to the `phoenix` service:
```
- PHOENIX_ENABLE_AUTH=True
- PHOENIX_SECRET=${PHOENIX_SECRET}
```
* Log into the Phoenix UI at http://localhost:6006 and create an API key
* Add these environment variables to the `backend` service:
```
- PHOENIX_API_KEY=${PHOENIX_API_KEY}
- OTEL_EXPORTER_OTLP_HEADERS=${OTEL_EXPORTER_OTLP_HEADERS}
```
1 change: 1 addition & 0 deletions 06-service-recommender/backend/.dockerignore
@@ -0,0 +1 @@
**/__pycache__
174 changes: 174 additions & 0 deletions 06-service-recommender/backend/.gitignore
@@ -0,0 +1,174 @@
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# Ruff stuff:
.ruff_cache/

# PyPI configuration file
.pypirc
1 change: 1 addition & 0 deletions 06-service-recommender/backend/.python-version
@@ -0,0 +1 @@
3.12
21 changes: 21 additions & 0 deletions 06-service-recommender/backend/Dockerfile
@@ -0,0 +1,21 @@
# https://docs.haystack.deepset.ai/docs/docker
FROM deepset/haystack:base-v2.14.0

# Install wget for Docker Compose healthcheck
RUN apt-get update && apt-get install -y --no-install-recommends wget && \
rm -rf /var/lib/apt/lists/*

WORKDIR /app
COPY requirements.txt .
# Use --no-deps because transitive dependencies have already been included
# The `find ...` command removes unnecessary files to reduce the image size
# `&&` makes the build fail if pip fails; the outer parentheses apply -exec to both groups
RUN --mount=type=cache,target=/root/.cache/pip,sharing=locked pip install --no-cache-dir --no-deps -r requirements.txt && \
find /usr/local/lib -depth \( \( -type d -a \( -name test -o -name tests \) \) -o \( -type f -a \( -name '*.pyc' -o -name '*.pyo' \) \) \) -exec rm -rf {} +

COPY src .
# https://github.com/deepset-ai/hayhooks/tree/main/examples/shared_code_between_wrappers
ENV HAYHOOKS_ADDITIONAL_PYTHON_PATH=.

ENV LOG=DEBUG
# Haystack pipelines defined under pipelines folder are automatically deployed on Hayhooks startup
CMD ["hayhooks", "run", "--host", "0.0.0.0"]
18 changes: 18 additions & 0 deletions 06-service-recommender/backend/pyproject.toml
@@ -0,0 +1,18 @@
[project]
name = "backend"
version = "0.1.0"
description = "Haystack pipelines"
readme = "README.md"
requires-python = ">=3.12"
dependencies = [
"arize-phoenix-otel>=0.10.3",
"haystack-ai>=2.14.2",
"openinference-instrumentation-haystack>=0.1.24",
"opentelemetry-sdk>=1.34.0",
"opentelemetry-exporter-otlp>=1.34.0",
"openai>=1.85.0",
"python-dotenv>=1.1.0",
"hayhooks>=0.8.0",
"arize-phoenix-client>=1.10.0",
"amazon-bedrock-haystack>=3.7.0",
]