Commit 7e66869

Merge pull request #283 from vbedida79/patch-290724-1
workloads: Added opea-chatqna workload yaml's & readme

2 parents a0bc5b0 + e3ecead. 7 files changed: +508, -0 lines.

workloads/opea/chatqna/README.md

Lines changed: 132 additions & 0 deletions

# Deploy OPEA ChatQnA workload on OCP

## Overview
The workload is based on the [OPEA ChatQnA Application](https://github.com/opea-project/GenAIExamples/tree/v0.8/ChatQnA) running on the Intel® Gaudi Accelerator with OpenShift and OpenShift AI. Refer to the [OPEA Generative AI Examples](https://github.com/opea-project/GenAIExamples/tree/v0.8) for more details about the OPEA workloads.

**Note**: This workload is still under heavy development; updates are expected.

## Prerequisites
* Provisioned RHOCP cluster. Follow the steps [here](/README.md#provisioning-rhocp-cluster)
* Persistent storage using NFS is ready. Refer to the [documentation](https://docs.openshift.com/container-platform/4.16/storage/persistent_storage/persistent-storage-nfs.html) for details on setting it up.

  **Note**: Refer to the [documentation](https://docs.openshift.com/container-platform/4.16/storage/index.html) for setting up other types of persistent storage.
* Provisioned Intel Gaudi accelerator on the RHOCP cluster. Follow the steps [here](/gaudi/README.md)
* RHOAI is installed. Follow the steps [here](../inference/README.md/#install-rhoai)
* The Intel Gaudi AI accelerator is enabled with RHOAI. Follow the steps [here](../inference/README.md/#enable-intel-gaudi-ai-accelerator-with-rhoai)
* MinIO based S3 service ready for RHOAI. Follow the steps [here](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio)
## Deploy Model Serving for OPEA ChatQnA Microservices with RHOAI

### Create OpenShift AI Data Science Project

* Click ```Search -> Routes -> rhods-dashboard``` from the OCP web console and launch the RHOAI dashboard.

* From the dashboard, click ```Data Science Projects``` to create a project. For example, ```OPEA-chatqna-modserving```.

### Preload the models

* Refer to [this guide](https://huggingface.co/docs/hub/en/models-downloading) to download the model [Llama-2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf).

* Refer to [this guide](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio) to upload the model to MinIO/S3 storage.

* Click ```OPEA-chatqna-modserving``` and open the ```Data Connection``` section. In the fields, add your access and secret keys from MinIO. Follow [this guide](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio).
### Launch the Model Serving with Intel Gaudi AI Accelerator

* Click Settings and choose ```ServingRuntime```. Copy or import [tgi_gaudi_servingruntime.yaml](tgi-gaudi-servingruntime.yaml). The [tgi-gaudi](https://github.com/huggingface/tgi-gaudi) serving runtime is used. Follow the image below.

* In the project ```OPEA-chatqna-modserving```, go to the ```Models``` section and follow the image below.

* The model server is now in the creation state. Once ready, the status turns green and the inference endpoint is shown. Refer to the image below.
## Deploy ChatQnA Megaservice and Database

### Create namespace

```
$ oc create namespace opea-chatqna
```

### Create persistent volumes
NFS is used to create the Persistent Volumes for the ChatQnA MegaService to claim and use.

Make sure to update the NFS server IP and path in ```persistent_volumes.yaml``` before applying the command below.
For example:
```
nfs:
  server: 10.20.1.2 # nfs server
  path: /my_nfs # nfs path
```

```
$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/persistent_volumes.yaml
```

* Check that the persistent volumes are created:

```
$ oc get pv
NAME                       CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS
chatqna-megaservice-pv-0   100Mi      RWO            Retain           Available
chatqna-megaservice-pv-1   100Mi      RWO            Retain           Available
chatqna-megaservice-pv-2   100Mi      RWO            Retain           Available
```

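If you prefer to fill in the NFS values programmatically, the ```# nfs server``` / ```# nfs path``` comments in ```persistent_volumes.yaml``` give ```sed``` something stable to anchor on. A minimal sketch (the ```update_nfs``` helper and the example values ```10.20.1.2``` / ```/my_nfs``` are ours, not part of the repository):

```shell
# Rewrite the nfs server/path lines of a persistent_volumes.yaml copy,
# keying off the "# nfs server" / "# nfs path" comments in the file.
NFS_SERVER="10.20.1.2"
NFS_PATH="/my_nfs"
update_nfs() {
    sed -e "s|server: .*# nfs server|server: ${NFS_SERVER} # nfs server|" \
        -e "s|path: .*# nfs path|path: ${NFS_PATH} # nfs path|" "$1"
}

# Usage (against a live cluster):
#   update_nfs persistent_volumes.yaml > pv_local.yaml && oc apply -f pv_local.yaml
```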
### Building OPEA ChatQnA MegaService Container Image
```
$ ./create_megaservice_container.sh
```

### Deploy Redis Vector Database Service
```
$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/redis_deployment_service.yaml
```

Check that the pod and service are running:

```
$ oc get pods
NAME                              READY   STATUS    RESTARTS   AGE
redis-vector-db-6b5747bf7-sl8fr   1/1     Running   0          21s
```

```
$ oc get svc
NAME              TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)             AGE
redis-vector-db   ClusterIP   1.2.3.4      <none>        6379/TCP,8001/TCP   43s
```
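Readiness checks like the one above can be scripted instead of eyeballed. A small helper (ours, not part of the repository) that parses the default ```oc get pods``` table output:

```shell
# Return success if a pod whose name starts with the given prefix reports
# Running. Reads `oc get pods` table output (NAME READY STATUS ...) on stdin.
pods_running() {
    awk -v p="$1" 'NR > 1 && index($1, p) == 1 && $3 == "Running" { found = 1 }
                   END { exit !found }'
}

# Usage (against a live cluster):
#   oc get pods -n opea-chatqna | pods_running redis-vector-db && echo "redis ready"
```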
### Deploy ChatQnA MegaService

Update the ```TGI_ENDPOINT``` value in ```chatqna_megaservice_deployment.yaml``` with the inference endpoint from the model serving step before applying it.

```
$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/chatqna_megaservice_deployment.yaml
```

Check that the pod and service are running:

```
$ oc get pods
NAME                                   READY   STATUS    RESTARTS   AGE
chatqna-megaservice-54487649b5-sgsh2   1/1     Running   0          95s
```

```
$ oc get svc
NAME                  TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)    AGE
chatqna-megaservice   ClusterIP   1.2.3.4      <none>        8000/TCP   99s
```

### Verify the Megaservice
Use the command below:

```
$ curl <megaservice_pod_ip>:8000/v1/rag/chat_stream \
    -X POST \
    -d '{"query":"What is a constellation?"}' \
    -H 'Content-Type: application/json'
```
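The pod IP in the command above can be resolved with ```oc``` via the deployment's ```app=chatqna-megaservice``` label. A minimal sketch (the ```chatqna_url``` helper is ours; port 8000 comes from the service definition in ```chatqna_megaservice_deployment.yaml```):

```shell
# Build the chat-stream URL for a given pod IP; 8000 is the service/container
# port from chatqna_megaservice_deployment.yaml.
chatqna_url() {
    echo "http://$1:8000/v1/rag/chat_stream"
}

# Usage (against a live cluster):
#   POD_IP="$(oc get pod -n opea-chatqna -l app=chatqna-megaservice \
#             -o jsonpath='{.items[0].status.podIP}')"
#   curl "$(chatqna_url "$POD_IP")" -X POST \
#        -d '{"query":"What is a constellation?"}' \
#        -H 'Content-Type: application/json'
```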
workloads/opea/chatqna/chatqna_megaservice_buildconfig.yaml

Lines changed: 60 additions & 0 deletions

```
# Copyright (c) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: image.openshift.io/v1
kind: ImageStream
metadata:
  name: chatqna-megaservice
  namespace: opea-chatqna
spec: {}
---
apiVersion: build.openshift.io/v1
kind: BuildConfig
metadata:
  name: chatqna-megaservice
  namespace: opea-chatqna
spec:
  triggers:
    - type: "ConfigChange"
    - type: "ImageChange"
  runPolicy: "Serial"
  source:
    dockerfile: |
      FROM langchain/langchain:latest

      RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
          libgl1-mesa-glx \
          libjemalloc-dev

      RUN useradd -m -s /bin/bash user && \
          mkdir -p /home/user && \
          chown -R user /home/user/

      USER user
      COPY requirements.txt /tmp/requirements.txt

      USER root
      COPY tls.crt /rhoai-ca/tls.crt
      RUN cat /rhoai-ca/tls.crt | tee -a '/usr/lib/ssl/cert.pem'

      USER user
      RUN pip install --no-cache-dir --upgrade pip && \
          pip install --no-cache-dir -r /tmp/requirements.txt

      ENV PYTHONPATH=$PYTHONPATH:/ws:/home/user:/home/user/qna-app/app

      WORKDIR /home/user/qna-app
      COPY qna-app /home/user/qna-app

      ENTRYPOINT ["/usr/bin/sleep", "infinity"]
  strategy:
    type: Docker
    dockerStrategy: {}
  postCommit: {}
  output:
    to:
      kind: ImageStreamTag
      name: chatqna-megaservice:latest
```
workloads/opea/chatqna/chatqna_megaservice_deployment.yaml

Lines changed: 121 additions & 0 deletions

```
# Copyright (c) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chatqna-megaservice-pvc-0
  namespace: opea-chatqna
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chatqna-megaservice-pvc-1
  namespace: opea-chatqna
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: chatqna-megaservice-pvc-2
  namespace: opea-chatqna
spec:
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 100Mi
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: chatqna-megaservice
  namespace: opea-chatqna
spec:
  selector:
    matchLabels:
      app: chatqna-megaservice
  replicas: 1
  template:
    metadata:
      labels:
        app: chatqna-megaservice
    spec:
      serviceAccountName: opea-chatqna
      containers:
        - name: chatqna-megaservice
          image: 'image-registry.openshift-image-registry.svc:5000/opea-chatqna/chatqna-megaservice:latest'
          env:
            - name: EMBED_MODEL
              value: BAAI/bge-base-en-v1.5
            - name: HUGGINGFACEHUB_API_TOKEN
              valueFrom:
                secretKeyRef:
                  key: HUGGINGFACEHUB_API_TOKEN
                  name: hf-token
            - name: MODEL_SIZE
              value: 70b
            - name: PYTHONPATH
              value: $PYTHONPATH:/ws:/home/user:/home/user/qna-app/app
            - name: RAG_UPLOAD_DIR
              value: /upload_dir
            - name: REDIS_PORT
              value: "6379"
            - name: REDIS_HOST
              value: "redis-vector-db"
            - name: REDIS_SCHEMA
              value: schema_dim_768.yml
            - name: TGI_ENDPOINT
              value: http://xxx.xxx.xxx.xxx:xxx # model serving inference endpoint
          ports:
            - containerPort: 8000
          command:
            - /bin/bash
            - '-c'
            - |
              cd /ws && \
              python ingest.py /ws/data_intel/ && \
              cd /home/user/qna-app && \
              python app/server.py
          volumeMounts:
            - mountPath: /ws
              name: chatqna-megaservice-pvc-0
            - mountPath: /test
              name: chatqna-megaservice-pvc-1
            - mountPath: /upload_dir
              name: chatqna-megaservice-pvc-2
      volumes:
        - name: chatqna-megaservice-pvc-0
          persistentVolumeClaim:
            claimName: chatqna-megaservice-pvc-0
        - name: chatqna-megaservice-pvc-1
          persistentVolumeClaim:
            claimName: chatqna-megaservice-pvc-1
        - name: chatqna-megaservice-pvc-2
          persistentVolumeClaim:
            claimName: chatqna-megaservice-pvc-2
---
# Chatqna megaservice Service
apiVersion: v1
kind: Service
metadata:
  name: chatqna-megaservice
  namespace: opea-chatqna
spec:
  type: ClusterIP
  selector:
    app: chatqna-megaservice
  ports:
    - port: 8000
      targetPort: 8000
```
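Rather than editing the ```TGI_ENDPOINT``` placeholder in the yaml by hand, the endpoint can also be injected into the running deployment with ```oc set env```. A sketch (the helper name is ours; the endpoint URL is a placeholder you obtain from the RHOAI model server):

```shell
# Compose the `oc set env` command that updates TGI_ENDPOINT on the
# chatqna-megaservice deployment; run the printed command on a live cluster.
set_tgi_endpoint_cmd() {
    echo "oc set env deployment/chatqna-megaservice -n opea-chatqna TGI_ENDPOINT=$1"
}

# Example: eval "$(set_tgi_endpoint_cmd http://my-model-endpoint:80)"
```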
workloads/opea/chatqna/create_megaservice_container.sh

Lines changed: 15 additions & 0 deletions

```
#!/bin/sh
# Copyright (c) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

tag="v0.8"
namespace="opea-chatqna"
repo="https://github.com/opea-project/GenAIExamples.git"
yaml_url="https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/chatqna_megaservice_buildconfig.yaml"

oc project $namespace &&
git clone --depth 1 --branch $tag $repo &&
cd GenAIExamples/ChatQnA/deprecated/langchain/docker &&
oc extract secret/knative-serving-cert -n istio-system --to=. --keys=tls.crt &&
oc apply -f $yaml_url &&
oc start-build chatqna-megaservice --from-dir=./ --follow
```
workloads/opea/chatqna/persistent_volumes.yaml

Lines changed: 44 additions & 0 deletions

```
# Copyright (c) 2024 Intel Corporation
# SPDX-License-Identifier: Apache-2.0
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: chatqna-megaservice-pv-0
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: x.x.x.x # nfs server
    path: /nfs # nfs path
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: chatqna-megaservice-pv-1
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: x.x.x.x # nfs server
    path: /nfs # nfs path
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: chatqna-megaservice-pv-2
spec:
  capacity:
    storage: 100Mi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  nfs:
    server: x.x.x.x # nfs server
    path: /nfs # nfs path
```
