workloads: Added opea-chatqna workload yaml's & readme

vbedida79 · vbedida79 · commit e3ecead5cc87 · 2024-07-31T22:09:28.000-04:00
Signed-off-by: vbedida79 <veenadhari.bedida@intel.com>
diff --git a/workloads/opea/chatqna/README.md b/workloads/opea/chatqna/README.md
@@ -0,0 +1,132 @@
+# Deploy OPEA ChatQnA workload on OCP
+
+## Overview
+The workload is based on the [OPEA ChatQnA Application](https://github.com/opea-project/GenAIExamples/tree/v0.8/ChatQnA) running on Intel® Gaudi Accelerator with OpenShift and OpenShift AI. Refer to the [OPEA Generative AI Examples](https://github.com/opea-project/GenAIExamples/tree/v0.8) for more details about the OPEA workloads.
+
+**Note**: This workload is still under heavy development, and updates are expected.
+ 
+## Prerequisites
+* Provisioned RHOCP cluster. Follow steps [here](/README.md#provisioning-rhocp-cluster)
+* Persistent storage using NFS is ready. Refer to the [documentation](https://docs.openshift.com/container-platform/4.16/storage/persistent_storage/persistent-storage-nfs.html) for details on setting it up.
+
+    **Note**: Refer to the [documentation](https://docs.openshift.com/container-platform/4.16/storage/index.html) for setting up other types of persistent storage.
+* Provisioned Intel Gaudi accelerator on RHOCP cluster. Follow steps [here](/gaudi/README.md)
+* RHOAI is installed. Follow steps [here](../inference/README.md/#install-rhoai) 
+* The Intel Gaudi AI accelerator is enabled with RHOAI. Follow steps [here](../inference/README.md/#enable-intel-gaudi-ai-accelerator-with-rhoai)
+* MinIO-based S3 service is ready for RHOAI. Follow steps [here](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio)
+
+## Deploy Model Serving for OPEA ChatQnA Microservices with RHOAI
+
+### Create OpenShift AI Data Science Project
+
+* Click ```Search -> Routes -> rhods-dashboard``` from the OCP web console and launch the RHOAI dashboard. 
+
+* In the dashboard, click ```Data Science Projects``` and create a project, for example ```OPEA-chatqna-modserving```.
+
+### Preload the models
+
+* Refer to [link](https://huggingface.co/docs/hub/en/models-downloading) and download the model [Llama2-70b-chat-hf](https://huggingface.co/meta-llama/Llama-2-70b-chat-hf). 
+
+* Refer to [link](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio) and upload the model to minio/s3 storage. 
+
+* Click ```OPEA-chatqna-modserving``` and choose the ```Data Connection``` section. In the fields, add your access and secret keys from MinIO. Follow [link](https://ai-on-openshift.io/tools-and-applications/minio/minio/#create-a-matching-data-connection-for-minio). 
+
+### Launch the Model Serving with Intel Gaudi AI Accelerator
+
+* Click ```Settings``` and choose ```ServingRuntime```. Copy or import [tgi_gaudi_servingruntime.yaml](tgi_gaudi_servingruntime.yaml). The [tgi-gaudi](https://github.com/huggingface/tgi-gaudi) serving runtime is used. Follow the image below.
+
+* In the project ```OPEA-chatqna-modserving```, go to the ```Models``` section and follow the image below.
+
+* The model server is now in the creation state. Once ready, the status will be updated to green and the inference endpoint can be seen. Refer to the image below. 
+
+## Deploy ChatQnA Megaservice and Database
+
+### Create namespace 
+
+``` 
+  oc create namespace opea-chatqna
+```
+
+### Create persistent volumes
+NFS is used to create the Persistent Volumes that the ChatQnA MegaService claims and uses.
+
+Make sure to update the NFS server IP and path in ```persistent_volumes.yaml``` before applying the command below.
+For example:
+```
+  nfs:
+    server: 10.20.1.2 # nfs server
+    path: /my_nfs # nfs path
+```
+  
+``` 
+$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/persistent_volumes.yaml
+
+```
+
+* Check that the persistent volumes are created:
+
+```
+$ oc get pv
+NAME                           CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      
+chatqna-megaservice-pv-0        100Mi      RWO            Retain           Available
+chatqna-megaservice-pv-1        100Mi      RWO            Retain           Available
+chatqna-megaservice-pv-2        100Mi      RWO            Retain           Available
+
+```
+### Build OPEA ChatQnA MegaService Container Image
+```
+create_megaservice_container.sh
+```
+
+### Deploy Redis Vector Database Service
+```
+$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/redis_deployment_service.yaml
+
+```
+
+Check that the pod and service are running:
+
+```
+$ oc get pods
+NAME                                   READY   STATUS      RESTARTS   AGE
+redis-vector-db-6b5747bf7-sl8fr        1/1     Running     0          21s
+```
+
+```
+$ oc get svc
+NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
+redis-vector-db       ClusterIP   1.2.3.4          <none>        6379/TCP,8001/TCP   43s
+```
+
+### Deploy ChatQnA MegaService
+
+Update the inference endpoint (the ```TGI_ENDPOINT``` value) in ```chatqna_megaservice_deployment.yaml```, using the endpoint shown in <image name>.
+
+```
+$ oc apply -f https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/chatqna_megaservice_deployment.yaml
+```
+
+Check that the pod and service are running:
+
+```
+$ oc get pods
+NAME                                   READY   STATUS      RESTARTS   AGE
+chatqna-megaservice-54487649b5-sgsh2   1/1     Running     0          95s         
+```
+
+```
+$ oc get svc
+NAME                  TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
+chatqna-megaservice   ClusterIP   1.2.3.4          <none>        8000/TCP            99s
+```
+
+### Verify the Megaservice
+Use the command below:
+
+```
+  curl <megaservice_pod_ip>:8000/v1/rag/chat_stream \
+  -X POST \
+  -d '{"query":"What is a constellation?"}' \
+  -H 'Content-Type: application/json'
+
+```
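The verification step above can be wrapped in a small script. This is a minimal sketch, not part of the commit: the helper names `chat_url` and `chat_payload` are hypothetical, port 8000 is taken from the `containerPort` in `chatqna_megaservice_deployment.yaml`, and the query text is assumed to contain no double quotes.

```shell
#!/bin/sh
# Sketch only: chat_url and chat_payload are illustrative helpers, not part of the commit.

# chat_url HOST [PORT]: build the streaming chat endpoint (port defaults to 8000,
# the containerPort declared in chatqna_megaservice_deployment.yaml).
chat_url() {
    echo "http://$1:${2:-8000}/v1/rag/chat_stream"
}

# chat_payload QUERY: build the JSON request body.
# Assumes QUERY contains no double quotes or backslashes.
chat_payload() {
    printf '{"query":"%s"}' "$1"
}

# Send the request only when a host is passed on the command line.
if [ "$#" -ge 1 ]; then
    curl -sS "$(chat_url "$1" "${2:-8000}")" \
        -X POST \
        -H 'Content-Type: application/json' \
        -d "$(chat_payload 'What is a constellation?')"
fi
```

Assuming the script is saved as `verify_megaservice.sh` (a hypothetical name), the pod IP can be looked up with the label used in the deployment: `sh verify_megaservice.sh "$(oc get pod -n opea-chatqna -l app=chatqna-megaservice -o jsonpath='{.items[0].status.podIP}')"`. A pod IP is normally reachable only from inside the cluster network.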
diff --git a/workloads/opea/chatqna/chatqna_megaservice_buildconfig.yaml b/workloads/opea/chatqna/chatqna_megaservice_buildconfig.yaml
@@ -0,0 +1,56 @@
+# Copyright (c) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: image.openshift.io/v1
+kind: ImageStream
+metadata:
+  name: chatqna-megaservice
+  namespace: opea-chatqna
+spec: {}
+---
+apiVersion: build.openshift.io/v1
+kind: BuildConfig
+metadata:
+  name: chatqna-megaservice
+  namespace: opea-chatqna
+spec:
+  source:
+    dockerfile: |
+      FROM langchain/langchain:latest
+
+      RUN apt-get update -y && apt-get install -y --no-install-recommends --fix-missing \
+      libgl1-mesa-glx \
+      libjemalloc-dev
+
+      RUN useradd -m -s /bin/bash user && \
+      mkdir -p /home/user && \
+      chown -R user /home/user/
+
+      USER user
+      COPY requirements.txt /tmp/requirements.txt
+      
+      USER root
+      COPY tls.crt /rhoai-ca/tls.crt
+      RUN cat /rhoai-ca/tls.crt  | tee -a '/usr/lib/ssl/cert.pem'
+
+      USER user
+      RUN pip install --no-cache-dir --upgrade pip && \
+      pip install --no-cache-dir -r /tmp/requirements.txt
+
+      ENV PYTHONPATH=$PYTHONPATH:/ws:/home/user:/home/user/qna-app/app
+
+      WORKDIR /home/user/qna-app
+      COPY qna-app /home/user/qna-app
+
+      ENTRYPOINT ["/usr/bin/sleep", "infinity"]
+  triggers:
+    - type: ConfigChange
+  runPolicy: SerialLatestOnly
+  strategy:
+    type: Docker
+    dockerStrategy: {}
+  postCommit: {}
+  output:
+    to:
+      kind: ImageStreamTag
+      name: chatqna-megaservice:latest
diff --git a/workloads/opea/chatqna/chatqna_megaservice_deployment.yaml b/workloads/opea/chatqna/chatqna_megaservice_deployment.yaml
@@ -0,0 +1,121 @@
+# Copyright (c) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: chatqna-megaservice-pvc-0
+  namespace: opea-chatqna
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 100Mi
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: chatqna-megaservice-pvc-1
+  namespace: opea-chatqna
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 100Mi
+---
+apiVersion: v1
+kind: PersistentVolumeClaim
+metadata:
+  name: chatqna-megaservice-pvc-2
+  namespace: opea-chatqna
+spec:
+  accessModes:
+    - ReadWriteOnce
+  resources:
+    requests:
+      storage: 100Mi
+---
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: chatqna-megaservice
+  namespace: opea-chatqna
+spec:
+  selector:
+    matchLabels:
+      app: chatqna-megaservice
+  replicas: 1
+  template:
+    metadata:
+      labels:
+        app: chatqna-megaservice
+    spec:
+      serviceAccount: opea-chatqna
+      containers:
+        - name: chatqna-megaservice
+          image: 'image-registry.openshift-image-registry.svc:5000/opea-chatqna/chatqna-megaservice:latest'
+          env:
+            - name: EMBED_MODEL
+              value: BAAI/bge-base-en-v1.5
+            - name: HUGGINGFACEHUB_API_TOKEN
+              valueFrom:
+                secretKeyRef:
+                  key:  HUGGINGFACEHUB_API_TOKEN
+                  name: hf-token
+            - name: MODEL_SIZE
+              value: 70b
+            - name: PYTHONPATH
+              value: /ws:/home/user:/home/user/qna-app/app
+            - name: RAG_UPLOAD_DIR
+              value: /upload_dir
+            - name: REDIS_PORT
+              value: "6379"
+            - name: REDIS_HOST
+              value: "redis-vector-db"
+            - name: REDIS_SCHEMA
+              value: schema_dim_768.yml
+            - name: TGI_ENDPOINT
+              value: http://xxx.xxx.xxx.xxx:xxx
+          ports:
+            - containerPort: 8000
+          command:
+            - /bin/bash
+            - '-c'
+            - |
+              cd /ws && \
+              python ingest.py /ws/data_intel/ && \
+              cd /home/user/qna-app && \
+              python app/server.py
+          volumeMounts:
+          - mountPath: /ws
+            name: chatqna-megaservice-pvc-0
+          - mountPath: /test
+            name: chatqna-megaservice-pvc-1
+          - mountPath: /upload_dir
+            name: chatqna-megaservice-pvc-2
+      volumes:
+      - name: chatqna-megaservice-pvc-0
+        persistentVolumeClaim:
+          claimName: chatqna-megaservice-pvc-0
+      - name: chatqna-megaservice-pvc-1
+        persistentVolumeClaim:
+          claimName: chatqna-megaservice-pvc-1
+      - name: chatqna-megaservice-pvc-2
+        persistentVolumeClaim:
+          claimName: chatqna-megaservice-pvc-2
+---
+# Chatqna megaservice Service
+apiVersion: v1
+kind: Service
+metadata:
+  name: chatqna-megaservice
+  namespace: opea-chatqna
+spec:
+  type: ClusterIP
+  selector:
+    app: chatqna-megaservice
+  ports:
+  - port: 8000
+    targetPort: 8000
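The Deployment above references a `hf-token` secret (key `HUGGINGFACEHUB_API_TOKEN`) and an `opea-chatqna` service account, and neither is defined in the manifests shown here. The sketch below shows one way they might be created beforehand. It assumes they are not created elsewhere; the `prereq_manifests` helper, the `HF_TOKEN` variable and the `apply` argument are hypothetical.

```shell
#!/bin/sh
# Sketch only: prereq_manifests, HF_TOKEN and the "apply" argument are illustrative.

# prereq_manifests TOKEN: print a ServiceAccount and a Secret matching the names
# that chatqna_megaservice_deployment.yaml refers to.
prereq_manifests() {
    cat <<EOF
apiVersion: v1
kind: ServiceAccount
metadata:
  name: opea-chatqna
  namespace: opea-chatqna
---
apiVersion: v1
kind: Secret
metadata:
  name: hf-token
  namespace: opea-chatqna
type: Opaque
stringData:
  HUGGINGFACEHUB_API_TOKEN: "$1"
EOF
}

# Apply only when explicitly requested and a token is set in the environment.
if [ "${1:-}" = "apply" ] && [ -n "${HF_TOKEN:-}" ]; then
    prereq_manifests "$HF_TOKEN" | oc apply -f -
fi
```

Running `HF_TOKEN=<your token> sh create_prereqs.sh apply` (script name hypothetical) before applying the Deployment would create both objects. Without the `apply` argument the script only defines the helper, so the generated YAML can be inspected first.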
diff --git a/workloads/opea/chatqna/create_megaservice_container.sh b/workloads/opea/chatqna/create_megaservice_container.sh
@@ -0,0 +1,17 @@
+#!/bin/sh
+# Copyright (c) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+tag="v0.8"
+namespace="opea-chatqna"
+repo="https://github.com/opea-project/GenAIExamples.git"
+yaml_url="https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/chatqna_megaservice_buildconfig.yaml"
+
+# Switch to the target project, clone the tagged sources, extract the serving
+# certificate, then create the BuildConfig and start a build from the sources.
+oc project "$namespace" &&
+    git clone --depth 1 --branch "$tag" "$repo" &&
+    cd GenAIExamples/ChatQnA/deprecated/langchain/docker &&
+    oc extract secret/knative-serving-cert -n istio-system --to=. --keys=tls.crt &&
+    oc apply -f "$yaml_url" &&
+    oc start-build chatqna-megaservice --from-dir=./ --follow
diff --git a/workloads/opea/chatqna/persistent_volumes.yaml b/workloads/opea/chatqna/persistent_volumes.yaml
@@ -0,0 +1,44 @@
+# Copyright (c) 2024 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+---
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: chatqna-megaservice-pv-0
+spec:
+  capacity:
+    storage: 100Mi
+  accessModes:
+    - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Retain
+  nfs:
+    server: x.x.x.x # nfs server
+    path: /nfs # nfs path
+---
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: chatqna-megaservice-pv-1
+spec:
+  capacity:
+    storage: 100Mi
+  accessModes:
+    - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Retain
+  nfs:
+    server: x.x.x.x # nfs server
+    path: /nfs # nfs path
+---
+apiVersion: v1
+kind: PersistentVolume
+metadata:
+  name: chatqna-megaservice-pv-2
+spec:
+  capacity:
+    storage: 100Mi
+  accessModes:
+    - ReadWriteOnce
+  persistentVolumeReclaimPolicy: Retain
+  nfs:
+    server: x.x.x.x # nfs server
+    path: /nfs # nfs path
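The `x.x.x.x` and `/nfs` placeholders above have to be replaced before the manifest is applied, and applying it straight from the raw URL leaves no room for that edit. A possible workaround, sketched under assumptions: the `render_pv` helper is hypothetical, and the NFS path is assumed to contain no `|` or `&` characters (they would interfere with the `sed` expression).

```shell
#!/bin/sh
# Sketch only: render_pv is an illustrative helper, not part of the commit.

PV_URL="https://raw.githubusercontent.com/intel/intel-technology-enabling-for-openshift/main/workloads/opea/chatqna/persistent_volumes.yaml"

# render_pv SERVER PATH: read the manifest on stdin and write it to stdout with
# the placeholder NFS server and path replaced.
render_pv() {
    sed -e "s|server: x\.x\.x\.x|server: $1|" \
        -e "s|path: /nfs |path: $2 |"
}

# Download, substitute and apply only when both arguments are given.
if [ "$#" -eq 2 ]; then
    curl -fsSL "$PV_URL" | render_pv "$1" "$2" | oc apply -f -
fi
```

For the example values from the README this would be invoked as `sh render_pv.sh 10.20.1.2 /my_nfs` (script name hypothetical).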
diff --git a/workloads/opea/chatqna/redis_deployment_service.yaml b/workloads/opea/chatqna/redis_deployment_service.yaml
diff --git a/workloads/opea/chatqna/tgi_gaudi_servingruntime.yaml b/workloads/opea/chatqna/tgi_gaudi_servingruntime.yaml