The recommended **Infrastructure Cluster** is built with [**Intel® Gaudi® AI Accelerator**](https://docs.habana.ai/en/latest/Gaudi_Overview/Gaudi_Architecture.html#gaudi-architecture) servers and standard servers. [Intel® Xeon® processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/xeon6-product-brief.html) power both the Gaudi servers, used as worker nodes, and the standard servers, used as highly available control plane nodes. This infrastructure is designed for **high availability**, **scalability**, and **efficiency** in **Retrieval-Augmented Generation (RAG)** and other **Large Language Model (LLM) inferencing** workloads.
The [**Gaudi embedded RDMA over Converged Ethernet (RoCE) network**](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Scaling_Guide/Theory_of_Distributed_Training.html#theory-of-distributed-training), together with the [**three-ply Gaudi RoCE network topology**](https://docs.habana.ai/en/latest/Management_and_Monitoring/Network_Configuration/Configure_E2E_Test_in_L3.html#generating-a-gaudinet-json-example), supports high-throughput, low-latency parallel LLM pre-training and post-training workloads such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). For more details, see [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift).
This highly efficient infrastructure has been validated with cutting-edge enterprise AI workloads on the production-ready OpenShift platform, enabling users to easily evaluate and integrate it into their own AI environments.
Additionally, Intel SGX, DSA, and QAT accelerators (available with Xeon processors) are supported to further enhance performance and security for AI workloads.
The **Red Hat AI portfolio**, powered by **Intel AI technologies**, now includes:
* [**Red Hat AI Inference Server**](https://www.redhat.com/en/about/press-releases/red-hat-unlocks-generative-ai-any-model-and-any-accelerator-across-hybrid-cloud-red-hat-ai-inference-server) leverages the [LLM-d](https://github.com/llm-d/llm-d) and [vLLM](https://github.com/vllm-project/vllm) projects, integrating with Llama Stack, the Model Context Protocol (MCP), and the OpenAI API to deliver standardized APIs for developing and deploying [OPEA-based](https://github.com/opea-project) and other production-grade GenAI applications that scale across edge, enterprise, and cloud environments (see the inference sketch after this list).
* **Red Hat OpenShift AI Distributed Training** provides pre-training, SFT, and RL for major GenAI foundation models at scale. With seamless integration of the Kubeflow Training Operator, Intel Gaudi compute, and RoCE networking technology, enterprises can unlock the full potential of cutting-edge GenAI technologies to drive innovation in their domains. For details, see [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift); a short training sketch also follows this list.
* Operators that integrate [Intel Gaudi Software](https://docs.habana.ai/en/latest/index.html) or [oneAPI-based](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.kgdasr) AI software into OpenShift AI.
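
As a concrete illustration of the standardized APIs mentioned in the inference bullet above, here is a minimal sketch of querying an OpenAI-compatible endpoint of the kind vLLM-based servers expose. The route URL, API key, and model name are placeholders, not values defined by this project; substitute the ones from your own deployment.

```python
# Minimal sketch: chat completion against an OpenAI-compatible endpoint
# (vLLM-style). The base_url and model id below are hypothetical;
# substitute the route and model served by your own cluster.
from openai import OpenAI

client = OpenAI(
    base_url="http://inference.example.com/v1",  # hypothetical OpenShift route
    api_key="EMPTY",  # vLLM-style servers accept a dummy key unless auth is configured
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "What is RAG, in one sentence?"}],
)
print(response.choices[0].message.content)
```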
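
For the distributed-training bullet, a hedged sketch of how a fine-tuning job could be submitted through the Kubeflow Training SDK (the Python client for the Kubeflow Training Operator) follows. The job name, worker count, and training function are illustrative, and the SDK surface varies by release, so treat this as a sketch rather than this project's prescribed method.

```python
# Hedged sketch: submitting a distributed fine-tuning job via the
# Kubeflow Training SDK (pip install kubeflow-training). Names and
# argument values below are illustrative only; check your SDK release.
from kubeflow.training import TrainingClient


def train_func():
    # Placeholder: real SFT/RL code (e.g., PyTorch with the Gaudi/HPU
    # backend) would run here on every worker.
    print("training step")


# Creates a PyTorchJob-style distributed job; per-worker accelerator
# requests (e.g., the habana.ai/gaudi extended resource) would be set
# through the SDK's resource options for your deployment.
TrainingClient().create_job(
    name="sft-example",   # hypothetical job name
    train_func=train_func,
    num_workers=8,        # e.g., one worker per Gaudi node
)
```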
## GenAI applications and Inference Microservices
GenAI applications and inference microservices are built with [OPEA (Open Platform for Enterprise AI)](https://github.com/opea-project). Integration of OPEA with the LLM-d-based Red Hat AI Inference Server is under investigation; a hypothetical sketch of calling such a microservice follows.
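
For orientation only, the sketch below shows a request to an OPEA-style ChatQnA gateway. The host, route path, and payload shape here are assumptions rather than the documented OPEA API; consult the OPEA project repositories for the actual microservice contracts.

```python
# Hypothetical sketch of calling an OPEA-style GenAI microservice.
# The URL and JSON schema are assumptions, not the documented OPEA
# API; check the opea-project repos for the real contract.
import requests

resp = requests.post(
    "http://chatqna.example.com/v1/chatqna",    # assumed gateway route
    json={"messages": "What is Intel Gaudi?"},  # assumed payload shape
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```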
## Enterprise AI Post-training workloads and end-to-end solution
See [**RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift**](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift).
## Releases and Supported Platforms
Intel Enterprise AI Foundation for OpenShift is released in alignment with the OpenShift release cadence. Using the latest release is recommended.
For details on supported features and components, refer to the links below:
To review the release history, please visit the following link:
- [Intel Enterprise AI Foundation for OpenShift Release details](/docs/releases.rst)
## Getting started
See the reference [BIOS Configuration](/docs/supported_platforms.md#bios-configuration) required for each feature.
Use one of the two supported options to provision an RHOCP cluster.
In this project, we provisioned RHOCP 4.18 on a bare-metal multi-node cluster. For details about the supported RHOCP infrastructure, see the [Supported Platforms](/docs/supported_platforms.md) page.
### Provisioning Intel AI hardware features on RHOCP
If you are already familiar with the provisioning steps below, you can use the [One-Click](/one_click/README.md) solution as a reference to provision the accelerators automatically.
Follow [Setting up HabanaAI Operator](/gaudi/README.md) to provision the Intel Gaudi AI accelerator.
You can use the instructions in the [link](/tests/l2/README.md) to verify the hardware feature provisioning.
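
One quick, hedged spot check (complementing, not replacing, the L2 tests linked above) is to confirm with the Kubernetes Python client that provisioned nodes advertise the Gaudi extended resource; `habana.ai/gaudi` is the resource name registered by the Habana device plugin. The kubeconfig setup is assumed.

```python
# Hedged sketch: list nodes and show how many habana.ai/gaudi devices
# each advertises as allocatable. Assumes a working kubeconfig with
# access to the cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
for node in client.CoreV1Api().list_node().items:
    gaudi = (node.status.allocatable or {}).get("habana.ai/gaudi", "0")
    print(f"{node.metadata.name}: habana.ai/gaudi={gaudi}")
```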
## Upgrade (To be added)
## Reference workloads
Here are the reference workloads built on the end-to-end solution and Intel hardware feature provisioning in this project.