Commit c675783

Merge pull request #419 from intel/uMartinXu-patch-12
Update README.md
2 parents e5f30e2 + 1239dde commit c675783

File tree

1 file changed (+11 −13 lines)


README.md

Lines changed: 11 additions & 13 deletions
@@ -13,8 +13,7 @@ This project delivers reference infrastructures powered by Intel AI hardware and
 
 The recommended **Infrastructure Cluster** is built with [**Intel® scalable Gaudi® Accelerator**](https://docs.habana.ai/en/latest/Gaudi_Overview/Gaudi_Architecture.html#gaudi-architecture) and standard servers. The [Intel® Xeon® processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/xeon6-product-brief.html) are used in these Gaudi servers as worker nodes and in standard servers as highly available control plane nodes. This infrastructure is designed for **high availability**, **scalability**, and **efficiency** in **Retrieval-Augmented Generation (RAG) and other Large Language Model (LLM) inferencing** workloads.
 
-The [**Gaudi embedded RDMA over Converged Ethernet (RoCE) network**](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Scaling_Guide/Theory_of_Distributed_Training.html#theory-of-distributed-training), along with the [**Three Ply Gaudi RoCE Network topology**](https://docs.habana.ai/en/latest/Management_and_Monitoring/Network_Configuration/Configure_E2E_Test_in_L3.html#generating-a-gaudinet-json-example) supports high-throughput and low latency LLM Parallel Pre-training and Post-training workloads, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). For more details, see: [LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/Fine-tunning-LLM-Models-with-Intel-Enterprise-AI-Foundation-on-OpenShift)
-
+The [**Gaudi embedded RDMA over Converged Ethernet (RoCE) network**](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Scaling_Guide/Theory_of_Distributed_Training.html#theory-of-distributed-training), along with the [**Three Ply Gaudi RoCE Network topology**](https://docs.habana.ai/en/latest/Management_and_Monitoring/Network_Configuration/Configure_E2E_Test_in_L3.html#generating-a-gaudinet-json-example), supports high-throughput, low-latency LLM parallel pre-training and post-training workloads, such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). For more details, see [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift)
 This highly efficient infrastructure has been validated with cutting-edge enterprise AI workloads on the production-ready OpenShift platform, enabling users to easily evaluate and integrate it into their own AI environments.
 
 Additionally, Intel SGX, DSA, and QAT accelerators (available with Xeon processors) are supported to further enhance performance and security for AI workloads.
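The RoCE paragraph above is what enables scale-out training. As a rough illustration of how a data-parallel training step drives that fabric from PyTorch, here is a minimal sketch; it assumes the Intel Gaudi PyTorch bridge (`habana_frameworks`) is installed and that the job launcher (for example, a Kubeflow Training Operator job) sets the usual rank and world-size environment variables. It is a sketch under those assumptions, not a validated recipe from this project.

```python
# Minimal data-parallel step on Gaudi; the gradient all-reduce in backward()
# is what actually traverses the Gaudi-embedded RoCE links.
import torch
import torch.distributed as dist
import habana_frameworks.torch.distributed.hccl  # registers the "hccl" backend

def main():
    dist.init_process_group(backend="hccl")  # collectives over Gaudi RoCE
    device = torch.device("hpu")

    model = torch.nn.Linear(4096, 4096).to(device)
    model = torch.nn.parallel.DistributedDataParallel(model)
    optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

    x = torch.randn(8, 4096, device=device)
    loss = model(x).square().mean()  # stand-in loss for illustration
    loss.backward()                  # gradients all-reduced across ranks
    optimizer.step()

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```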
@@ -41,11 +40,14 @@ Intel and Red Hat have coordinated for years to deliver a production-quality ope
 
 The **Red Hat AI portfolio**, powered by **Intel AI technologies**, now includes:
 * [**Red Hat AI Inference Server**](https://www.redhat.com/en/about/press-releases/red-hat-unlocks-generative-ai-any-model-and-any-accelerator-across-hybrid-cloud-red-hat-ai-inference-server) leverages the [LLM-d](https://github.com/llm-d/llm-d) and [vLLM](https://github.com/vllm-project/vllm) projects, integrating with Llama Stack, the Model Context Protocol (MCP), and the OpenAI API to deliver standardized APIs for developing and deploying [OPEA-based](https://github.com/opea-project) and other production-grade GenAI applications that scale across edge, enterprise, and cloud environments.
-* **Red Hat OpenShift AI Distributed Training** provides pre-training, SFT and RL for major GenAI foundation models at scale. With seamless integration of the Kubeflow Training Operator, Intel Gaudi Computing and RoCE Networking technology, enterprises can unlock the full potential of cutting-edge GenAI technologies to drive innovation in their domains. See [LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/Fine-tunning-LLM-Models-with-Intel-Enterprise-AI-Foundation-on-OpenShift).
+* **Red Hat OpenShift AI Distributed Training** provides pre-training, SFT, and RL for major GenAI foundation models at scale. With seamless integration of the Kubeflow Training Operator, Intel Gaudi computing, and RoCE networking technology, enterprises can unlock the full potential of cutting-edge GenAI technologies to drive innovation in their domains. See [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift).
 * The operators that integrate [Intel Gaudi Software](https://docs.habana.ai/en/latest/index.html) or [oneAPI-based](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.kgdasr) AI software into OpenShift AI
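Because the Red Hat AI Inference Server item in the list above centers on vLLM and OpenAI-compatible APIs, a minimal client sketch may help make "standardized APIs" concrete. The endpoint URL and model name below are placeholders for illustration, not values defined by this project.

```python
# Query an OpenAI-compatible /v1/chat/completions endpoint, as exposed by
# vLLM-based inference servers. The URL and model id are hypothetical.
import requests

BASE_URL = "http://inference-server.example.svc:8000/v1"  # placeholder route

resp = requests.post(
    f"{BASE_URL}/chat/completions",
    json={
        "model": "meta-llama/Llama-3.1-8B-Instruct",  # example model id
        "messages": [{"role": "user", "content": "What is RAG?"}],
        "max_tokens": 128,
    },
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["message"]["content"])
```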
 
-## GenAI appplications and AI service
-GenAI applications and AI servcies are built with [OPEA (Open Platform for Enterprise AI)](https://github.com/opea-project). And the integration of OPEA with LLM-d based Red Hat AI inference server is under investigation.
+## GenAI applications and Inference Microservices
+GenAI applications and inference microservices are built with [OPEA (Open Platform for Enterprise AI)](https://github.com/opea-project). The integration of OPEA with the LLM-d-based Red Hat AI Inference Server is under investigation.
+
+## Enterprise AI Post-training workloads and end-to-end solution
+See [**RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift**](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift)
 
 ## Releases and Supported Platforms
 Intel Enterprise AI Foundation for OpenShift is released in alignment with the OpenShift release cadence. It is recommended to use the latest release.
@@ -55,7 +57,7 @@ For details on supported features and components, refer to the links below:
 - [Component Matrix](/docs/supported_platforms.md#component-matrix)
 
 To review the release history, please visit the following link:
-- [Intel Technology Enabling for OpenShift Release details](/docs/releases.rst)
+- [Intel Enterprise AI Foundation for OpenShift Release details](/docs/releases.rst)
 
 ## Getting started
 See reference [BIOS Configuration](/docs/supported_platforms.md#bios-configuration) required for each feature.
@@ -67,7 +69,7 @@ Use one of these two options to provision an RHOCP cluster:
 
 In this project, we provisioned RHOCP 4.18 on a bare-metal multi-node cluster. For details about the supported RHOCP infrastructure, see the [Supported Platforms](/docs/supported_platforms.md) page.
 
-### Provisioning Intel hardware features on RHOCP
+### Provisioning Intel AI hardware features on RHOCP
 If you are familiar with the steps mentioned below to provision the accelerators, you can use the [One-Click](/one_click/README.md) solution as a reference to provision the accelerators automatically.
 
 Follow [Setting up HabanaAI Operator](/gaudi/README.md) to provision the Intel Gaudi AI accelerator.
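A quick way to check that Gaudi provisioning worked is to schedule a pod that requests the `habana.ai/gaudi` extended resource advertised by the Habana device plugin. The sketch below uses the Kubernetes Python client; the container image is a deliberate placeholder, so substitute the Gaudi runtime image that matches your release.

```python
# Schedule a one-shot pod that claims one Gaudi device and runs hl-smi.
# Requires `pip install kubernetes` and access to the cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() in-cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="gaudi-smoke-test"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="hl-smi",
                image="<gaudi-runtime-image>",  # placeholder, not a real tag
                command=["hl-smi"],  # prints accelerator status if provisioned
                resources=client.V1ResourceRequirements(
                    limits={"habana.ai/gaudi": "1"}
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
print("Created pod gaudi-smoke-test; check its logs for hl-smi output.")
```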
@@ -83,14 +85,10 @@ You can use the instructions in the [link](/tests/l2/README.md) to verify the ha
 
 ## Upgrade (To be added)
 
-## Reference end-to-end solution
-The reference end-to-end solution is based on Intel hardware feature provisioning provided by this project.
-
-[Intel AI Inferencing Solution](/e2e/inference/README.md) with [OpenVINO](https://github.com/openvinotoolkit/openvino) and [RHOAI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-data-science)
-
 ## Reference workloads
 Here are the reference workloads built on the end-to-end solution and Intel hardware feature provisioning in this project.
-- [OPEA Workloads](workloads/opea/chatqna/README.md)
+- [RAG-based OPEA ChatQnA GenAI application](workloads/opea/chatqna/README.md)
+- [vLLM Inference workload](https://github.com/intel/intel-technology-enabling-for-openshift/tree/main/tests/gaudi/l2#vllm)
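For the ChatQnA item in the list above, the linked README carries the deployment steps; as a flavor of how the deployed application is exercised, here is a hedged request sketch against a ChatQnA gateway. The host, port 8888, and `/v1/chatqna` path follow common OPEA ChatQnA examples and should be treated as assumptions; verify them against the linked README.

```python
# Send a question to an OPEA ChatQnA gateway; the RAG pipeline retrieves
# context and returns a grounded answer. Host, port, and path are assumptions.
import requests

resp = requests.post(
    "http://chatqna-gateway.example:8888/v1/chatqna",  # hypothetical host
    json={"messages": "What is the Intel Gaudi accelerator?"},
    timeout=120,
)
resp.raise_for_status()
print(resp.text)
```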
 
 ## Advanced Guide
 This section discusses architecture and other technical details that go beyond getting started.
