The recommended **Infrastructure Cluster** is built with [**Intel® Gaudi® AI Accelerator**](https://docs.habana.ai/en/latest/Gaudi_Overview/Gaudi_Architecture.html#gaudi-architecture) servers and standard servers. [Intel® Xeon® processors](https://www.intel.com/content/www/us/en/products/details/processors/xeon/xeon6-product-brief.html) power both the Gaudi servers, used as worker nodes, and the standard servers, used as highly available control plane nodes. This infrastructure is designed for **high availability**, **scalability**, and **efficiency** in **Retrieval-Augmented Generation (RAG)** and other **Large Language Model (LLM) inferencing** workloads.
The [**Gaudi embedded RDMA over Converged Ethernet (RoCE) network**](https://docs.habana.ai/en/latest/PyTorch/PyTorch_Scaling_Guide/Theory_of_Distributed_Training.html#theory-of-distributed-training), together with the [**three-ply Gaudi RoCE network topology**](https://docs.habana.ai/en/latest/Management_and_Monitoring/Network_Configuration/Configure_E2E_Test_in_L3.html#generating-a-gaudinet-json-example), supports high-throughput, low-latency parallel LLM pre-training and post-training workloads such as Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL). For more details, see [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift).
This highly efficient infrastructure has been validated with cutting-edge enterprise AI workloads on the production-ready OpenShift platform, enabling users to easily evaluate and integrate it into their own AI environments.
Additionally, Intel SGX, DSA, and QAT accelerators (available with Xeon processors) are supported to further enhance performance and security for AI workloads.
The **Red Hat AI portfolio**, powered by **Intel AI technologies**, now includes:
* [**Red Hat AI Inference Server**](https://www.redhat.com/en/about/press-releases/red-hat-unlocks-generative-ai-any-model-and-any-accelerator-across-hybrid-cloud-red-hat-ai-inference-server) leverages the [LLM-d](https://github.com/llm-d/llm-d) and [vLLM](https://github.com/vllm-project/vllm) projects, integrating with Llama Stack, the Model Context Protocol (MCP), and the OpenAI API to deliver standardized APIs for developing and deploying [OPEA-based](https://github.com/opea-project) and other production-grade GenAI applications that scale across edge, enterprise, and cloud environments (see the inference sketch after this list).
* **Red Hat OpenShift AI Distributed Training** provides pre-training, SFT, and RL for major GenAI foundation models at scale. With seamless integration of the Kubeflow Training Operator, Intel Gaudi compute, and RoCE networking technology, enterprises can unlock the full potential of cutting-edge GenAI technologies to drive innovation in their domains. For details, see [RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift); a short training sketch also follows this list.
* Operators that integrate [Intel Gaudi Software](https://docs.habana.ai/en/latest/index.html) or [oneAPI-based](https://www.intel.com/content/www/us/en/developer/tools/oneapi/overview.html#gs.kgdasr) AI software into OpenShift AI.
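
As a concrete illustration of the standardized APIs mentioned in the inference bullet above, here is a minimal sketch of querying an OpenAI-compatible endpoint of the kind vLLM-based servers expose. The route URL, API key, and model name are placeholders, not values defined by this project; substitute the ones from your own deployment.

```python
# Minimal sketch: chat completion against an OpenAI-compatible endpoint
# (vLLM-style). The base_url and model id below are hypothetical;
# substitute the route and model served by your own cluster.
from openai import OpenAI

client = OpenAI(
    base_url="http://inference.example.com/v1",  # hypothetical OpenShift route
    api_key="EMPTY",  # vLLM-style servers accept a dummy key unless auth is configured
)

response = client.chat.completions.create(
    model="meta-llama/Llama-3.1-8B-Instruct",  # placeholder model name
    messages=[{"role": "user", "content": "What is RAG, in one sentence?"}],
)
print(response.choices[0].message.content)
```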
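
For the distributed-training bullet, a hedged sketch of how a fine-tuning job could be submitted through the Kubeflow Training SDK (the Python client for the Kubeflow Training Operator) follows. The job name, worker count, and training function are illustrative, and the SDK surface varies by release, so treat this as a sketch rather than this project's prescribed method.

```python
# Hedged sketch: submitting a distributed fine-tuning job via the
# Kubeflow Training SDK (pip install kubeflow-training). Names and
# argument values below are illustrative only; check your SDK release.
from kubeflow.training import TrainingClient


def train_func():
    # Placeholder: real SFT/RL code (e.g., PyTorch with the Gaudi/HPU
    # backend) would run here on every worker.
    print("training step")


# Creates a PyTorchJob-style distributed job; per-worker accelerator
# requests (e.g., the habana.ai/gaudi extended resource) would be set
# through the SDK's resource options for your deployment.
TrainingClient().create_job(
    name="sft-example",   # hypothetical job name
    train_func=train_func,
    num_workers=8,        # e.g., one worker per Gaudi node
)
```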
## GenAI applications and Inference Microservices
GenAI applications and inference microservices are built with [OPEA (Open Platform for Enterprise AI)](https://github.com/opea-project). Integration of OPEA with the LLM-d-based Red Hat AI Inference Server is under investigation; a hypothetical sketch of calling such a microservice follows.
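
For orientation only, the sketch below shows a request to an OPEA-style ChatQnA gateway. The host, route path, and payload shape here are assumptions rather than the documented OPEA API; consult the OPEA project repositories for the actual microservice contracts.

```python
# Hypothetical sketch of calling an OPEA-style GenAI microservice.
# The URL and JSON schema are assumptions, not the documented OPEA
# API; check the opea-project repos for the real contract.
import requests

resp = requests.post(
    "http://chatqna.example.com/v1/chatqna",    # assumed gateway route
    json={"messages": "What is Intel Gaudi?"},  # assumed payload shape
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```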
## Enterprise AI Post-training workloads and end-to-end solution
See [**RoCE Networking‐Based LLM Post‐Training Solution with Intel Enterprise AI Foundation for OpenShift**](https://github.com/intel/intel-technology-enabling-for-openshift/wiki/RoCE-Networking%E2%80%90Based-LLM-Post%E2%80%90Training-Solution-with-Intel-Enterprise-AI-Foundation-for-OpenShift).
## Releases and Supported Platforms
Intel Enterprise AI Foundation for OpenShift is released in alignment with the OpenShift release cadence. Using the latest release is recommended.
For details on supported features and components, refer to the links below:
To review the release history, please visit the following link:
- [Intel Enterprise AI Foundation for OpenShift Release details](/docs/releases.rst)
## Getting started
See the reference [BIOS Configuration](/docs/supported_platforms.md#bios-configuration) required for each feature.
Use one of the two supported options to provision an RHOCP cluster.
In this project, we provisioned RHOCP 4.18 on a bare-metal multi-node cluster. For details about the supported RHOCP infrastructure, see the [Supported Platforms](/docs/supported_platforms.md) page.
### Provisioning Intel AI hardware features on RHOCP
If you are already familiar with the provisioning steps below, you can use the [One-Click](/one_click/README.md) solution as a reference to provision the accelerators automatically.
Follow [Setting up HabanaAI Operator](/gaudi/README.md) to provision the Intel Gaudi AI accelerator.
You can use the instructions in the [link](/tests/l2/README.md) to verify the hardware feature provisioning.
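
One quick, hedged spot check (complementing, not replacing, the L2 tests linked above) is to confirm with the Kubernetes Python client that provisioned nodes advertise the Gaudi extended resource; `habana.ai/gaudi` is the resource name registered by the Habana device plugin. The kubeconfig setup is assumed.

```python
# Hedged sketch: list nodes and show how many habana.ai/gaudi devices
# each advertises as allocatable. Assumes a working kubeconfig with
# access to the cluster.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a pod
for node in client.CoreV1Api().list_node().items:
    gaudi = (node.status.allocatable or {}).get("habana.ai/gaudi", "0")
    print(f"{node.metadata.name}: habana.ai/gaudi={gaudi}")
```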
## Upgrade (To be added)
## Reference workloads
Here are the reference workloads built on the end-to-end solution and Intel hardware feature provisioning in this project.