From 48b1b9645c30406199925df03405beb01d494962 Mon Sep 17 00:00:00 2001 From: praveenkk123 Date: Thu, 15 May 2025 05:42:37 -0700 Subject: [PATCH 1/7] Update README.md --- README.md | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/README.md b/README.md index 8610d72..82a4eb3 100644 --- a/README.md +++ b/README.md @@ -22,8 +22,7 @@ To set up your AIPC for running with Intel iGPUs, follow these essential steps: 4. Install CMake ### Hardware -- Intel® Core™ Ultra Processor - Windows 11 - +- Intel® Core™ Ultra Processor - Windows 11. ### Software - Python From 1e8482acb3895d1d893a12a78f2235e9d6478fa3 Mon Sep 17 00:00:00 2001 From: Praveen Date: Thu, 15 May 2025 06:15:10 -0700 Subject: [PATCH 2/7] Updated Readme --- LLM/09_rag_langchain.ipynb | 7 ++- LLM/README_RAG.md | 115 +++++++++++++++++++++++++++++++++++++ 2 files changed, 120 insertions(+), 2 deletions(-) create mode 100644 LLM/README_RAG.md diff --git a/LLM/09_rag_langchain.ipynb b/LLM/09_rag_langchain.ipynb index 9e29140..346ecf1 100644 --- a/LLM/09_rag_langchain.ipynb +++ b/LLM/09_rag_langchain.ipynb @@ -82,9 +82,12 @@ "metadata": {}, "source": [ "## Setup llama.cpp python for Intel CPUs and GPUs\n", - "The llama.cpp SYCL backend is designed to support Intel GPU. Based on the cross-platform feature of SYCL.\n", "\n", - "We will setup Python environment and corresponding custom kernel for Jupyter Notebook, and we will install/build llama.cpp that will be used for the RAG Application.\n", + "In this notebook, we will set up the Python environment and configure a custom kernel for Jupyter Notebook. Additionally, we will install and build llamacpp-python for Intel GPUs, which will be utilized for the RAG application.\n", + "\n", + "For detailed setup instructions, please follow the [README.md](./README_RAG.md) file in the current directory.\n", + "\n", + "The steps below are provided for reference only. 
If you have already followed the instructions in the README and set up the environment correctly, you can skip these steps and proceed directly to the RAG section.\n", "\n", "### Step 1: Create and activate Python environment:\n", "\n", diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md new file mode 100644 index 0000000..dfa9198 --- /dev/null +++ b/LLM/README_RAG.md @@ -0,0 +1,115 @@ +# Building a Retrieval-Augmented Generation (RAG) System on AI PCs + +This notebook demonstrates how to run LLM inference for a Retrieval-Augmented Generation (RAG) application locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. + +## Installing Prerequisites +### Windows: +The following software are to be installed prior to the setting up of Llamacpp-python SYCL backend +1. **GPU Drivers installation** + - Download and Install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html) + - (Optional) Download and Install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html) + - For NPU, if the Neural processor is not available, Check the PCI device to update the driver. + Follow this document [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf) + + **IMPORTANT:** Reboot the system after the installation + +2. **CMake for windows**\ +Download and install the latest CMake for Windows from [here](https://cmake.org/download/) + +3. **Microsoft Visual Studio 2022 community version**\ +Download and install VS 2022 community from [here](https://visualstudio.microsoft.com/downloads/)\ +**IMPORTANT:** Please select "Desktop Development with C++" option while installing Visual studio + +4. 
**Git for Windows**\ +Download and install Git from [here](https://git-scm.com/downloads/win) + +5. **Intel oneAPI Base Toolkit for Windows**\ +Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline) +**Note: Its important you need to download the 2025.0.1 version(older)of the oneAPI Basekit as llamacpp python is not yet compatible with latest oneAPI Basekit. +Select the "Choose a Version" dropdown and select 2025.01 + +7. **Miniconda for Windows**\ +Download and install Miniconda from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe) + +### Linux: + +1. **GPU Drivers installation**\ +Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html) + +2. **Miniconda for Linux**\ +Download, install the Miniconda using the below commands. + ``` + wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh" + bash Miniforge3-$(uname)-$(uname -m).sh + ``` + Replace with your actual Miniforge bin folder path and run the cd command to go there. Initialize the conda environment and restart the terminal. + ``` + cd + ``` + ``` + ./conda init + ``` + +3. **Intel oneAPI Base Toolkit for Linux**\ +Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline) + +4. 
**CMake and Git for Linux**\ +Install the CMake using below commands: + - For Debian/Ubuntu-based systems: + ``` + sudo apt update && sudo apt -y install cmake git + ``` + - For RHEL/CentOS-based systems: + ``` + sudo dnf update && sudo dnf -y install cmake git + ``` + +## Setting up environment and LlamaCPP-python GPU backend + +Open a new Mini-forge terminal and perform the following steps: + +1. **Create and activate the conda environment** + ``` + conda create -n llamacpp python=3.11 -y + conda activate llamacpp + ``` +2. **Initialize oneAPI environment**\ + On Windows: + ``` + @call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force + ``` + On Linux: + ``` + source /opt/intel/oneapi/setvars.sh --force + ``` +3. **Set the environment variables and install Llamacpp-Python bindings**\ + On Windows: + ``` + set CMAKE_GENERATOR=Ninja + set CMAKE_C_COMPILER=cl + set CMAKE_CXX_COMPILER=icx + set CXX=icx + set CC=cl + set CMAKE_ARGS="-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl" + pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose + ``` + On Linux: + ``` + CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose + ``` +4. **Install the required pip packages** + ``` + pip install -r rag/requirements.txt + ``` +5. **Install a ipykernel to select the gpu_llmsycl environment** + ``` + python -m ipykernel install --user --name=gpu_llmsycl + ``` + + +6. **Launch the Jupyter notebook using the below command** + ``` + jupyter lab + ``` + - Open the [LLM](./09_rag_langchain.ipynb) in the jupyter notebook, select the llamacpp kernel and run the code cells one by one in the notebook. 
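The `CMAKE_ARGS` variables in step 3 are what switch the llama-cpp-python build over to the SYCL backend. Before kicking off the (slow) pip build it can help to stage and sanity-check the flags first — a minimal sketch, assuming a Linux shell with the oneAPI environment already sourced; the `echo | grep` check is our own addition, not part of any tool:

```shell
# Sketch (assumption: Linux, oneAPI already initialized via setvars.sh).
# Stage the SYCL build flags from step 3:
export CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx"

# Sanity-check the flags before starting the build:
echo "$CMAKE_ARGS" | grep -q "GGML_SYCL=on" && echo "SYCL flags staged"

# Then build (commented out here; this downloads and compiles llama.cpp):
# pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
```

If `sycl-ls` does not list an Intel GPU (a `level_zero` device) at this point, the oneAPI environment is likely not active and the resulting build may not see the GPU.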
From 495f39752de3f74de271e3557b5c5bedf5b01ab4 Mon Sep 17 00:00:00 2001
From: praveenkk123
Date: Thu, 15 May 2025 06:17:45 -0700
Subject: [PATCH 3/7] Update README_RAG.md

---
 LLM/README_RAG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md
index dfa9198..d278250 100644
--- a/LLM/README_RAG.md
+++ b/LLM/README_RAG.md
@@ -103,7 +103,7 @@ Open a new Mini-forge terminal and perform the following steps:
    ```
 5. **Install a ipykernel to select the gpu_llmsycl environment**
    ```
-   python -m ipykernel install --user --name=gpu_llmsycl
+   python -m ipykernel install --user --name=llamacpp
    ```

From 6c45031a22052e739a07ed4c8ab11b6599dd4621 Mon Sep 17 00:00:00 2001
From: praveenkk123
Date: Thu, 15 May 2025 06:22:43 -0700
Subject: [PATCH 4/7] Update README_RAG.md

---
 LLM/README_RAG.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md
index d278250..fd9cc89 100644
--- a/LLM/README_RAG.md
+++ b/LLM/README_RAG.md
@@ -25,8 +25,9 @@ Download and install Git from [here](https://git-scm.com/downloads/win)

 5. **Intel oneAPI Base Toolkit for Windows**\
 Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)
+
 **Note: Its important you need to download the 2025.0.1 version(older)of the oneAPI Basekit as llamacpp python is not yet compatible with latest oneAPI Basekit.
-Select the "Choose a Version" dropdown and select 2025.01
+When downloading the installer please select the "Choose a Version" dropdown and select 2025.0.1**

 7.
**Miniconda for Windows**\ Download and install Miniconda from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe) From 9dc37c6c8c03dba960e4488717d58dcb6a287775 Mon Sep 17 00:00:00 2001 From: Praveen Date: Thu, 15 May 2025 08:00:11 -0700 Subject: [PATCH 5/7] Updated Readme --- LLM/09_rag_langchain.ipynb | 205 ++++++++++++++++++++----------------- LLM/README_RAG.md | 2 +- 2 files changed, 114 insertions(+), 93 deletions(-) diff --git a/LLM/09_rag_langchain.ipynb b/LLM/09_rag_langchain.ipynb index 346ecf1..859d220 100644 --- a/LLM/09_rag_langchain.ipynb +++ b/LLM/09_rag_langchain.ipynb @@ -85,98 +85,119 @@ "\n", "In this notebook, we will set up the Python environment and configure a custom kernel for Jupyter Notebook. Additionally, we will install and build llamacpp-python for Intel GPUs, which will be utilized for the RAG application.\n", "\n", - "For detailed setup instructions, please follow the [README.md](./README_RAG.md) file in the current directory.\n", - "\n", - "The steps below are provided for reference only. 
If you have already followed the instructions in the README and set up the environment correctly, you can skip these steps and proceed directly to the RAG section.\n", - "\n", - "### Step 1: Create and activate Python environment:\n", - "\n", - "Open Terminal, make sure mini-forge is install and create new virtual environment\n", - "\n", - "```\n", - " conda create -n llm-sycl python=3.11\n", - "\n", - " conda activate llm-sycl\n", - "\n", - "```\n", - "_Note: In case you want to remove the virtual environment, run the following command:_\n", - "```\n", - " [conda remove -n llm-sycl --all]\n", - "```\n", - "\n", - "### Step 2: Setup a custom kernel for Jupyter notebook:\n", - "\n", - "Run the following commands in the terminal to setup custom kernel for the Jupyter Notebook.\n", - "\n", - "```\n", - " conda install -c conda-forge ipykernel\n", - "\n", - " python -m ipykernel install --user --name=llm-sycl\n", - "```\n", - "_Note: In case you want to remove the custom kernel from Jupyter, run the following command:_\n", - "```\n", - " [python -m jupyter kernelspec uninstall llm-sycl]\n", - "```\n", - "\n", - "\n", - "\n", - "### Step 3: Install and Build llama.cpp\n", - "\n", - "### For Linux\n", - "\n", - "#### 1. Enable oneAPI environment\n", - "\n", - "Make sure oneAPI Base Toolkit is installed to use the SYCL compiler for building llama.cpp\n", - "\n", - "Run the following commands in terminal to initialize oneAPI environment and check available devices:\n", - "\n", - "```\n", - " source /opt/intel/oneapi/setvars.sh\n", - " sycl-ls\n", - "```\n", - "\n", - "#### 2. Install and build llama.cpp Python\n", - "\n", - "Run the following commands in terminal to install and build llama.cpp\n", - "\n", - "```\n", - " CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.1\n", - "```\n", - "\n", - "### For Windows\n", - "\n", - "#### 1. 
Enable oneAPI environment\n", - "\n", - "Make sure oneAPI Base Toolkit is installed to use the SYCL compiler for building llama.cpp\n", - "\n", - "Type oneapi in the windows search and then open the Intel oneAPI command prompt for Intel 64 for Visual Studio 2022 App.\n", - "\n", - "Run the following commands to initialize oneAPI environment and check available devices:\n", - "\n", - "```\n", - " @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n", - " sycl-ls\n", - "```\n", - "\n", - "#### 2. Install build tools\n", - "\n", - "* Download & install [cmake for Windows](https://cmake.org/download/):\n", - "* The new Visual Studio will install Ninja as default. (If not, please install it manually: https://ninja-build.org/)\n", - "\n", - "#### 3. Install and build llama.cpp Python\n", - "\n", - "* On the oneAPI command line window, step into the llama.cpp main directory and run the following:\n", - " \n", - "```\n", - " set CMAKE_GENERATOR=Ninja\n", - " set CMAKE_C_COMPILER=cl\n", - " set CMAKE_CXX_COMPILER=icx\n", - " set CXX=icx\n", - " set CC=cl\n", - " set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n", - " \n", - " pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n", - "```\n" + "For detailed setup instructions, please follow the [View README](./README_RAG.md) file in the current directory OR follow the instructions below.\n", + "\n", + "## Installing Prerequisites\n", + "### Windows:\n", + "The following software are to be installed prior to the setting up of Llamacpp-python SYCL backend\n", + "1. 
**GPU Drivers installation**\n",
" - Download and install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)\n",
" - (Optional) Download and install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)\n",
" - If the NPU is not listed as a neural processor, check the PCI device entry and update the driver.\n",
" Follow this document [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)\n",
"\n",
" **IMPORTANT:** Reboot the system after the installation.\n",
"\n",
"2. **CMake for Windows**\\\n",
"Download and install the latest CMake for Windows from [here](https://cmake.org/download/)\n",
"\n",
"3. **Microsoft Visual Studio 2022 Community edition**\\\n",
"Download and install VS 2022 Community from [here](https://visualstudio.microsoft.com/downloads/)\\\n",
"**IMPORTANT:** Please select the \"Desktop Development with C++\" option while installing Visual Studio\n",
"\n",
"4. **Git for Windows**\\\n",
"Download and install Git from [here](https://git-scm.com/downloads/win)\n",
"\n",
"5. **Intel oneAPI Base Toolkit for Windows**\\\n",
"Download and install the Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)\n",
"\n",
"**Note: It is important to download the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.\n",
"When downloading the installer, open the \"Choose a Version\" dropdown and select 2025.0.1**\n",
"\n",
"6. 
**Miniconda for Windows**\\\n", + "Download and install Miniconda from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)\n", + "\n", + "### Linux:\n", + "\n", + "1. **GPU Drivers installation**\\\n", + "Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)\n", + "\n", + "2. **Miniconda for Linux**\\\n", + "Download, install the Miniconda using the below commands. \n", + " ```\n", + " wget \"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\"\n", + " bash Miniforge3-$(uname)-$(uname -m).sh\n", + " ```\n", + " Replace with your actual Miniforge bin folder path and run the cd command to go there. Initialize the conda environment and restart the terminal.\n", + " ```\n", + " cd \n", + " ```\n", + " ``` \n", + " ./conda init \n", + " ```\n", + "\n", + "3. **Intel oneAPI Base Toolkit for Linux**\\\n", + "Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)\n", + "\n", + "4. **CMake and Git for Linux**\\\n", + "Install the CMake using below commands:\n", + " - For Debian/Ubuntu-based systems:\n", + " ```\n", + " sudo apt update && sudo apt -y install cmake git\n", + " ```\n", + " - For RHEL/CentOS-based systems:\n", + " ```\n", + " sudo dnf update && sudo dnf -y install cmake git\n", + " ```\n", + " \n", + "## Setting up environment and LlamaCPP-python GPU backend\n", + "\n", + "Open a new Mini-forge terminal and perform the following steps:\n", + "\n", + "1. **Create and activate the conda environment**\n", + " ```\n", + " conda create -n llamacpp python=3.11 -y\n", + " conda activate llamacpp\n", + " ```\n", + "2. 
**Initialize oneAPI environment**\\\n", + " On Windows:\n", + " ```\n", + " @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n", + " ```\n", + " On Linux:\n", + " ```\n", + " source /opt/intel/oneapi/setvars.sh --force\n", + " ```\n", + "3. **Set the environment variables and install Llamacpp-Python bindings**\\\n", + " On Windows:\n", + " ```\n", + " set CMAKE_GENERATOR=Ninja\n", + " set CMAKE_C_COMPILER=cl\n", + " set CMAKE_CXX_COMPILER=icx\n", + " set CXX=icx\n", + " set CC=cl\n", + " set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n", + " pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n", + " ```\n", + " On Linux:\n", + " ```\n", + " CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n", + " ```\n", + "4. **Install the required pip packages**\n", + " ```\n", + " pip install -r rag/requirements.txt\n", + " ```\n", + "5. **Install a ipykernel to select the llamacpp environment**\n", + " ```\n", + " python -m ipykernel install --user --name=llamacpp\n", + " ```\n", + "\n", + "\n", + "6. **Launch the Jupyter notebook using the below command**\n", + " ```\n", + " jupyter lab\n", + " ```\n", + " - Open the [LLM](./09_rag_langchain.ipynb) in the jupyter notebook, select the llamacpp kernel and run the code cells one by one in the notebook.\n" ] }, { diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md index fd9cc89..f7dfe47 100644 --- a/LLM/README_RAG.md +++ b/LLM/README_RAG.md @@ -102,7 +102,7 @@ Open a new Mini-forge terminal and perform the following steps: ``` pip install -r rag/requirements.txt ``` -5. **Install a ipykernel to select the gpu_llmsycl environment** +5. 
**Install an ipykernel to select the llamacpp environment**
    ```
    python -m ipykernel install --user --name=llamacpp
    ```

From 7b08c8155992edf8af0aac907c0bb80a4214e472 Mon Sep 17 00:00:00 2001
From: praveenkk123
Date: Thu, 15 May 2025 08:01:58 -0700
Subject: [PATCH 6/7] Update README_RAG.md

---
 LLM/README_RAG.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md
index f7dfe47..370ff18 100644
--- a/LLM/README_RAG.md
+++ b/LLM/README_RAG.md
@@ -112,5 +112,5 @@ Open a new Mini-forge terminal and perform the following steps:
    ```
    jupyter lab
    ```
-   - Open the [LLM](./09_rag_langchain.ipynb) in the jupyter notebook, select the llamacpp kernel and run the code cells one by one in the notebook.
+   - Open the [Notebook](./09_rag_langchain.ipynb), select the llamacpp kernel and run the code cells one by one in the notebook.

From 2ab77856bdce0c8c4a792cdff1255e0ed1b71cca Mon Sep 17 00:00:00 2001
From: praveenkk123
Date: Thu, 15 May 2025 08:04:50 -0700
Subject: [PATCH 7/7] Update 09_rag_langchain.ipynb

---
 LLM/09_rag_langchain.ipynb | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/LLM/09_rag_langchain.ipynb b/LLM/09_rag_langchain.ipynb
index 859d220..0a9283f 100644
--- a/LLM/09_rag_langchain.ipynb
+++ b/LLM/09_rag_langchain.ipynb
@@ -197,7 +197,7 @@
     " ```\n",
     " jupyter lab\n",
     " ```\n",
-    " - Open the [LLM](./09_rag_langchain.ipynb) in the jupyter notebook, select the llamacpp kernel and run the code cells one by one in the notebook.\n"
+    " - Open 09_rag_langchain.ipynb in Jupyter, select the llamacpp kernel and run the code cells one by one in the notebook.\n"
  ]
 },
{