diff --git a/LLM/09_rag_langchain.ipynb b/LLM/09_rag_langchain.ipynb index 9e29140..0a9283f 100644 --- a/LLM/09_rag_langchain.ipynb +++ b/LLM/09_rag_langchain.ipynb @@ -82,98 +82,122 @@ "metadata": {}, "source": [ "## Setup llama.cpp python for Intel CPUs and GPUs\n", - "The llama.cpp SYCL backend is designed to support Intel GPU. Based on the cross-platform feature of SYCL.\n", "\n", - "We will setup Python environment and corresponding custom kernel for Jupyter Notebook, and we will install/build llama.cpp that will be used for the RAG Application.\n", - "\n", - "### Step 1: Create and activate Python environment:\n", - "\n", - "Open Terminal, make sure mini-forge is install and create new virtual environment\n", - "\n", - "```\n", - " conda create -n llm-sycl python=3.11\n", - "\n", - " conda activate llm-sycl\n", - "\n", - "```\n", - "_Note: In case you want to remove the virtual environment, run the following command:_\n", - "```\n", - " [conda remove -n llm-sycl --all]\n", - "```\n", - "\n", - "### Step 2: Setup a custom kernel for Jupyter notebook:\n", - "\n", - "Run the following commands in the terminal to setup custom kernel for the Jupyter Notebook.\n", - "\n", - "```\n", - " conda install -c conda-forge ipykernel\n", - "\n", - " python -m ipykernel install --user --name=llm-sycl\n", - "```\n", - "_Note: In case you want to remove the custom kernel from Jupyter, run the following command:_\n", - "```\n", - " [python -m jupyter kernelspec uninstall llm-sycl]\n", - "```\n", - "\n", - "\n", - "\n", - "### Step 3: Install and Build llama.cpp\n", - "\n", - "### For Linux\n", - "\n", - "#### 1. Enable oneAPI environment\n", - "\n", - "Make sure oneAPI Base Toolkit is installed to use the SYCL compiler for building llama.cpp\n", - "\n", - "Run the following commands in terminal to initialize oneAPI environment and check available devices:\n", - "\n", - "```\n", - " source /opt/intel/oneapi/setvars.sh\n", - " sycl-ls\n", - "```\n", - "\n", - "#### 2. 
Install and build llama.cpp Python\n", - "\n", - "Run the following commands in terminal to install and build llama.cpp\n", - "\n", - "```\n", - " CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.1\n", - "```\n", - "\n", - "### For Windows\n", - "\n", - "#### 1. Enable oneAPI environment\n", - "\n", - "Make sure oneAPI Base Toolkit is installed to use the SYCL compiler for building llama.cpp\n", - "\n", - "Type oneapi in the windows search and then open the Intel oneAPI command prompt for Intel 64 for Visual Studio 2022 App.\n", - "\n", - "Run the following commands to initialize oneAPI environment and check available devices:\n", - "\n", - "```\n", - " @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n", - " sycl-ls\n", - "```\n", - "\n", - "#### 2. Install build tools\n", - "\n", - "* Download & install [cmake for Windows](https://cmake.org/download/):\n", - "* The new Visual Studio will install Ninja as default. (If not, please install it manually: https://ninja-build.org/)\n", - "\n", - "#### 3. Install and build llama.cpp Python\n", - "\n", - "* On the oneAPI command line window, step into the llama.cpp main directory and run the following:\n", - " \n", - "```\n", - " set CMAKE_GENERATOR=Ninja\n", - " set CMAKE_C_COMPILER=cl\n", - " set CMAKE_CXX_COMPILER=icx\n", - " set CXX=icx\n", - " set CC=cl\n", - " set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n", - " \n", - " pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n", - "```\n" + "In this notebook, we will set up the Python environment and configure a custom kernel for Jupyter Notebook. 
Additionally, we will install and build llama-cpp-python for Intel GPUs, which will be used for the RAG application.\n",
+    "\n",
+    "For detailed setup instructions, please refer to the [README_RAG.md](./README_RAG.md) file in the current directory, or follow the instructions below.\n",
+    "\n",
+    "## Installing Prerequisites\n",
+    "### Windows:\n",
+    "The following software must be installed before setting up the llama-cpp-python SYCL backend:\n",
+    "1. **GPU Drivers installation**\n",
+    "   - Download and install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)\n",
+    "   - (Optional) Download and install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)\n",
+    "   - For the NPU, if the neural processor does not appear, check the PCI device and update its driver.\n",
+    "     Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)\n",
+    "\n",
+    "   **IMPORTANT:** Reboot the system after the installation.\n",
+    "\n",
+    "2. **CMake for Windows**\\\n",
+    "Download and install the latest CMake for Windows from [here](https://cmake.org/download/)\n",
+    "\n",
+    "3. **Microsoft Visual Studio 2022 Community edition**\\\n",
+    "Download and install Visual Studio 2022 Community from [here](https://visualstudio.microsoft.com/downloads/)\\\n",
+    "**IMPORTANT:** Please select the \"Desktop Development with C++\" workload while installing Visual Studio\n",
+    "\n",
+    "4. **Git for Windows**\\\n",
+    "Download and install Git from [here](https://git-scm.com/downloads/win)\n",
+    "\n",
+    "5. 
**Intel oneAPI Base Toolkit for Windows**\\\n",
+    "Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)\n",
+    "\n",
+    "**Note: It is important to install the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.\n",
+    "When downloading the installer, open the \"Choose a Version\" dropdown and select 2025.0.1.**\n",
+    "\n",
+    "6. **Miniforge for Windows**\\\n",
+    "Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)\n",
+    "\n",
+    "### Linux:\n",
+    "\n",
+    "1. **GPU Drivers installation**\\\n",
+    "Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)\n",
+    "\n",
+    "2. **Miniforge for Linux**\\\n",
+    "Download and install Miniforge using the commands below.\n",
+    "    ```\n",
+    "    wget \"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\"\n",
+    "    bash Miniforge3-$(uname)-$(uname -m).sh\n",
+    "    ```\n",
+    "    Change into your Miniforge bin folder (append your actual Miniforge bin path to the `cd` command below), then initialize conda and restart the terminal.\n",
+    "    ```\n",
+    "    cd \n",
+    "    ```\n",
+    "    ```\n",
+    "    ./conda init\n",
+    "    ```\n",
+    "\n",
+    "3. **Intel oneAPI Base Toolkit for Linux**\\\n",
+    "Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)\n",
+    "\n",
+    "4. 
**CMake and Git for Linux**\\\n",
+    "Install CMake and Git using the commands below:\n",
+    "   - For Debian/Ubuntu-based systems:\n",
+    "   ```\n",
+    "   sudo apt update && sudo apt -y install cmake git\n",
+    "   ```\n",
+    "   - For RHEL/CentOS-based systems:\n",
+    "   ```\n",
+    "   sudo dnf update && sudo dnf -y install cmake git\n",
+    "   ```\n",
+    "\n",
+    "## Setting up the environment and the llama-cpp-python GPU backend\n",
+    "\n",
+    "Open a new Miniforge terminal and perform the following steps:\n",
+    "\n",
+    "1. **Create and activate the conda environment**\n",
+    "   ```\n",
+    "   conda create -n llamacpp python=3.11 -y\n",
+    "   conda activate llamacpp\n",
+    "   ```\n",
+    "2. **Initialize the oneAPI environment**\\\n",
+    "   On Windows:\n",
+    "   ```\n",
+    "   @call \"C:\\\\Program Files (x86)\\\\Intel\\\\oneAPI\\\\setvars.bat\" intel64 --force\n",
+    "   ```\n",
+    "   On Linux:\n",
+    "   ```\n",
+    "   source /opt/intel/oneapi/setvars.sh --force\n",
+    "   ```\n",
+    "3. **Set the environment variables and install the llama-cpp-python bindings**\\\n",
+    "   On Windows:\n",
+    "   ```\n",
+    "   set CMAKE_GENERATOR=Ninja\n",
+    "   set CMAKE_C_COMPILER=cl\n",
+    "   set CMAKE_CXX_COMPILER=icx\n",
+    "   set CXX=icx\n",
+    "   set CC=cl\n",
+    "   set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n",
+    "   pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
+    "   ```\n",
+    "   On Linux:\n",
+    "   ```\n",
+    "   CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
+    "   ```\n",
+    "4. **Install the required pip packages**\n",
+    "   ```\n",
+    "   pip install -r rag/requirements.txt\n",
+    "   ```\n",
+    "5. **Install an ipykernel for the llamacpp environment**\n",
+    "   ```\n",
+    "   python -m ipykernel install --user --name=llamacpp\n",
+    "   ```\n",
+    "\n",
+    "\n",
+    "6. 
**Launch the Jupyter notebook using the command below**\n",
+    "   ```\n",
+    "   jupyter lab\n",
+    "   ```\n",
+    "   - Open 09_rag_langchain.ipynb in Jupyter, select the llamacpp kernel, and run the code cells one by one.\n"
   ]
  },
  {
diff --git a/LLM/README_RAG.md b/LLM/README_RAG.md
new file mode 100644
index 0000000..370ff18
--- /dev/null
+++ b/LLM/README_RAG.md
@@ -0,0 +1,116 @@
+# Building a Retrieval-Augmented Generation (RAG) System on AI PCs
+
+This notebook demonstrates how to run LLM inference for a Retrieval-Augmented Generation (RAG) application locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads.
+
+## Installing Prerequisites
+### Windows:
+The following software must be installed before setting up the llama-cpp-python SYCL backend:
+1. **GPU Drivers installation**
+   - Download and install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)
+   - (Optional) Download and install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)
+   - For the NPU, if the neural processor does not appear, check the PCI device and update its driver.
+     Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)
+
+   **IMPORTANT:** Reboot the system after the installation.
+
+2. **CMake for Windows**\
+Download and install the latest CMake for Windows from [here](https://cmake.org/download/)
+
+3. **Microsoft Visual Studio 2022 Community edition**\
+Download and install Visual Studio 2022 Community from [here](https://visualstudio.microsoft.com/downloads/)\
+**IMPORTANT:** Please select the "Desktop Development with C++" workload while installing Visual Studio
+
+4. 
**Git for Windows**\
+Download and install Git from [here](https://git-scm.com/downloads/win)
+
+5. **Intel oneAPI Base Toolkit for Windows**\
+Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)
+
+**Note: It is important to install the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.
+When downloading the installer, open the "Choose a Version" dropdown and select 2025.0.1.**
+
+6. **Miniforge for Windows**\
+Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)
+
+### Linux:
+
+1. **GPU Drivers installation**\
+Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)
+
+2. **Miniforge for Linux**\
+Download and install Miniforge using the commands below.
+    ```
+    wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
+    bash Miniforge3-$(uname)-$(uname -m).sh
+    ```
+    Change into your Miniforge bin folder (append your actual Miniforge bin path to the `cd` command below), then initialize conda and restart the terminal.
+    ```
+    cd 
+    ```
+    ```
+    ./conda init
+    ```
+
+3. **Intel oneAPI Base Toolkit for Linux**\
+Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)
+
+4. 
**CMake and Git for Linux**\
+Install CMake and Git using the commands below:
+   - For Debian/Ubuntu-based systems:
+   ```
+   sudo apt update && sudo apt -y install cmake git
+   ```
+   - For RHEL/CentOS-based systems:
+   ```
+   sudo dnf update && sudo dnf -y install cmake git
+   ```
+
+## Setting up the environment and the llama-cpp-python GPU backend
+
+Open a new Miniforge terminal and perform the following steps:
+
+1. **Create and activate the conda environment**
+   ```
+   conda create -n llamacpp python=3.11 -y
+   conda activate llamacpp
+   ```
+2. **Initialize the oneAPI environment**\
+   On Windows:
+   ```
+   @call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force
+   ```
+   On Linux:
+   ```
+   source /opt/intel/oneapi/setvars.sh --force
+   ```
+3. **Set the environment variables and install the llama-cpp-python bindings**\
+   On Windows:
+   ```
+   set CMAKE_GENERATOR=Ninja
+   set CMAKE_C_COMPILER=cl
+   set CMAKE_CXX_COMPILER=icx
+   set CXX=icx
+   set CC=cl
+   set CMAKE_ARGS="-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl"
+   pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
+   ```
+   On Linux:
+   ```
+   CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
+   ```
+4. **Install the required pip packages**
+   ```
+   pip install -r rag/requirements.txt
+   ```
+5. **Install an ipykernel for the llamacpp environment**
+   ```
+   python -m ipykernel install --user --name=llamacpp
+   ```
+
+
+6. **Launch the Jupyter notebook using the command below**
+   ```
+   jupyter lab
+   ```
+   - Open the [Notebook](./09_rag_langchain.ipynb), select the llamacpp kernel, and run the code cells one by one.
+
diff --git a/README.md b/README.md
index 8610d72..82a4eb3 100644
--- a/README.md
+++ b/README.md
@@ -22,8 +22,7 @@ To set up your AIPC for running with Intel iGPUs, follow these essential steps:
 4. 
Install CMake

### Hardware
-- Intel® Core™ Ultra Processor - Windows 11
-
+- Intel® Core™ Ultra Processor - Windows 11
### Software
- Python
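After building llama-cpp-python, it is worth confirming that the SYCL runtime can actually see the GPU before running the notebook; the earlier revision of this guide ran `sycl-ls` after sourcing the oneAPI environment for exactly this check. The helper below is a hypothetical sketch that wraps that check, assuming `sycl-ls` prints one device per line with a bracketed `[backend:gpu:index]` prefix (e.g. `[level_zero:gpu:0] ...`); the function name and output format are illustrative, not part of any official tool.

```python
import re
import subprocess


def sycl_gpu_devices(sycl_ls_output: str) -> list[str]:
    """Return the lines of `sycl-ls` output that describe GPU devices.

    Assumes the bracketed "[backend:gpu:index]" prefix that sycl-ls
    prints per device, e.g. "[level_zero:gpu:0] Intel(R) Arc(TM) ...".
    """
    return [
        line.strip()
        for line in sycl_ls_output.splitlines()
        if re.match(r"\[\w+:gpu:\d+\]", line.strip())
    ]


if __name__ == "__main__":
    try:
        # sycl-ls is on PATH once the oneAPI setvars script has been run.
        out = subprocess.run(["sycl-ls"], capture_output=True, text=True).stdout
        gpus = sycl_gpu_devices(out)
        if gpus:
            print("SYCL GPU devices found:")
            for gpu in gpus:
                print(" ", gpu)
        else:
            print("No SYCL GPU devices found.")
    except FileNotFoundError:
        print("sycl-ls not found; initialize the oneAPI environment first.")
```

If no GPU line appears, re-check the driver installation and make sure the oneAPI environment was initialized in the same terminal used to launch Jupyter.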