Update README.md #221

Merged (7 commits) on May 15, 2025
206 changes: 115 additions & 91 deletions LLM/09_rag_langchain.ipynb
@@ -82,98 +82,122 @@
"metadata": {},
"source": [
"## Setup llama.cpp python for Intel CPUs and GPUs\n",
"The llama.cpp SYCL backend is designed to support Intel GPUs, building on SYCL's cross-platform capabilities.\n",
"\n",
"We will set up a Python environment with a corresponding custom Jupyter Notebook kernel, then install and build llama.cpp for use in the RAG application.\n",
"\n",
"### Step 1: Create and activate Python environment:\n",
"\n",
"Open a terminal, make sure Miniforge is installed, and create a new virtual environment:\n",
"\n",
"```\n",
" conda create -n llm-sycl python=3.11\n",
"\n",
" conda activate llm-sycl\n",
"\n",
"```\n",
"_Note: In case you want to remove the virtual environment, run the following command:_\n",
"```\n",
"  conda remove -n llm-sycl --all\n",
"```\n",
"\n",
"### Step 2: Setup a custom kernel for Jupyter notebook:\n",
"\n",
"Run the following commands in the terminal to set up a custom kernel for the Jupyter Notebook.\n",
"\n",
"```\n",
" conda install -c conda-forge ipykernel\n",
"\n",
" python -m ipykernel install --user --name=llm-sycl\n",
"```\n",
"_Note: In case you want to remove the custom kernel from Jupyter, run the following command:_\n",
"```\n",
"  python -m jupyter kernelspec uninstall llm-sycl\n",
"```\n",
"\n",
"<img src=\"Assets/llm4.png\">\n",
"\n",
"### Step 3: Install and Build llama.cpp\n",
"\n",
"### For Linux\n",
"\n",
"#### 1. Enable oneAPI environment\n",
"\n",
"Make sure the oneAPI Base Toolkit is installed; its SYCL compiler is used to build llama.cpp.\n",
"\n",
"Run the following commands in the terminal to initialize the oneAPI environment and check the available devices:\n",
"\n",
"```\n",
" source /opt/intel/oneapi/setvars.sh\n",
" sycl-ls\n",
"```\n",
"\n",
"#### 2. Install and build llama.cpp Python\n",
"\n",
"Run the following commands in the terminal to install and build llama.cpp:\n",
"\n",
"```\n",
" CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.1\n",
"```\n",
"\n",
"### For Windows\n",
"\n",
"#### 1. Enable oneAPI environment\n",
"\n",
"Make sure the oneAPI Base Toolkit is installed; its SYCL compiler is used to build llama.cpp.\n",
"\n",
"Type \"oneapi\" in the Windows search bar, then open the \"Intel oneAPI command prompt for Intel 64 for Visual Studio 2022\" app.\n",
"\n",
"Run the following commands to initialize oneAPI environment and check available devices:\n",
"\n",
"```\n",
" @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n",
" sycl-ls\n",
"```\n",
"\n",
"#### 2. Install build tools\n",
"\n",
"* Download and install [CMake for Windows](https://cmake.org/download/).\n",
"* Recent Visual Studio installations include Ninja by default. (If not, install it manually: https://ninja-build.org/)\n",
"\n",
"#### 3. Install and build llama.cpp Python\n",
"\n",
"* In the oneAPI command prompt window, change into the llama.cpp main directory and run the following:\n",
" \n",
"```\n",
" set CMAKE_GENERATOR=Ninja\n",
" set CMAKE_C_COMPILER=cl\n",
" set CMAKE_CXX_COMPILER=icx\n",
" set CXX=icx\n",
" set CC=cl\n",
" set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n",
" \n",
" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
"```\n"
"In this notebook, we will set up the Python environment and configure a custom kernel for Jupyter Notebook. Additionally, we will install and build llamacpp-python for Intel GPUs, which will be utilized for the RAG application.\n",
"\n",
"For detailed setup instructions, see the [README_RAG.md](./README_RAG.md) file in the current directory, or follow the instructions below.\n",
"\n",
"## Installing Prerequisites\n",
"### Windows:\n",
"The following software must be installed before setting up the llama-cpp-python SYCL backend:\n",
"1. **GPU Drivers installation**\n",
" - Download and Install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)\n",
" - (Optional) Download and Install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)\n",
"   - If the NPU does not show up as a Neural processor, check the PCI device list and update the driver.\n",
"     Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)\n",
"\n",
" **IMPORTANT:** Reboot the system after the installation\n",
"\n",
"2. **CMake for Windows**\\\n",
"Download and install the latest CMake for Windows from [here](https://cmake.org/download/)\n",
"\n",
"3. **Microsoft Visual Studio 2022 community version**\\\n",
"Download and install VS 2022 community from [here](https://visualstudio.microsoft.com/downloads/)\\\n",
"**IMPORTANT:** Select the \"Desktop development with C++\" workload while installing Visual Studio\n",
"\n",
"4. **Git for Windows**\\\n",
"Download and install Git from [here](https://git-scm.com/downloads/win)\n",
"\n",
"5. **Intel oneAPI Base Toolkit for Windows**\\\n",
"Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)\n",
"\n",
"**Note: It is important to download the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.\n",
"In the installer, open the \"Choose a Version\" dropdown and select 2025.0.1.**\n",
"\n",
"6. **Miniforge for Windows**\\\n",
"Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)\n",
"\n",
"### Linux:\n",
"\n",
"1. **GPU Drivers installation**\\\n",
"Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)\n",
"\n",
"2. **Miniforge for Linux**\\\n",
"Download and install Miniforge using the commands below:\n",
" ```\n",
" wget \"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\"\n",
" bash Miniforge3-$(uname)-$(uname -m).sh\n",
" ```\n",
"    Replace </move/to/miniforge3/bin/folder> with your actual Miniforge bin folder path and change into it with cd. Then initialize the conda environment and restart the terminal.\n",
" ```\n",
" cd </move/to/miniforge3/bin/folder>\n",
" ```\n",
" ``` \n",
" ./conda init \n",
" ```\n",
"\n",
"3. **Intel oneAPI Base Toolkit for Linux**\\\n",
"Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)\n",
"\n",
"4. **CMake and Git for Linux**\\\n",
"Install CMake and Git using the commands below:\n",
" - For Debian/Ubuntu-based systems:\n",
" ```\n",
" sudo apt update && sudo apt -y install cmake git\n",
" ```\n",
" - For RHEL/CentOS-based systems:\n",
" ```\n",
" sudo dnf update && sudo dnf -y install cmake git\n",
" ```\n",
" \n",
"## Setting up environment and LlamaCPP-python GPU backend\n",
"\n",
"Open a new Miniforge terminal and perform the following steps:\n",
"\n",
"1. **Create and activate the conda environment**\n",
" ```\n",
" conda create -n llamacpp python=3.11 -y\n",
" conda activate llamacpp\n",
" ```\n",
"2. **Initialize oneAPI environment**\\\n",
" On Windows:\n",
" ```\n",
" @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n",
" ```\n",
" On Linux:\n",
" ```\n",
" source /opt/intel/oneapi/setvars.sh --force\n",
" ```\n",
"3. **Set the environment variables and install Llamacpp-Python bindings**\\\n",
" On Windows:\n",
" ```\n",
" set CMAKE_GENERATOR=Ninja\n",
" set CMAKE_C_COMPILER=cl\n",
" set CMAKE_CXX_COMPILER=icx\n",
" set CXX=icx\n",
" set CC=cl\n",
" set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n",
" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
" ```\n",
" On Linux:\n",
" ```\n",
" CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
" ```\n",
"4. **Install the required pip packages**\n",
" ```\n",
" pip install -r rag/requirements.txt\n",
" ```\n",
"5. **Install an IPython kernel for the llamacpp environment**\n",
" ```\n",
" python -m ipykernel install --user --name=llamacpp\n",
" ```\n",
"\n",
"\n",
"6. **Launch JupyterLab using the command below**\n",
" ```\n",
" jupyter lab\n",
" ```\n",
"   - Open 09_rag_langchain.ipynb in JupyterLab, select the llamacpp kernel, and run the code cells one by one.\n"
]
},
{
116 changes: 116 additions & 0 deletions LLM/README_RAG.md
@@ -0,0 +1,116 @@
# Building a Retrieval-Augmented Generation (RAG) System on AI PCs

This notebook demonstrates how to run LLM inference for a Retrieval-Augmented Generation (RAG) application locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads.

## Installing Prerequisites
### Windows:
The following software must be installed before setting up the llama-cpp-python SYCL backend:
1. **GPU Drivers installation**
- Download and Install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)
- (Optional) Download and Install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)
    - If the NPU does not show up as a Neural processor, check the PCI device list and update the driver.
      Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)

**IMPORTANT:** Reboot the system after the installation

2. **CMake for Windows**\
Download and install the latest CMake for Windows from [here](https://cmake.org/download/)

3. **Microsoft Visual Studio 2022 community version**\
Download and install VS 2022 community from [here](https://visualstudio.microsoft.com/downloads/)\
**IMPORTANT:** Select the "Desktop development with C++" workload while installing Visual Studio

4. **Git for Windows**\
Download and install Git from [here](https://git-scm.com/downloads/win)

5. **Intel oneAPI Base Toolkit for Windows**\
Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)

**Note: It is important to download the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.
In the installer, open the "Choose a Version" dropdown and select 2025.0.1.**

6. **Miniforge for Windows**\
Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)

### Linux:

1. **GPU Drivers installation**\
Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)

2. **Miniforge for Linux**\
Download and install Miniforge using the commands below:
```
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
```
    Replace </move/to/miniforge3/bin/folder> with your actual Miniforge bin folder path and change into it with cd. Then initialize the conda environment and restart the terminal.
```
cd </move/to/miniforge3/bin/folder>
```
```
./conda init
```

3. **Intel oneAPI Base Toolkit for Linux**\
Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)

4. **CMake and Git for Linux**\
Install CMake and Git using the commands below:
- For Debian/Ubuntu-based systems:
```
sudo apt update && sudo apt -y install cmake git
```
- For RHEL/CentOS-based systems:
```
sudo dnf update && sudo dnf -y install cmake git
```

## Setting up environment and LlamaCPP-python GPU backend

Open a new Miniforge terminal and perform the following steps:

1. **Create and activate the conda environment**
```
conda create -n llamacpp python=3.11 -y
conda activate llamacpp
```
2. **Initialize oneAPI environment**\
On Windows:
```
@call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force
```
On Linux:
```
source /opt/intel/oneapi/setvars.sh --force
```
3. **Set the environment variables and install Llamacpp-Python bindings**\
On Windows:
```
set CMAKE_GENERATOR=Ninja
set CMAKE_C_COMPILER=cl
set CMAKE_CXX_COMPILER=icx
set CXX=icx
set CC=cl
set CMAKE_ARGS="-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl"
pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
```
On Linux:
```
CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
```
4. **Install the required pip packages**
```
pip install -r rag/requirements.txt
```
5. **Install an IPython kernel for the llamacpp environment**
```
python -m ipykernel install --user --name=llamacpp
```


6. **Launch JupyterLab using the command below**
```
jupyter lab
```
   - Open the [Notebook](./09_rag_langchain.ipynb), select the llamacpp kernel, and run the code cells one by one.

3 changes: 1 addition & 2 deletions README.md
@@ -22,8 +22,7 @@ To set up your AIPC for running with Intel iGPUs, follow these essential steps:
4. Install CMake

### Hardware
- Intel® Core™ Ultra Processor - Windows 11

- Intel® Core™ Ultra Processor - Windows 11.

### Software
- Python