Update README.md #221

Merged (7 commits) on May 15, 2025
206 changes: 115 additions & 91 deletions LLM/09_rag_langchain.ipynb
@@ -82,98 +82,122 @@
"metadata": {},
"source": [
"## Setup llama.cpp python for Intel CPUs and GPUs\n",
"The llama.cpp SYCL backend is designed to support Intel GPUs, building on SYCL's cross-platform capabilities.\n",
"\n",
"We will set up a Python environment with a corresponding custom Jupyter Notebook kernel, then install and build llama.cpp for use in the RAG application.\n",
"\n",
"### Step 1: Create and activate Python environment:\n",
"\n",
"Open a terminal, make sure Miniforge is installed, and create a new virtual environment:\n",
"\n",
"```\n",
" conda create -n llm-sycl python=3.11\n",
"\n",
" conda activate llm-sycl\n",
"\n",
"```\n",
"_Note: In case you want to remove the virtual environment, run the following command:_\n",
"```\n",
"  conda remove -n llm-sycl --all\n",
"```\n",
"\n",
"### Step 2: Setup a custom kernel for Jupyter notebook:\n",
"\n",
"Run the following commands in the terminal to set up a custom kernel for the Jupyter Notebook.\n",
"\n",
"```\n",
" conda install -c conda-forge ipykernel\n",
"\n",
" python -m ipykernel install --user --name=llm-sycl\n",
"```\n",
"_Note: In case you want to remove the custom kernel from Jupyter, run the following command:_\n",
"```\n",
"  python -m jupyter kernelspec uninstall llm-sycl\n",
"```\n",
"\n",
"<img src=\"Assets/llm4.png\">\n",
"\n",
"### Step 3: Install and Build llama.cpp\n",
"\n",
"### For Linux\n",
"\n",
"#### 1. Enable oneAPI environment\n",
"\n",
"Make sure the oneAPI Base Toolkit is installed; its SYCL compiler is used to build llama.cpp.\n",
"\n",
"Run the following commands in the terminal to initialize the oneAPI environment and check the available devices:\n",
"\n",
"```\n",
" source /opt/intel/oneapi/setvars.sh\n",
" sycl-ls\n",
"```\n",
"\n",
"#### 2. Install and build llama.cpp Python\n",
"\n",
"Run the following commands in the terminal to install and build llama.cpp:\n",
"\n",
"```\n",
" CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.1\n",
"```\n",
"\n",
"### For Windows\n",
"\n",
"#### 1. Enable oneAPI environment\n",
"\n",
"Make sure the oneAPI Base Toolkit is installed; its SYCL compiler is used to build llama.cpp.\n",
"\n",
"Type \"oneapi\" in the Windows search bar, then open the \"Intel oneAPI command prompt for Intel 64 for Visual Studio 2022\" app.\n",
"\n",
"Run the following commands to initialize oneAPI environment and check available devices:\n",
"\n",
"```\n",
" @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n",
" sycl-ls\n",
"```\n",
"\n",
"#### 2. Install build tools\n",
"\n",
"* Download and install [CMake for Windows](https://cmake.org/download/).\n",
"* Recent Visual Studio installations include Ninja by default. (If not, install it manually: https://ninja-build.org/)\n",
"\n",
"#### 3. Install and build llama.cpp Python\n",
"\n",
"* In the oneAPI command prompt window, change into the llama.cpp main directory and run the following:\n",
" \n",
"```\n",
" set CMAKE_GENERATOR=Ninja\n",
" set CMAKE_C_COMPILER=cl\n",
" set CMAKE_CXX_COMPILER=icx\n",
" set CXX=icx\n",
" set CC=cl\n",
" set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n",
" \n",
" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
"```\n"
"In this notebook, we will set up the Python environment and configure a custom kernel for Jupyter Notebook. Additionally, we will install and build llamacpp-python for Intel GPUs, which will be utilized for the RAG application.\n",
"\n",
"For detailed setup instructions, see the [README_RAG.md](./README_RAG.md) file in the current directory, or follow the instructions below.\n",
"\n",
"## Installing Prerequisites\n",
"### Windows:\n",
"The following software must be installed before setting up the llama-cpp-python SYCL backend:\n",
"1. **GPU Drivers installation**\n",
" - Download and Install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)\n",
" - (Optional) Download and Install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)\n",
"   - If the NPU does not show up as a Neural processor, check the PCI device list and update the driver.\n",
"     Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)\n",
"\n",
" **IMPORTANT:** Reboot the system after the installation\n",
"\n",
"2. **CMake for Windows**\\\n",
"Download and install the latest CMake for Windows from [here](https://cmake.org/download/)\n",
"\n",
"3. **Microsoft Visual Studio 2022 community version**\\\n",
"Download and install VS 2022 community from [here](https://visualstudio.microsoft.com/downloads/)\\\n",
"**IMPORTANT:** Select the \"Desktop development with C++\" workload while installing Visual Studio\n",
"\n",
"4. **Git for Windows**\\\n",
"Download and install Git from [here](https://git-scm.com/downloads/win)\n",
"\n",
"5. **Intel oneAPI Base Toolkit for Windows**\\\n",
"Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)\n",
"\n",
"**Note: It is important to download the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.\n",
"In the installer, open the \"Choose a Version\" dropdown and select 2025.0.1.**\n",
"\n",
"6. **Miniforge for Windows**\\\n",
"Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)\n",
"\n",
"### Linux:\n",
"\n",
"1. **GPU Drivers installation**\\\n",
"Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)\n",
"\n",
"2. **Miniforge for Linux**\\\n",
"Download and install Miniforge using the commands below:\n",
" ```\n",
" wget \"https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh\"\n",
" bash Miniforge3-$(uname)-$(uname -m).sh\n",
" ```\n",
"    Replace </move/to/miniforge3/bin/folder> with your actual Miniforge bin folder path and change into it with cd. Then initialize the conda environment and restart the terminal.\n",
" ```\n",
" cd </move/to/miniforge3/bin/folder>\n",
" ```\n",
" ``` \n",
" ./conda init \n",
" ```\n",
"\n",
"3. **Intel oneAPI Base Toolkit for Linux**\\\n",
"Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)\n",
"\n",
"4. **CMake and Git for Linux**\\\n",
"Install CMake and Git using the commands below:\n",
" - For Debian/Ubuntu-based systems:\n",
" ```\n",
" sudo apt update && sudo apt -y install cmake git\n",
" ```\n",
" - For RHEL/CentOS-based systems:\n",
" ```\n",
" sudo dnf update && sudo dnf -y install cmake git\n",
" ```\n",
" \n",
"## Setting up environment and LlamaCPP-python GPU backend\n",
"\n",
"Open a new Miniforge terminal and perform the following steps:\n",
"\n",
"1. **Create and activate the conda environment**\n",
" ```\n",
" conda create -n llamacpp python=3.11 -y\n",
" conda activate llamacpp\n",
" ```\n",
"2. **Initialize oneAPI environment**\\\n",
" On Windows:\n",
" ```\n",
" @call \"C:\\Program Files (x86)\\Intel\\oneAPI\\setvars.bat\" intel64 --force\n",
" ```\n",
" On Linux:\n",
" ```\n",
" source /opt/intel/oneapi/setvars.sh --force\n",
" ```\n",
"3. **Set the environment variables and install Llamacpp-Python bindings**\\\n",
" On Windows:\n",
" ```\n",
" set CMAKE_GENERATOR=Ninja\n",
" set CMAKE_C_COMPILER=cl\n",
" set CMAKE_CXX_COMPILER=icx\n",
" set CXX=icx\n",
" set CC=cl\n",
" set CMAKE_ARGS=\"-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl\"\n",
" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
" ```\n",
" On Linux:\n",
" ```\n",
" CMAKE_ARGS=\"-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx\" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose\n",
" ```\n",
"4. **Install the required pip packages**\n",
" ```\n",
" pip install -r rag/requirements.txt\n",
" ```\n",
"5. **Install an IPython kernel for the llamacpp environment**\n",
" ```\n",
" python -m ipykernel install --user --name=llamacpp\n",
" ```\n",
"\n",
"\n",
"6. **Launch JupyterLab using the command below**\n",
" ```\n",
" jupyter lab\n",
" ```\n",
"   - Open 09_rag_langchain.ipynb in JupyterLab, select the llamacpp kernel, and run the code cells one by one.\n"
]
},
{
116 changes: 116 additions & 0 deletions LLM/README_RAG.md
@@ -0,0 +1,116 @@
# Building a Retrieval-Augmented Generation (RAG) System on AI PCs

This notebook demonstrates how to run LLM inference for a Retrieval-Augmented Generation (RAG) application locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads.

## Installing Prerequisites
### Windows:
The following software must be installed before setting up the llama-cpp-python SYCL backend:
1. **GPU Drivers installation**
- Download and Install the GPU driver from Intel® Arc™ & Iris® Xe Graphics - Windows* [link](https://www.intel.com/content/www/us/en/download/785597/intel-arc-iris-xe-graphics-windows.html)
- (Optional) Download and Install the NPU driver from [here](https://www.intel.com/content/www/us/en/download/794734/intel-npu-driver-windows.html)
    - If the NPU does not show up as a Neural processor, check the PCI device list and update the driver.
      Follow this document: [NPU_Win_Release_Notes_v2540.pdf](https://downloadmirror.intel.com/825735/NPU_Win_Release_Notes_v2540.pdf)

**IMPORTANT:** Reboot the system after the installation

2. **CMake for Windows**\
Download and install the latest CMake for Windows from [here](https://cmake.org/download/)

3. **Microsoft Visual Studio 2022 community version**\
Download and install VS 2022 community from [here](https://visualstudio.microsoft.com/downloads/)\
**IMPORTANT:** Select the "Desktop development with C++" workload while installing Visual Studio

4. **Git for Windows**\
Download and install Git from [here](https://git-scm.com/downloads/win)

5. **Intel oneAPI Base Toolkit for Windows**\
Download and install Intel oneAPI Base Toolkit for Windows from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?operatingsystem=windows&windows-install-type=offline)

**Note: It is important to download the older 2025.0.1 version of the oneAPI Base Toolkit, as llama-cpp-python is not yet compatible with the latest release.
In the installer, open the "Choose a Version" dropdown and select 2025.0.1.**

6. **Miniforge for Windows**\
Download and install Miniforge from [here](https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Windows-x86_64.exe)

### Linux:

1. **GPU Drivers installation**\
Download and install the GPU drivers from [here](https://dgpu-docs.intel.com/driver/client/overview.html)

2. **Miniforge for Linux**\
Download and install Miniforge using the commands below:
```
wget "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-$(uname)-$(uname -m).sh"
bash Miniforge3-$(uname)-$(uname -m).sh
```
    Replace </move/to/miniforge3/bin/folder> with your actual Miniforge bin folder path and change into it with cd. Then initialize the conda environment and restart the terminal.
```
cd </move/to/miniforge3/bin/folder>
```
```
./conda init
```

3. **Intel oneAPI Base Toolkit for Linux**\
Download and install Intel oneAPI Base Toolkit for Linux from [here](https://www.intel.com/content/www/us/en/developer/tools/oneapi/base-toolkit-download.html?packages=oneapi-toolkit&oneapi-toolkit-os=linux&oneapi-lin=offline)

4. **CMake and Git for Linux**\
Install CMake and Git using the commands below:
- For Debian/Ubuntu-based systems:
```
sudo apt update && sudo apt -y install cmake git
```
- For RHEL/CentOS-based systems:
```
sudo dnf update && sudo dnf -y install cmake git
```

## Setting up environment and LlamaCPP-python GPU backend

Open a new Miniforge terminal and perform the following steps:

1. **Create and activate the conda environment**
```
conda create -n llamacpp python=3.11 -y
conda activate llamacpp
```
2. **Initialize oneAPI environment**\
On Windows:
```
@call "C:\Program Files (x86)\Intel\oneAPI\setvars.bat" intel64 --force
```
On Linux:
```
source /opt/intel/oneapi/setvars.sh --force
```
3. **Set the environment variables and install Llamacpp-Python bindings**\
On Windows:
```
set CMAKE_GENERATOR=Ninja
set CMAKE_C_COMPILER=cl
set CMAKE_CXX_COMPILER=icx
set CXX=icx
set CC=cl
set CMAKE_ARGS="-DGGML_SYCL=ON -DGGML_SYCL_F16=ON -DCMAKE_CXX_COMPILER=icx -DCMAKE_C_COMPILER=cl"
pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
```
On Linux:
```
CMAKE_ARGS="-DGGML_SYCL=on -DCMAKE_C_COMPILER=icx -DCMAKE_CXX_COMPILER=icpx" pip install llama-cpp-python==0.3.8 -U --force --no-cache-dir --verbose
```
4. **Install the required pip packages**
```
pip install -r rag/requirements.txt
```
5. **Install an IPython kernel for the llamacpp environment**
```
python -m ipykernel install --user --name=llamacpp
```


6. **Launch JupyterLab using the command below**
```
jupyter lab
```
   - Open the [Notebook](./09_rag_langchain.ipynb), select the llamacpp kernel, and run the code cells one by one.

3 changes: 1 addition & 2 deletions README.md
@@ -22,8 +22,7 @@ To set up your AIPC for running with Intel iGPUs, follow these essential steps:
4. Install CMake

### Hardware
- Intel® Core™ Ultra Processor - Windows 11

- Intel® Core™ Ultra Processor - Windows 11.

### Software
- Python