Update README.md #215

Merged · 2 commits · May 14, 2025
35 changes: 19 additions & 16 deletions LLM/01_native_gpu.ipynb
@@ -5,31 +5,34 @@
"id": "652ea6c8-8d13-4228-853e-fad46db470f5",
"metadata": {},
"source": [
"# Inference using Llamacpp on Intel GPUs"
]
},
{
"cell_type": "markdown",
"id": "71e0aeac-58b1-4114-95f1-7d3a7a4c34f2",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to run an LLM inference on Windows with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU."
"# Running LlamaCPP Inference on AI PCs with Intel GPUs"
]
},
{
"cell_type": "markdown",
"id": "97cf7db8-9529-47dd-b41d-81b22c8d5848",
"metadata": {},
"source": [
"## What is an AIPC\n",
"## Introduction \n",
"\n",
"This notebook demonstrates how to run LLM inference locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. \n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"What is an AI PC you ask?\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"Here is an [explanation](https://www.intel.com/content/www/us/en/newsroom/news/what-is-an-ai-pc.htm#gs.a55so1) from Intel:\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"”An AI PC has a CPU, a GPU and an NPU, each with specific AI acceleration capabilities. An NPU, or neural processing unit, is a specialized accelerator that handles artificial intelligence (AI) and machine learning (ML) tasks right on your PC instead of sending data to be processed in the cloud. The GPU and CPU can also process these workloads, but the NPU is especially good at low-power AI calculations. The AI PC represents a fundamental shift in how our computers operate. It is not a solution for a problem that didn’t exist before. Instead, it promises to be a huge improvement for everyday PC usages.”"
"In this notebook, we’ll explore how to use the AI PC’s capabilities to perform LLM inference, showcasing the power of local AI acceleration for modern applications. "
]
},
{
@@ -178,7 +181,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
"version": "3.10.13"
}
},
"nbformat": 4,
35 changes: 19 additions & 16 deletions LLM/02_ollama_gpu.ipynb
@@ -5,31 +5,34 @@
"id": "652ea6c8-8d13-4228-853e-fad46db470f5",
"metadata": {},
"source": [
"# Running LLAMA3 on Intel AI PCs"
]
},
{
"cell_type": "markdown",
"id": "71e0aeac-58b1-4114-95f1-7d3a7a4c34f2",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to install Ollama on Windows with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU."
"# Running Ollama Inference on Intel AI PCs"
]
},
{
"cell_type": "markdown",
"id": "97cf7db8-9529-47dd-b41d-81b22c8d5848",
"metadata": {},
"source": [
"## What is an AIPC\n",
"## Introduction \n",
"\n",
"This notebook demonstrates how to run LLM inference locally on an AI PC using Ollama. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads.\n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"What is an AI PC you ask?\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"Here is an [explanation](https://www.intel.com/content/www/us/en/newsroom/news/what-is-an-ai-pc.htm#gs.a55so1):\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"”An AI PC has a CPU, a GPU and an NPU, each with specific AI acceleration capabilities. An NPU, or neural processing unit, is a specialized accelerator that handles artificial intelligence (AI) and machine learning (ML) tasks right on your PC instead of sending data to be processed in the cloud. The GPU and CPU can also process these workloads, but the NPU is especially good at low-power AI calculations. The AI PC represents a fundamental shift in how our computers operate. It is not a solution for a problem that didn’t exist before. Instead, it promises to be a huge improvement for everyday PC usages.”"
"In this notebook, we’ll explore how to use the AI PC’s capabilities to perform LLM inference, showcasing the power of local AI acceleration for modern applications. "
]
},
{
@@ -273,7 +276,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.10.13"
}
},
"nbformat": 4,
35 changes: 19 additions & 16 deletions LLM/03_llm_pytorch_gpu.ipynb
@@ -5,31 +5,34 @@
"id": "4bdf80ae-10bd-438b-a5ae-76a5c5f99a6d",
"metadata": {},
"source": [
"# Inference using Pytorch on Intel GPUs"
]
},
{
"cell_type": "markdown",
"id": "71e0aeac-58b1-4114-95f1-7d3a7a4c34f2",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to run LLM inference using pytorch on Windows with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU."
"# PyTorch Inference on AI PCs with Intel GPUs"
]
},
{
"cell_type": "markdown",
"id": "97cf7db8-9529-47dd-b41d-81b22c8d5848",
"metadata": {},
"source": [
"## What is an AIPC\n",
"## Introduction \n",
"\n",
"This notebook demonstrates how to run LLM inference locally on an AI PC using PyTorch. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. \n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"What is an AI PC you ask?\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"Here is an [explanation](https://www.intel.com/content/www/us/en/newsroom/news/what-is-an-ai-pc.htm#gs.a55so1):\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"”An AI PC has a CPU, a GPU and an NPU, each with specific AI acceleration capabilities. An NPU, or neural processing unit, is a specialized accelerator that handles artificial intelligence (AI) and machine learning (ML) tasks right on your PC instead of sending data to be processed in the cloud. The GPU and CPU can also process these workloads, but the NPU is especially good at low-power AI calculations. The AI PC represents a fundamental shift in how our computers operate. It is not a solution for a problem that didn’t exist before. Instead, it promises to be a huge improvement for everyday PC usages.”"
"In this notebook, we’ll explore how to use the AI PC’s capabilities to perform LLM inference, showcasing the power of local AI acceleration for modern applications. "
]
},
{
@@ -500,7 +503,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.10.13"
}
},
"nbformat": 4,
25 changes: 23 additions & 2 deletions LLM/04_llm-rag.ipynb
@@ -6,7 +6,28 @@
"id": "02a561f4",
"metadata": {},
"source": [
"# Create a RAG system on AIPC\n",
"# Create a RAG system on an AI PC using Ollama\n",
"\n",
"## Introduction \n",
"\n",
"This notebook demonstrates how to run LLM inference for a Retrieval-Augmented Generation (RAG) application using Ollama locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. \n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications like LLM-based RAG workflows to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"In this notebook, we’ll explore how to use the AI PC’s capabilities to perform LLM inference and integrate it into a RAG pipeline, showcasing the power of local AI acceleration for modern applications. \n",
"\n",
"**Retrieval-augmented generation (RAG)** is a technique for augmenting LLM knowledge with additional, often private or real-time, data. LLMs can reason about wide-ranging topics, but their knowledge is limited to the public data up to a specific point in time that they were trained on. If you want to build AI applications that can reason about private data or data introduced after a model’s cutoff date, you need to augment the knowledge of the model with the specific information it needs. The process of bringing the appropriate information and inserting it into the model prompt is known as Retrieval Augmented Generation (RAG)."
]
@@ -512,7 +533,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
"version": "3.10.13"
},
"openvino_notebooks": {
"imageUrl": "https://github.com/openvinotoolkit/openvino_notebooks/assets/29454499/304aa048-f10c-41c6-bb31-6d2bfdf49cf5",
35 changes: 19 additions & 16 deletions LLM/05_llm_quantization_sycl.ipynb
@@ -5,31 +5,34 @@
"id": "652ea6c8-8d13-4228-853e-fad46db470f5",
"metadata": {},
"source": [
"# Quantization using SYCL backend on AI PC"
]
},
{
"cell_type": "markdown",
"id": "71e0aeac-58b1-4114-95f1-7d3a7a4c34f2",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to quantize a model on Windows AI PC with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU."
"# Quantization for Efficient Local Inference on AI PCs"
]
},
{
"cell_type": "markdown",
"id": "97cf7db8-9529-47dd-b41d-81b22c8d5848",
"metadata": {},
"source": [
"## What is an AIPC\n",
"## Introduction \n",
"\n",
"This notebook demonstrates how to quantize a model locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. \n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"What is an AI PC you ask?\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"Here is an [explanation](https://www.intel.com/content/www/us/en/newsroom/news/what-is-an-ai-pc.htm#gs.a55so1) from Intel:\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications and workflows to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"”An AI PC has a CPU, a GPU and an NPU, each with specific AI acceleration capabilities. An NPU, or neural processing unit, is a specialized accelerator that handles artificial intelligence (AI) and machine learning (ML) tasks right on your PC instead of sending data to be processed in the cloud. The GPU and CPU can also process these workloads, but the NPU is especially good at low-power AI calculations. The AI PC represents a fundamental shift in how our computers operate. It is not a solution for a problem that didn’t exist before. Instead, it promises to be a huge improvement for everyday PC usages.”"
"In this notebook, we’ll explore how to use the AI PC’s capabilities to quantize an LLM for efficient local inference, showcasing the power of local AI acceleration for modern applications. "
]
},
{
@@ -591,7 +594,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
"version": "3.10.13"
}
},
"nbformat": 4,
34 changes: 18 additions & 16 deletions LLM/06_llm_sycl_gpu.ipynb
@@ -5,31 +5,33 @@
"id": "652ea6c8-8d13-4228-853e-fad46db470f5",
"metadata": {},
"source": [
"# Inference using SYCL backend on AI PC"
]
},
{
"cell_type": "markdown",
"id": "71e0aeac-58b1-4114-95f1-7d3a7a4c34f2",
"metadata": {},
"source": [
"## Introduction\n",
"\n",
"This notebook demonstrates how to install LLamacpp for SYCL on Windows with Intel GPUs. It applies to Intel Core Ultra and Core 11 - 14 gen integrated GPUs (iGPUs), as well as Intel Arc Series GPU."
"# Inference using Native LlamaCPP on AI PCs with Intel GPUs"
]
},
{
"cell_type": "markdown",
"id": "97cf7db8-9529-47dd-b41d-81b22c8d5848",
"metadata": {},
"source": [
"## What is an AIPC\n",
"## Introduction \n",
"This notebook demonstrates how to install LlamaCPP native binaries and run LLM inference locally on an AI PC. It is optimized for Intel® Core™ Ultra processors, utilizing the combined capabilities of the CPU, GPU, and NPU for efficient AI workloads. \n",
"\n",
"### What is an AI PC? \n",
"\n",
"An AI PC is a next-generation computing platform equipped with a CPU, GPU, and NPU, each designed with specific AI acceleration capabilities. \n",
"\n",
"- **Fast Response (CPU)** \n",
" The central processing unit (CPU) is optimized for smaller, low-latency workloads, making it ideal for quick responses and general-purpose tasks. \n",
"\n",
"- **High Throughput (GPU)** \n",
" The graphics processing unit (GPU) excels at handling large-scale workloads that require high parallelism and throughput, making it suitable for tasks like deep learning and data processing. \n",
"\n",
"What is an AI PC you ask?\n",
"- **Power Efficiency (NPU)** \n",
" The neural processing unit (NPU) is designed for sustained, heavily-used AI workloads, delivering high efficiency and low power consumption for tasks like inference and machine learning. \n",
"\n",
"Here is an [explanation](https://www.intel.com/content/www/us/en/newsroom/news/what-is-an-ai-pc.htm#gs.a55so1) from Intel:\n",
"The AI PC represents a transformative shift in computing, enabling advanced AI applications and AI workflows to run seamlessly on local hardware. This innovation enhances everyday PC usage by delivering faster, more efficient AI experiences without relying on cloud resources. \n",
"\n",
"”An AI PC has a CPU, a GPU and an NPU, each with specific AI acceleration capabilities. An NPU, or neural processing unit, is a specialized accelerator that handles artificial intelligence (AI) and machine learning (ML) tasks right on your PC instead of sending data to be processed in the cloud. The GPU and CPU can also process these workloads, but the NPU is especially good at low-power AI calculations. The AI PC represents a fundamental shift in how our computers operate. It is not a solution for a problem that didn’t exist before. Instead, it promises to be a huge improvement for everyday PC usages.”"
"In this notebook, we’ll explore how to use the AI PC’s capabilities to perform LLM inference, showcasing the power of local AI acceleration for modern applications. "
]
},
{
@@ -240,7 +242,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.9"
"version": "3.10.13"
}
},
"nbformat": 4,