Commit 829c8fc

Author: Lu Chen
Merge pull request #9216 from MicrosoftDocs/main to live
2 parents: 992a9b7 + 2463f85

3 files changed: +55 -8 lines changed


Office/Client/clienttoc/toc.yml

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 items:
-- name: Office Products Troubleshooting
+- name: Microsoft 365 Apps Troubleshooting
   href: ../office-client-welcome.yml
   items:
   - name: Access

support/azure/azure-kubernetes/availability-performance/identify-memory-saturation-aks.md

Lines changed: 54 additions & 5 deletions
@@ -1,7 +1,7 @@
 ---
 title: Troubleshoot memory saturation in AKS clusters
 description: Troubleshoot memory saturation in Azure Kubernetes Service (AKS) clusters across namespaces and containers. Learn how to identify the hosting node.
-ms.date: 08/30/2024
+ms.date: 06/27/2025
 editor: v-jsitser
 ms.reviewer: chiragpa, aritraghosh, v-leedennis
 ms.service: azure-kubernetes-service
@@ -14,6 +14,7 @@ This article discusses methods for troubleshooting memory saturation issues. Mem
 ## Prerequisites

 - The Kubernetes [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command-line tool. To install kubectl by using [Azure CLI](/cli/azure/install-azure-cli), run the [az aks install-cli](/cli/azure/aks#az-aks-install-cli) command.
+- The open-source project [Inspektor Gadget](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#what-is-inspektor-gadget) for advanced process-level memory analysis. For more information, see [How to install Inspektor Gadget in an AKS cluster](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#how-to-install-inspektor-gadget-in-an-aks-cluster).

 ## Symptoms
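For context, both prerequisites are short installs. The linked articles have the authoritative steps; the following is only a minimal sketch, assuming that the Azure CLI and the krew plugin manager are already set up.

```bash
# Install kubectl through the Azure CLI, as the first prerequisite describes.
az aks install-cli

# Assumed flow from the linked Inspektor Gadget instructions: install the kubectl
# "gadget" plugin with krew, deploy Inspektor Gadget to the cluster, and verify
# that its pods are running.
kubectl krew install gadget
kubectl gadget deploy
kubectl get pods --namespace gadget
```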

@@ -88,7 +89,7 @@ This procedure uses the kubectl commands in a console. It displays only the curr
 aks-testmemory-30616462-vmss000002 74m 3% 1715Mi 31%
 ```

-1. Get the list of pods that are running on the node and their memory usage by running the [kubectl get pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get) and [kubectl top pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-pod-em-) commands:
+2. Get the list of pods that are running on the node and their memory usage by running the [kubectl get pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get) and [kubectl top pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-pod-em-) commands:

 ```bash
 kubectl get pods --all-namespaces --output wide \
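The rest of this command isn't shown in the hunk above. As one hedged way to get the same two pieces of information (the pods scheduled on the node, plus per-pod memory usage), assuming the `<node-name>` value from the previous step:

```bash
# List pods from every namespace that are scheduled on a specific node.
kubectl get pods --all-namespaces --output wide \
    --field-selector spec.nodeName=<node-name>

# Show per-pod memory usage across all namespaces, heaviest consumers first.
kubectl top pods --all-namespaces --sort-by=memory
```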
@@ -125,7 +126,7 @@ This procedure uses the kubectl commands in a console. It displays only the curr
 ama-logs-w5bmd 12m 403Mi
 ```

-1. Review the requests and limits for each pod on the node by running the [kubectl describe node](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe) command:
+3. Review the requests and limits for each pod on the node by running the [kubectl describe node](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe) command:

 ```bash
 kubectl describe node <node-name>
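`kubectl describe node` prints a lot of detail. A hedged shortcut for this step is to narrow the output to the sections where the per-pod memory requests and limits and the node totals appear:

```bash
# Show only the node-description sections that list per-pod requests/limits and the
# node's overall allocated resources. The section names come from standard
# kubectl describe output; adjust the -A line counts as needed.
kubectl describe node <node-name> | grep -A 20 "Non-terminated Pods"
kubectl describe node <node-name> | grep -A 10 "Allocated resources"
```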
@@ -158,9 +159,57 @@ This procedure uses the kubectl commands in a console. It displays only the curr
 ---

-Now that you've identified the pods that are using high memory, you can identify the applications that are running on the pod.
+Now that you've identified the pods that are using high memory, you can identify the applications that are running on the pod, or identify the processes that might be consuming excess memory.

-### Step 2: Review best practices to avoid memory saturation
+### Step 2: Identify process-level memory usage
+
+For advanced process-level memory analysis, use [Inspektor Gadget](https://go.microsoft.com/fwlink/?linkid=2260072) to monitor real-time memory usage at the process level within pods:
+
+1. Install Inspektor Gadget by using the instructions in the [documentation](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#how-to-install-inspektor-gadget-in-an-aks-cluster).
+
+2. Run the [top_process gadget](https://aka.ms/igtopprocess) to identify processes that are using large amounts of memory. You can use `--fields` to select specific columns and `--filter` to filter events based on specific field values, such as the names of the pods that you previously identified as having high memory consumption. For example, you can:
+
+   - Identify the top 10 memory-consuming processes across the cluster:
+
+     ```bash
+     kubectl gadget run top_process --sort -memoryRelative --max-entries 10
+     ```
+
+   - Identify the top memory-consuming processes on a specific node:
+
+     ```bash
+     kubectl gadget run top_process --sort -memoryRelative --filter k8s.node==<node-name>
+     ```
+
+   - Identify the top memory-consuming processes in a specific namespace:
+
+     ```bash
+     kubectl gadget run top_process --sort -memoryRelative --filter k8s.namespace==<namespace>
+     ```
+
+   - Identify the top memory-consuming processes in a specific pod:
+
+     ```bash
+     kubectl gadget run top_process --sort -memoryRelative --filter k8s.podName==<pod-name>
+     ```
+
+   The output of the Inspektor Gadget `top_process` command resembles the following text:
+
+   ```output
+   K8S.NODE K8S.NAMESPACE K8S.PODNAME PID COMM MEMORYVIRTUAL MEMORYRSS MEMORYRELATIVE
+   aks-agentpool-3…901-vmss000001 default memory-stress 21676 stress 944 MB 943 MB 5.6
+   aks-agentpool-3…901-vmss000001 default memory-stress 21678 stress 944 MB 943 MB 5.6
+   aks-agentpool-3…901-vmss000001 default memory-stress 21677 stress 944 MB 872 MB 5.2
+   aks-agentpool-3…901-vmss000001 default memory-stress 21679 stress 944 MB 796 MB 4.8
+   ```
+
+   You can use this output to identify the processes that are consuming the most memory on the node. The output can include the node name, namespace, pod name, container name, process ID (PID), command name (COMM), and CPU and memory usage. For more information, see [the documentation](https://aka.ms/igtopprocess).
+
+### Step 3: Review best practices to avoid memory saturation

 Review the following table to learn how to implement best practices for avoiding memory saturation.
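The new Step 2 mentions using `--fields` and `--filter` together. A minimal, hedged sketch of combining them follows; the field names are assumed from the sample output columns and might differ between Inspektor Gadget releases.

```bash
# Show only a few columns for the processes of one pod, sorted by relative memory
# use. The pod name placeholder and the field spellings are illustrative assumptions.
kubectl gadget run top_process \
    --filter k8s.podName==<pod-name> \
    --fields k8s.podName,comm,pid,memoryRelative \
    --sort -memoryRelative
```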

support/azure/azure-kubernetes/error-codes/akscapacityheavyusage-error.md

Lines changed: 0 additions & 2 deletions
@@ -28,8 +28,6 @@ You're trying to create a cluster in a region that has limited capacity.

 When you create an AKS cluster, Microsoft Azure allocates compute resources to your subscription. You might occasionally experience the `AksCapacityHeavyUsage` error because of significant growth in demand for Azure Kubernetes Service in specific regions.

-The `KubernetesAPICallFailed` error message indicates that the AKS cluster didn't start and doesn't have an associated control plane. Therefore, calls to the API server are failing. In this case, you have to retry the Start operation.
-
 ## Resolution

 ### Solution 1: Select a different region
