support/azure/azure-kubernetes/availability-performance/identify-memory-saturation-aks.md
Lines changed: 54 additions & 5 deletions
Original file line number
Diff line number
Diff line change
@@ -1,7 +1,7 @@
1
1
---
2
2
title: Troubleshoot memory saturation in AKS clusters
3
3
description: Troubleshoot memory saturation in Azure Kubernetes Service (AKS) clusters across namespaces and containers. Learn how to identify the hosting node.
4
-
ms.date: 08/30/2024
4
+
ms.date: 06/27/2025
5
5
editor: v-jsitser
6
6
ms.reviewer: chiragpa, aritraghosh, v-leedennis
7
7
ms.service: azure-kubernetes-service
@@ -14,6 +14,7 @@ This article discusses methods for troubleshooting memory saturation issues. Mem
14
14
## Prerequisites
15
15
16
16
- The Kubernetes [kubectl](https://kubernetes.io/docs/reference/kubectl/overview/) command-line tool. To install kubectl by using [Azure CLI](/cli/azure/install-azure-cli), run the [az aks install-cli](/cli/azure/aks#az-aks-install-cli) command.
17
+
- The open-source [Inspektor Gadget](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#what-is-inspektor-gadget) project, for advanced process-level memory analysis. For more information, see [How to install Inspektor Gadget in an AKS cluster](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#how-to-install-inspektor-gadget-in-an-aks-cluster).
17
18
18
19
## Symptoms
19
20
@@ -88,7 +89,7 @@ This procedure uses the kubectl commands in a console. It displays only the curr
1. Get the list of pods that are running on the node and their memory usage by running the [kubectl get pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get) and [kubectl top pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-pod-em-) commands:
92
+
2. Get the list of pods that are running on the node and their memory usage by running the [kubectl get pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#get) and [kubectl top pods](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#-em-pod-em-) commands:
92
93
93
94
```bash
94
95
kubectl get pods --all-namespaces --output wide \
@@ -125,7 +126,7 @@ This procedure uses the kubectl commands in a console. It displays only the curr
125
126
ama-logs-w5bmd 12m 403Mi
126
127
```
127
128
128
-
1. Review the requests and limits for each pod on the node by running the [kubectl describe node](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe) command:
129
+
3. Review the requests and limits for each pod on the node by running the [kubectl describe node](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands#describe) command:
129
130
130
131
```bash
131
132
kubectl describe node <node-name>
@@ -158,9 +159,57 @@ This procedure uses the kubectl commands in a console. It displays only the curr
158
159
159
160
---
160
161
161
-
Now that you've identified the pods that are using high memory, you can identify the applications that are running on the pod.
162
+
Now that you've identified the pods that are using high memory, you can identify the applications that are running on the pod or identify processes that may be consuming excess memory.
162
163
163
-
### Step 2: Review best practices to avoid memory saturation
164
+
### Step 2: Identify process-level memory usage
165
+
166
+
For advanced process-level memory analysis, use [Inspektor Gadget](https://go.microsoft.com/fwlink/?linkid=2260072) to monitor real-time memory usage at the process level within pods:
167
+
168
+
1. Install Inspektor Gadget by following the instructions in the [documentation](/troubleshoot/azure/azure-kubernetes/logs/capture-system-insights-from-aks#how-to-install-inspektor-gadget-in-an-aks-cluster).
169
+
170
+
2. Run the [top_process gadget](https://aka.ms/igtopprocess) to identify processes that are using large amounts of memory. You can use `--fields` to select specific columns and `--filter` to filter events by field value (for example, by the names of the pods that you previously identified as having high memory consumption). You can also:
171
+
172
+
- Identify the top 10 memory-consuming processes across the cluster:
173
+
174
+
```bash
175
+
kubectl gadget run top_process --sort -memoryRelative --max-entries 10
176
+
```
177
+
178
+
- Identify the top memory-consuming processes on a specific node:
179
+
180
+
```bash
181
+
kubectl gadget run top_process --sort -memoryRelative --filter k8s.node==<node-name>
182
+
```
183
+
184
+
- Identify the top memory-consuming processes in a specific namespace:
185
+
186
+
```bash
187
+
kubectl gadget run top_process --sort -memoryRelative --filter k8s.namespace==<namespace>
188
+
```
189
+
190
+
- Identify the top memory-consuming processes in a specific pod:
191
+
192
+
```bash
193
+
kubectl gadget run top_process --sort -memoryRelative --filter k8s.podName==<pod-name>
194
+
```
195
+
196
+
The output of the Inspektor Gadget `top_process` command resembles the following:
197
+
198
+
```output
199
+
200
+
K8S.NODE K8S.NAMESPACE K8S.PODNAME PID COMM MEMORYVIRTUAL MEMORYRSS MEMORYRELATIVE
You can use this output to identify the processes that are consuming the most memory on the node. The output can include the node name, namespace, pod name, container name, process ID (PID), command name (COMM), and CPU and memory usage. For more information, see [the documentation](https://aka.ms/igtopprocess).
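
As a minimal sketch of combining the `--filter` and `--fields` options that step 2 mentions, the following helper prints a `top_process` command for a single pod without running it. The function name `top_process_cmd` is hypothetical and isn't part of Inspektor Gadget. The field names that are passed to `--fields` are assumed from the column headers in the sample output, so verify them against the gadget documentation before you rely on them.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Inspektor Gadget): prints a top_process
# command that's limited to one pod and to a few columns.
# Assumption: the --fields names mirror the output column headers.
top_process_cmd() {
  local pod_name="$1"
  local fields="${2:-k8s.podName,pid,comm,memoryRSS,memoryRelative}"
  printf 'kubectl gadget run top_process --sort -memoryRelative --filter k8s.podName==%s --fields %s\n' \
    "$pod_name" "$fields"
}

# Print the command so that you can review it before you run it.
top_process_cmd "<pod-name>"
```

Printing the command instead of running it lets you check the filter and the column list first. To show different columns, pass a comma-separated list as the second argument.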
210
+
211
+
212
+
### Step 3: Review best practices to avoid memory saturation
164
213
165
214
Review the following table to learn how to implement best practices for avoiding memory saturation.
support/azure/azure-kubernetes/error-codes/akscapacityheavyusage-error.md
Lines changed: 0 additions & 2 deletions
@@ -28,8 +28,6 @@ You're trying to create a cluster in a region that has limited capacity.
28
28
29
29
When you create an AKS cluster, Microsoft Azure allocates compute resources to your subscription. You might occasionally experience the `AksCapacityHeavyUsage` error because of significant growth in demand for Azure Kubernetes Service in specific regions.
30
30
31
-
The `KubernetesAPICallFailed` error message indicates that the AKS cluster didn't start and doesn't have an associated control plane. Therefore, calls to the API server are failing. In this case, you have to retry the Start operation.