## Overview of this deployment pattern on Amazon EKS
This pattern combines the capabilities of NVIDIA NIM, Amazon Elastic Kubernetes Service (EKS), and various AWS services to deliver a high-performance and cost-optimized model serving infrastructure.
Before getting started with NVIDIA NIM, ensure you have the following:
<details>
<summary>Click to expand the NVIDIA NIM account setup details</summary>

**NVIDIA AI Enterprise Account**
- Register for an NVIDIA AI Enterprise account. If you don't have one, you can sign up for a trial account using this [link](https://enterpriseproductregistration.nvidia.com/?LicType=EVAL&ProductFamily=NVAIEnterprise).

Once the pod is ready with running status `1/1`, you can exec into the pod:
```bash
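# Look up the pod labeled app=tritonserver and open an interactive shell in it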
export POD_NAME=$(kubectl get po -l app=tritonserver -ojsonpath='{.items[0].metadata.name}')
kubectl exec -it $POD_NAME -- bash
```
Run the test against the deployed NIM Llama3 model:
```bash
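# Benchmark the meta/llama3-8b-instruct model served by the nim-llm service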
genai-perf \
-m meta/llama3-8b-instruct \
--profile-export-file my_profile_export.json \
--url nim-llm.nim:8000
```
Once the run completes, you should see output summarizing the metrics collected by genai-perf, such as request latency and output token throughput, for the deployed model.
To understand the command line options, please refer to [this documentation](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/client/src/c%2B%2B/perf_analyzer/genai-perf/README.html#command-line-options).
## Observability
As part of this blueprint, we have also deployed the Kube Prometheus stack, which provides Prometheus server and Grafana deployments for monitoring and observability.
First, let's verify the services deployed by the Kube Prometheus stack:
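A minimal sketch of this check, plus a port-forward so Grafana is reachable at [http://localhost:3000](http://localhost:3000), is shown below. The `monitoring` namespace and the `kube-prometheus-stack-grafana` service name are assumptions based on common Kube Prometheus stack defaults, so adjust them to match your deployment.

```bash
# List the Prometheus and Grafana services created by the Kube Prometheus stack
# (the namespace is an assumption and may differ in your deployment)
kubectl get svc -n monitoring

# Forward the Grafana service to localhost:3000 so it can be opened in a browser
# (the service name is an assumption based on default Helm chart naming)
kubectl port-forward -n monitoring svc/kube-prometheus-stack-grafana 3000:80
```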
- Open your web browser and navigate to [http://localhost:3000](http://localhost:3000).
- Log in with the username `admin` and the password retrieved from AWS Secrets Manager, as shown below.
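The Grafana admin password can be read with the AWS CLI; this is a sketch, and the secret name below is a placeholder to replace with the secret created by your deployment.

```bash
# Retrieve the Grafana admin password from AWS Secrets Manager
# (the secret name is a placeholder; use the one created by your deployment)
aws secretsmanager get-secret-value \
  --secret-id <grafana-admin-password-secret-name> \
  --query SecretString \
  --output text
```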
**4. Open the NIM Monitoring Dashboard:**
- Once logged in, click "Dashboards" in the left sidebar and search for "nim".
- Find the `NVIDIA NIM Monitoring` dashboard in the list.
- Click it to open the dashboard.
You should now see the metrics displayed on the Grafana dashboard, allowing you to monitor the performance of your NVIDIA NIM service deployment.
</details>

:::info
As of this writing, NVIDIA also provides an example Grafana dashboard; you can find it [here](https://docs.nvidia.com/nim/large-language-models/latest/observability.html#grafana).
:::