Commit 92bc457

mc-nv and yinggeh authored
feat: Add vLLM counter metrics access through Triton (#7493) (#7546)
Co-authored-by: Yingge He <157551214+yinggeh@users.noreply.github.com>
1 parent 8587c47 commit 92bc457

2 files changed: +10 -0 lines changed

build.py

Lines changed: 4 additions & 0 deletions
@@ -1806,6 +1806,10 @@ def backend_clone(
         os.path.join(build_dir, be, "src", "model.py"),
         backend_dir,
     )
+    clone_script.cpdir(
+        os.path.join(build_dir, be, "src", "utils"),
+        backend_dir,
+    )

     clone_script.comment()
     clone_script.comment(f"end '{be}' backend")
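
For context: the added clone_script.cpdir(...) call makes the generated backend-clone script copy the backend's src/utils directory into the backend install directory alongside model.py, so model.py can import its helper modules (presumably the metrics utilities this commit relies on) at runtime. The sketch below only illustrates that idea; the CloneScriptSketch class, its internals, and the example paths are assumptions for illustration, and only cpdir and comment are names taken from the diff above.

import os


class CloneScriptSketch:
    """Hypothetical stand-in for the clone-script writer used in build.py."""

    def __init__(self):
        self.lines = []

    def comment(self, text=""):
        # Emit a comment line into the generated shell script.
        self.lines.append(f"# {text}".rstrip())

    def cpdir(self, src, dest):
        # Emit a recursive copy so a whole directory tree (e.g. src/utils)
        # ends up next to model.py in the backend install directory.
        self.lines.append(f"cp -r {src} {dest}")


# Illustrative values only; the real build.py derives these from its arguments.
build_dir = "/tmp/tritonbuild"
be = "vllm"
backend_dir = "/opt/tritonserver/backends/vllm"

script = CloneScriptSketch()
# The new step in this commit: copy the whole src/utils directory, not just model.py.
script.cpdir(os.path.join(build_dir, be, "src", "utils"), backend_dir)
script.comment()
script.comment(f"end '{be}' backend")
print("\n".join(script.lines))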

docs/user_guide/metrics.md

Lines changed: 6 additions & 0 deletions
@@ -378,3 +378,9 @@ Further documentation can be found in the `TRITONSERVER_MetricFamily*` and
 The TRT-LLM backend uses the custom metrics API to track and expose specific metrics about
 LLMs, KV Cache, and Inflight Batching to Triton:
 https://github.com/triton-inference-server/tensorrtllm_backend?tab=readme-ov-file#triton-metrics
+
+### vLLM Backend Metrics
+
+The vLLM backend uses the custom metrics API to track and expose specific metrics about
+LLMs to Triton:
+https://github.com/triton-inference-server/vllm_backend?tab=readme-ov-file#triton-metrics
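
To make the linked documentation concrete, here is a minimal sketch of how a Python-backend model.py can register and update a counter through Triton's Python-backend custom metrics API (pb_utils.MetricFamily), which is the mechanism the vLLM backend README describes. The metric name, labels, and increment value below are illustrative assumptions rather than an excerpt from the vLLM backend, and the code only runs inside a Triton Python backend model where triton_python_backend_utils is available.

import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Register a counter metric family; Triton merges it into the
        # server-wide Prometheus /metrics endpoint.
        self.prompt_tokens_family = pb_utils.MetricFamily(
            name="vllm:prompt_tokens_total",  # illustrative metric name
            description="Number of prefill tokens processed.",
            kind=pb_utils.MetricFamily.COUNTER,
        )
        # Label the metric so several model instances can share one family.
        self.prompt_tokens = self.prompt_tokens_family.Metric(
            labels={"model": args["model_name"], "version": args["model_version"]}
        )

    def execute(self, requests):
        responses = []
        for request in requests:
            # ... run inference here ...
            # Counters only ever increase; 128 is an illustrative token count.
            self.prompt_tokens.increment(128)
            responses.append(pb_utils.InferenceResponse(output_tensors=[]))
        return responses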
