Commit 1efbd53

Authored by: nv-kmcgill53, statiraju, nvda-mesharma
docs: Update Triton Docs to New Theme (#7891)
Co-authored-by: Suman Tatiraju <167138127+statiraju@users.noreply.github.com>
Co-authored-by: Meenakshi Sharma <163925564+nvda-mesharma@users.noreply.github.com>
Co-authored-by: Suman Tatiraju <statiraju@nvidia.com>
1 parent 8b6aefa commit 1efbd53


44 files changed: +6121 −976 lines

docs/Dockerfile.docs

Lines changed: 8 additions & 1 deletion
@@ -1,4 +1,4 @@
-# Copyright 2022, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
+# Copyright 2022-2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved.
 #
 # Redistribution and use in source and binary forms, with or without
 # modification, are permitted provided that the following conditions
@@ -59,6 +59,7 @@ RUN pip3 install \
     breathe \
     docutils \
     exhale \
+    httplib2 \
     ipython \
     myst-nb \
     nbclient \
@@ -73,6 +74,12 @@ RUN pip3 install \
     sphinx-tabs \
     sphinxcontrib-bibtex

+
+# install nvidia-sphinx-theme
+RUN pip3 install \
+    --index-url https://urm.nvidia.com/artifactory/api/pypi/ct-omniverse-pypi/simple/ \
+    nvidia-sphinx-theme
+
 # Set visitor script to be included on every HTML page
 ENV VISITS_COUNTING_SCRIPT="//assets.adobedtm.com/b92787824f2e0e9b68dc2e993f9bd995339fe417/satelliteLib-7ba51e58dc61bcb0e9311aadd02a0108ab24cc6c.js"

docs/README.md

Lines changed: 3 additions & 3 deletions
@@ -124,9 +124,9 @@ Triton supports batching individual inference requests to improve compute resour
 - [Queuing Policies](user_guide/model_configuration.md#queue-policy)
 - [Ragged Batching](user_guide/ragged_batching.md)
 - [Sequence Batcher](user_guide/model_configuration.md#sequence-batcher)
-- [Stateful Models](user_guide/architecture.md#stateful-models)
-- [Control Inputs](user_guide/architecture.md#control-inputs)
-- [Implicit State - Stateful Inference Using a Stateless Model](user_guide/architecture.md#implicit-state-management)
+- [Stateful Models](user_guide/model_execution.md#stateful-models)
+- [Control Inputs](user_guide/model_execution.md#control-inputs)
+- [Implicit State - Stateful Inference Using a Stateless Model](user_guide/implicit_state_management.md#implicit-state-management)
 - [Sequence Scheduling Strategies](user_guide/architecture.md#scheduling-strategies)
 - [Direct](user_guide/architecture.md#direct)
 - [Oldest](user_guide/architecture.md#oldest)

docs/backend_guide/vllm.rst

Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
+########
+vLLM
+########
+
+.. toctree::
+   :hidden:
+   :caption: vLLM
+   :maxdepth: 2
+
+   ../vllm_backend/README
+   Multi-LoRA <../vllm_backend/docs/llama_multi_lora_tutorial>

docs/client_guide/api_reference.rst

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
+####
+API Reference
+####
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   OpenAI API <openai_readme.md>
+   kserve

docs/client_guide/in_process.rst

Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
+####
+In-Process Triton Server API
+####
+
+
+The Triton Inference Server provides a backwards-compatible C API/python-bindings/java-bindings that
+allows Triton to be linked directly into a C/C++/Java/Python application. This API
+is called the "Triton Server API" or just "Server API" for short. The
+API is implemented in the Triton shared library which is built from
+source contained in the `core
+repository <https://github.com/triton-inference-server/core>`__. On Linux
+this library is libtritonserver.so and on Windows it is
+tritonserver.dll. In the Triton Docker image the shared library is
+found in /opt/tritonserver/lib. The header file that defines and
+documents the Server API is
+`tritonserver.h <https://github.com/triton-inference-server/core/blob/main/include/triton/core/tritonserver.h>`__.
+`Java bindings for In-Process Triton Server API <../customization_guide/inprocess_java_api.html#java-bindings-for-in-process-triton-server-api>`__
+are built on top of `tritonserver.h` and can be used for Java applications that
+need to use Tritonserver in-process.
+
+All capabilities of Triton server are encapsulated in the shared
+library and are exposed via the Server API. The `tritonserver`
+executable implements HTTP/REST and GRPC endpoints and uses the Server
+API to communicate with core Triton logic. The primary source files
+for the endpoints are `grpc_server.cc <https://github.com/triton-inference-server/server/blob/main/src/grpc/grpc_server.cc>`__ and
+`http_server.cc <https://github.com/triton-inference-server/server/blob/main/src/http_server.cc>`__. In these source files you can
+see the Server API being used.
+
+You can use the Server API in your own application as well. A simple
+example using the Server API can be found in
+`simple.cc <https://github.com/triton-inference-server/server/blob/main/src/simple.cc>`__.
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   C/C++ <../customization_guide/inprocess_c_api.md>
+   python
+   Java <../customization_guide/inprocess_java_api.md>
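As an illustration of what the in-process usage described in this page looks like, here is a sketch using the Python bindings. This is an assumption-laden example, not part of the commit: it requires the `tritonserver` Python package shipped in the Triton containers, and the model repository path and model name ("simple") are hypothetical, so it will not run outside such an environment.

```python
# Hypothetical sketch of in-process Triton usage via the Python bindings.
# Requires the `tritonserver` package (Triton containers) and a model
# repository at /models containing a model named "simple" -- assumptions
# for illustration only.
import numpy
import tritonserver

# Start Triton inside this process; no HTTP/REST or GRPC endpoint is involved.
server = tritonserver.Server(model_repository="/models")
server.start(wait_until_ready=True)

# Run an inference directly through the Server API.
model = server.model("simple")
responses = model.infer(
    inputs={"INPUT0": numpy.array([[1, 2, 3, 4]], dtype=numpy.int32)}
)
for response in responses:
    print(response.outputs)

server.stop()
```

The same flow is what `simple.cc` demonstrates at the C level: create server options, instantiate the server from the shared library, and issue requests without leaving the process.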

docs/client_guide/kserve.rst

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+####
+KServe API
+####
+
+
+Triton uses the
+`KServe community standard inference protocols <https://github.com/kserve/kserve/tree/master/docs/predict-api/v2>`__
+to define HTTP/REST and GRPC APIs plus several extensions.
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   HTTP/REST and GRPC Protocol <../customization_guide/inference_protocols.md>
+   kserve_extension
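For orientation, a request under the KServe v2 predict protocol referenced above is a JSON body posted to `/v2/models/<model>/infer`. The sketch below builds such a body; the model name, tensor names, shapes, and values are illustrative assumptions, not taken from this commit:

```python
import json

# Illustrative KServe v2 inference request for a hypothetical model with
# one INT32 input tensor of shape [1, 4]. Field names follow the KServe
# "predict protocol v2" used by Triton's POST /v2/models/<model>/infer.
request_body = {
    "inputs": [
        {
            "name": "INPUT0",
            "shape": [1, 4],
            "datatype": "INT32",
            "data": [1, 2, 3, 4],
        }
    ],
    "outputs": [{"name": "OUTPUT0"}],
}

# Serialize to the JSON payload that would be sent over HTTP/REST.
payload = json.dumps(request_body)
print(payload)
```

The GRPC API carries the same fields as protobuf messages instead of JSON.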
Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+####
+Extensions
+####
+
+To fully enable all capabilities
+Triton also implements `HTTP/REST and GRPC
+extensions <https://github.com/triton-inference-server/server/tree/main/docs/protocol>`__
+to the KServe inference protocol.
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   Binary tensor data extension <../protocol/extension_binary_data.md>
+   Classification extension <../protocol/extension_classification.md>
+   Schedule policy extension <../protocol/extension_schedule_policy.md>
+   Sequence extension <../protocol/extension_sequence.md>
+   Shared-memory extension <../protocol/extension_shared_memory.md>
+   Model configuration extension <../protocol/extension_model_configuration.md>
+   Model repository extension <../protocol/extension_model_repository.md>
+   Statistics extension <../protocol/extension_statistics.md>
+   Trace extension <../protocol/extension_trace.md>
+   Logging extension <../protocol/extension_logging.md>
+   Parameters extension <../protocol/extension_parameters.md>

docs/client_guide/openai_readme.md

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+../../python/openai/README.md

docs/client_guide/python.rst

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
+####
+Python
+####
+
+.. include:: python_readme.rst
+
+.. toctree::
+   :maxdepth: 1
+   :hidden:
+
+   Kafka I/O <../tutorials/Triton_Inference_Server_Python_API/examples/kafka-io/README.md>
+   Rayserve <../tutorials/Triton_Inference_Server_Python_API/examples/rayserve/README.md>
