TIS IGIE Backend

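TIS IGIE Backend, short for Triton Inference Server IGIE Backend, is an inference backend that plugs into the TIS service and runs in the Iluvatar (天数) IGIE environment. Throughout development, its concepts, interfaces, and implementation approach follow the TIS software ecosystem as closely as possible.

The sections below introduce the TIS IGIE Backend architecture. To follow them more easily, first read through the relevant docs in the triton-inference-server repository.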

IGIE Backend for Triton Inference Server

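The figure below shows the overall architecture of Triton Inference Server. At startup, the server scans the deployed model repository and detects the configured backends; then, based on each model's configuration file, it calls the corresponding backend to load the model. A client sends inference requests to the server through one of three interfaces: HTTP, gRPC, or the C API. The server parses the request payload, schedules it internally, routes each inference request to its target model, and decides from the model configuration file which hardware (CPU/GPU) to run on. When inference finishes, it packs the results in a defined format and returns them to the client.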
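As a small illustration of the startup and loading flow described here, a client can ask the server whether it is up and whether a given model has finished loading. The sketch below relies only on the health methods `is_server_live`, `is_server_ready`, and `is_model_ready` of `tritonclient.http.InferenceServerClient`. The `StubClient` class is a hypothetical offline stand-in (not part of Triton) so that the example runs without a server.

```python
def check_deployment(client, model_name):
    """Ask the server whether it is up and whether one model finished loading.

    `client` is expected to expose the health methods of
    tritonclient.http.InferenceServerClient that are used below.
    """
    return {
        "server_live": client.is_server_live(),
        "server_ready": client.is_server_ready(),
        "model_ready": client.is_model_ready(model_name),
    }


class StubClient:
    """Hypothetical offline stand-in for a real client (not part of Triton)."""

    def __init__(self, loaded_models):
        self._loaded = set(loaded_models)

    def is_server_live(self):
        return True

    def is_server_ready(self):
        return True

    def is_model_ready(self, model_name):
        return model_name in self._loaded


status = check_deployment(StubClient(["mobilenet_v3"]), "mobilenet_v3")
print(status)
```

Against a running server, `StubClient(...)` would be replaced by `tritonclient.http.InferenceServerClient(url="localhost:8000")`, where the URL is an assumption about the deployment.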

IGIE Backend is the backend implementation for Iluvatar (天数) GPUs.

Why TIS IGIE Backend?

TIS IGIE Backend supports the following features; more will be added over time as real-world use requires.

Concurrent inference requests from multiple clients
Multiple cards and multiple models
Dynamic-batch models and non-batch models
Multi-input, multi-output models
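To make the first feature concrete, the sketch below fans several requests out over a thread pool, which is one simple way to exercise concurrent client requests. The helper name `infer_concurrently` and the `fake_infer` stand-in are hypothetical; in real use `infer_fn` would wrap a call such as `triton_client.infer(model_name, inputs)`.

```python
from concurrent.futures import ThreadPoolExecutor


def infer_concurrently(infer_fn, payloads, max_workers=4):
    """Call infer_fn(payload) for every payload on a thread pool.

    Results are returned in the same order as the payloads.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(infer_fn, payloads))


def fake_infer(payload):
    # Hypothetical stand-in for a real inference call; it just sums the input.
    return sum(payload)


results = infer_concurrently(fake_infer, [[1, 2], [3, 4], [5]])
print(results)
```

The Triton Python client documentation advises against sharing one client object across threads, so a real `infer_fn` would likely create or look up a per-thread client rather than reuse a global one.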

How to Play?

TIS IGIE Backend is straightforward to use. With the httpclient interface provided by Triton, we can send a model inference request to the server. The core code is:

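import numpy as np
import tritonclient.http as httpclient

# Placeholder values: the server URL, input name, and shape depend on your deployment.
triton_client = httpclient.InferenceServerClient(url="localhost:8000")
model_name = "mobilenet_v3"
input_name, shape = "input", [1, 3, 224, 224]
input_data = np.random.rand(*shape).astype(np.float32)

inputs = [httpclient.InferInput(input_name, shape, "FP32")]
inputs[0].set_data_from_numpy(input_data)

response = triton_client.infer(model_name, inputs)
result_meta = response.get_response()  # response metadata as a dict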

For more model examples, see the out-of-the-box testing section of play_TIS_from_scratch.md.

FAQs

1. batch model vs. non-batch model?

TIS IGIE Backend supports both batch models and non-batch models. For the latter, the user must set max_batch_size=0 in the model's config.pbtxt.

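In some scenarios, a model's inputs do not share the same first dimension. Take CLIP, for example: its pixel_values input can be viewed as 32 batches of data packed together, yet input_ids has dimensions [1000, 22], so CLIP's input cannot simply be treated as batch size 32. We handle such a model as a non-batch model: set max_batch_size=0 in config.pbtxt and fill in each input's concrete shape in its dims field.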
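The non-batch setup can be sketched by generating a minimal config.pbtxt from Python. This is an illustration under assumptions, not the repository's actual configuration: the model name `clip`, the backend name `igie`, the trailing dimensions of `pixel_values`, the data types, and the output tensor are all hypothetical. Only `max_batch_size: 0` and the `[1000, 22]` shape of `input_ids` come from the text.

```python
def render_tensor(name, data_type, dims):
    """Render one input/output entry of a Triton model configuration."""
    dims_text = ", ".join(str(d) for d in dims)
    return "\n".join([
        "  {",
        f'    name: "{name}"',
        f"    data_type: {data_type}",
        f"    dims: [ {dims_text} ]",
        "  }",
    ])


def render_config(name, backend, max_batch_size, inputs, outputs):
    """Render a minimal config.pbtxt as text."""
    def section(keyword, tensors):
        body = ",\n".join(render_tensor(*t) for t in tensors)
        return f"{keyword} [\n{body}\n]"

    return "\n".join([
        f'name: "{name}"',
        f'backend: "{backend}"',
        f"max_batch_size: {max_batch_size}",
        section("input", inputs),
        section("output", outputs),
    ]) + "\n"


clip_config = render_config(
    name="clip",            # hypothetical model name
    backend="igie",         # assumed backend name
    max_batch_size=0,       # non-batch model: dims carry the full shapes
    inputs=[
        ("pixel_values", "TYPE_FP32", [32, 3, 224, 224]),  # trailing dims assumed
        ("input_ids", "TYPE_INT64", [1000, 22]),           # data type assumed
    ],
    outputs=[("logits_per_image", "TYPE_FP32", [32, 1000])],  # hypothetical output
)
print(clip_config)
```

Because max_batch_size is 0, every entry in dims is the complete tensor shape, including what would otherwise be the batch dimension.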

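By contrast, a model like ResNet50 has a single input whose first dimension is the batch size, which may be 8, 16, 32, and so on. In this case, set the maximum supported batch in its config.pbtxt, for example max_batch_size=32; when ResNet50 is deployed, the *.so file is also generated for the maximum batch of 32. When a client request carries fewer samples than max_batch_size, the missing entries are zero-padded so that the model can still run inference. Before returning the response, the server discards the results computed from the zero padding and keeps only the portion matching the batch size the client actually submitted.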
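The pad-then-truncate behavior can be sketched in a few lines of plain Python. This is a conceptual illustration only, not the backend's implementation: `pad_batch`, `truncate_outputs`, and `fake_model` are hypothetical names, and `fake_model` stands in for an engine compiled for a fixed batch of 32.

```python
MAX_BATCH_SIZE = 32


def pad_batch(samples, max_batch_size, zero_sample):
    """Append copies of zero_sample until the batch has max_batch_size rows.

    Returns the padded batch and the number of real samples.
    """
    if len(samples) > max_batch_size:
        raise ValueError("request exceeds max_batch_size")
    padding = [list(zero_sample) for _ in range(max_batch_size - len(samples))]
    return list(samples) + padding, len(samples)


def truncate_outputs(outputs, real_batch_size):
    """Keep only the results that belong to samples the client submitted."""
    return outputs[:real_batch_size]


def fake_model(batch):
    # Hypothetical fixed-batch engine: it accepts exactly MAX_BATCH_SIZE rows
    # and returns one value (the row sum) per row.
    if len(batch) != MAX_BATCH_SIZE:
        raise ValueError("engine expects a full batch")
    return [sum(row) for row in batch]


request = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]  # client sends 3 samples
padded, real = pad_batch(request, MAX_BATCH_SIZE, zero_sample=[0.0, 0.0])
results = truncate_outputs(fake_model(padded), real)
print(results)  # [3.0, 7.0, 11.0]
```

The engine always sees a full batch of 32 rows, while the client receives exactly as many results as it sent samples.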
