
Support TensorRT conversion and serving feature #32

@redrussianarmy

Description


I realized that TensorFlow Lite does not support inference on an Nvidia GPU. I am using an Nvidia Jetson Xavier device. My current inference runs with the unoptimized transformers model on the GPU, which is faster than inference with the TF Lite model on the CPU.

After some research, I found two model-optimization options: TensorRT and TF-TRT. I made several attempts to convert the fine-tuned transformers model to TensorRT, but I could not get it to work. It would be great if dialog-nlu supported TensorRT conversion and serving. A rough sketch of the TF-TRT route is below.
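For reference, a minimal TF-TRT conversion sketch, not tested end to end with dialog-nlu. It assumes the fine-tuned model has already been exported as a TensorFlow SavedModel and that TensorFlow was built with TensorRT support (as in the JetPack / NGC builds on Jetson). The directory names are placeholders:

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

saved_model_dir = "models/joint_bert_saved_model"  # placeholder: exported fine-tuned model
trt_model_dir = "models/joint_bert_trt_fp16"       # placeholder: output directory

# FP16 is assumed here only because it is usually a reasonable default on Xavier;
# INT8 would additionally require a calibration dataset.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=1 << 30,
)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=saved_model_dir,
    conversion_params=params,
)
converter.convert()            # replaces supported subgraphs with TensorRT ops
converter.save(trt_model_dir)  # writes a regular SavedModel with TRT engines

# The converted model loads and serves like any other SavedModel:
loaded = tf.saved_model.load(trt_model_dir)
infer = loaded.signatures["serving_default"]
```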


Labels: feature (Request for new feature)
