
Support TensorRT conversion and serving feature #32

@redrussianarmy

Description


I realized that TensorFlow Lite does not support inference on an Nvidia GPU. I am using an Nvidia Jetson Xavier device. My current inference runs with the unoptimized transformers model on the GPU, which is faster than inference with the TF Lite model on the CPU.

After some research, I found two model-optimization options: TensorRT and TF-TRT. I made several attempts to convert the fine-tuned transformers model to TensorRT, but I could not get it to work. It would be great if dialog-nlu supported TensorRT conversion and serving. A rough sketch of the TF-TRT route is below.
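For reference, a minimal TF-TRT conversion sketch, not tested end to end with dialog-nlu. It assumes the fine-tuned model has already been exported as a TensorFlow SavedModel and that TensorFlow was built with TensorRT support (as in the JetPack / NGC builds on Jetson). The directory names are placeholders:

```python
import tensorflow as tf
from tensorflow.python.compiler.tensorrt import trt_convert as trt

saved_model_dir = "models/joint_bert_saved_model"  # placeholder: exported fine-tuned model
trt_model_dir = "models/joint_bert_trt_fp16"       # placeholder: output directory

# FP16 is assumed here only because it is usually a reasonable default on Xavier;
# INT8 would additionally require a calibration dataset.
params = trt.DEFAULT_TRT_CONVERSION_PARAMS._replace(
    precision_mode=trt.TrtPrecisionMode.FP16,
    max_workspace_size_bytes=1 << 30,
)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=saved_model_dir,
    conversion_params=params,
)
converter.convert()            # replaces supported subgraphs with TensorRT ops
converter.save(trt_model_dir)  # writes a regular SavedModel with TRT engines

# The converted model loads and serves like any other SavedModel:
loaded = tf.saved_model.load(trt_model_dir)
infer = loaded.signatures["serving_default"]
```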


Labels: feature (Request for new feature)
