Label: feature (Request for new feature)
Description
I realized that TensorFlow Lite does not support inference on NVIDIA GPUs. I have an NVIDIA Jetson Xavier device, and my current setup runs inference with the unoptimized Transformers model on the GPU, which is already faster than inference with the TF Lite model on the CPU.
After researching, I found two optimization approaches: TensorRT and TF-TRT. I made several attempts to convert a fine-tuned Transformers model to TensorRT but could not get it to work. It would be great if dialog-nlu supported TensorRT conversion and serving.
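For reference, a minimal TF-TRT conversion sketch using TensorFlow's built-in converter, assuming the fine-tuned model has already been exported as a SavedModel. The paths and precision mode below are placeholders, not part of dialog-nlu:

```python
# Hedged sketch: convert a SavedModel with TF-TRT (paths are hypothetical).
from tensorflow.python.compiler.tensorrt import trt_convert as trt

# FP16 is a common choice on Jetson-class GPUs; adjust as needed.
params = trt.TrtConversionParams(precision_mode=trt.TrtPrecisionMode.FP16)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir="saved_model/",   # exported fine-tuned model (assumed)
    conversion_params=params,
)
converter.convert()
converter.save("saved_model_trt/")          # TF-TRT optimized SavedModel
```

The resulting SavedModel can then be loaded with `tf.saved_model.load` and served like any other TensorFlow model; whether this fits dialog-nlu's serving path is exactly what this request is about.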