Hi @HaohaoNJU, thank you very much for the outstanding work. I have two questions:
(1) The terms "explicit quantization (trtexec)" and "implicit quantization (generate_calib_data.py)" used in this work differ from NVIDIA's definitions (explicit: PTQ/QAT with Q/DQ layers; implicit: PTQ). Both approaches here appear to be implicit quantization. Could this be an error?
(2) According to the Metrics, does "explicit quantization (trtexec)" without calibration files actually outperform "implicit quantization" with calibration?
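
To make question (1) concrete, here is a minimal sketch of the distinction as I understand it from NVIDIA's definitions: a network counts as explicitly quantized only when it contains Q/DQ layers (`QuantizeLinear` / `DequantizeLinear` in ONNX). The helper `classify_int8_mode` is hypothetical and not part of this repository or of TensorRT. It assumes INT8 has already been requested for the build, and it takes a plain list of operator type names as input.

```python
# Hypothetical helper for illustration only; not part of this repo or TensorRT.
# Assumption: INT8 precision has already been requested for the engine build.

# ONNX operator names for the Q/DQ layers that mark explicit quantization.
QDQ_OPS = {"QuantizeLinear", "DequantizeLinear"}


def classify_int8_mode(op_types):
    """Return "explicit" if the graph contains Q/DQ nodes, else "implicit"."""
    if QDQ_OPS & set(op_types):
        return "explicit"
    return "implicit"


# A plain FP32 graph built with INT8 enabled has no Q/DQ nodes.
plain_graph = ["Conv", "Relu", "MaxPool"]
# A graph exported from a QAT or Q/DQ-based PTQ flow carries Q/DQ nodes.
qdq_graph = ["QuantizeLinear", "DequantizeLinear", "Conv", "Relu"]

print(classify_int8_mode(plain_graph))
print(classify_int8_mode(qdq_graph))
```

With the `onnx` package installed, the operator list for a real model could be collected with `[n.op_type for n in onnx.load(path).graph.node]`. If the ONNX files used in both flows here have no Q/DQ nodes, both would fall under "implicit" by this definition.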