Two quantization-related questions

Hi @HaohaoNJU, thank you very much for the outstanding work. I have two questions:

(1) The concepts of "explicit quantization (trtexec)" and "implicit quantization (generate_calib_data.py)" used in this work differ from NVIDIA's definitions (explicit: PTQ/QAT with Q/DQ layers; implicit: PTQ). Both approaches here appear to be implicit quantization. Could this be a mistake?
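To make the distinction concrete, here is a minimal sketch of how I understand NVIDIA's definitions: a network counts as explicitly quantized only if its graph contains Q/DQ nodes (`QuantizeLinear` / `DequantizeLinear` in ONNX). The helper name `quantization_mode` and the example op lists are hypothetical and not taken from this repository.

```python
from typing import Iterable

# ONNX op types that mark explicit (Q/DQ) quantization.
QDQ_OPS = {"QuantizeLinear", "DequantizeLinear"}


def quantization_mode(op_types: Iterable[str]) -> str:
    """Return 'explicit' if any Q/DQ node is present, otherwise 'implicit'.

    Hypothetical helper: it only inspects op type names, nothing else.
    """
    return "explicit" if QDQ_OPS & set(op_types) else "implicit"


# Made-up example graphs, given as lists of op types.
plain_graph = ["Conv", "Relu", "MaxPool"]
qdq_graph = ["QuantizeLinear", "DequantizeLinear", "Conv", "Relu"]

print(quantization_mode(plain_graph))  # implicit
print(quantization_mode(qdq_graph))    # explicit
```

For a real model, the op types could be collected with the `onnx` package, e.g. `[n.op_type for n in onnx.load(path).graph.node]`, where `path` is a placeholder for the exported ONNX file.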

(2) According to the Metrics, does "explicit quantization (trtexec) without calibration files" actually perform better than "implicit quantization with calibration"?



Two quantization-related questions #46
