Two quantization-related questions #46

@deepmeng

Hi @HaohaoNJU, thank you very much for the outstanding work. I have two questions:

(1) The terms "explicit quantization (trtexec)" and "implicit quantization (generate_calib_data.py)" used in this work differ from NVIDIA's definitions, where explicit quantization means PTQ or QAT with Q/DQ layers in the network and implicit quantization means calibration-based PTQ without Q/DQ layers. Both approaches here appear to be implicit quantization. Could this be a mistake?

(2) According to the Metrics, does "explicit quantization (trtexec) without calibration files" actually outperform "implicit quantization with calibration"?
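
To make question (1) concrete: as I understand NVIDIA's definitions, TensorRT treats a network as explicitly quantized when it contains Q/DQ layers. Below is a minimal sketch of how one might check an exported ONNX model for them. The function names and the `model.onnx` path are hypothetical and not part of this repo, and a model without Q/DQ nodes could be either implicitly quantized at build time or not quantized at all, so the second label is deliberately vague.

```python
from typing import Iterable, List

# ONNX operator names that mark explicit quantization boundaries (Q/DQ).
QDQ_OPS = {"QuantizeLinear", "DequantizeLinear"}


def classify_quantization(op_types: Iterable[str]) -> str:
    """Return "explicit" if any Q/DQ op is present, else "implicit-or-none"."""
    has_qdq = any(op in QDQ_OPS for op in op_types)
    return "explicit" if has_qdq else "implicit-or-none"


def onnx_op_types(path: str) -> List[str]:
    """Collect the op types of all nodes in an ONNX file (needs the `onnx` package)."""
    import onnx  # imported lazily so the classifier itself has no dependency

    return [node.op_type for node in onnx.load(path).graph.node]
```

With the `onnx` package installed, `classify_quantization(onnx_op_types("model.onnx"))` would report which case an exported model falls into.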
