Commit 3578e79

[Model] Keep vision encoder weights unquantized to maintain accuracy (#3028)

This PR excludes vision encoder layers from quantization: CLIPVisionModel declares a no_quantization flag, and the group quantizer returns any module carrying that flag unchanged, preserving accuracy for models with vision components.
Parent: 967fb76

File tree

2 files changed (+5, -0 lines)

python/mlc_llm/model/vision/clip_vision.py

Lines changed: 2 additions & 0 deletions

@@ -218,6 +218,8 @@ def forward(self, pixel_values: Tensor) -> Tensor:
 
 
 class CLIPVisionModel(Module):
+    no_quantization: bool = True
+
     def __init__(self, config: CLIPVisionConfig):
         super().__init__()
         self.vision_model = CLIPVisionTransformer(config)
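
Declaring no_quantization as a class attribute (rather than assigning it in __init__) means every CLIPVisionModel instance, including any subclass, carries the marker without further changes; the quantizer hunk below reads it with getattr, so modules that never define the attribute keep their existing behavior.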

python/mlc_llm/quantization/group_quantization.py

Lines changed: 3 additions & 0 deletions

@@ -111,6 +111,9 @@ def visit_module(self, name: str, node: nn.Module) -> Any:
         ret_node: Any
             The new node to replace current node.
         """
+        if getattr(node, "no_quantization", False):
+            return node
+
         if (
             isinstance(node, nn.Linear)
             and (not is_final_fc(name) or self.config.quantize_final_fc)
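
Taken together, the two hunks implement a simple opt-out: visit_module returns any module flagged no_quantization before the usual nn.Linear rewrite can run, so the vision encoder's weights stay in full precision. Below is a minimal, self-contained sketch of the same pattern; Module, Linear, QuantizedLinear, VisionEncoder, and this standalone visit_module are illustrative stand-ins, not MLC LLM's actual classes.

# Toy stand-ins for the opt-out pattern (hypothetical names, not MLC LLM's API).

class Module:
    """Toy base class standing in for a neural-network module."""


class Linear(Module):
    def __init__(self, in_features: int, out_features: int) -> None:
        self.in_features = in_features
        self.out_features = out_features


class QuantizedLinear(Module):
    """Stands in for the group-quantized replacement of a Linear layer."""

    def __init__(self, original: Linear) -> None:
        self.in_features = original.in_features
        self.out_features = original.out_features


class VisionEncoder(Module):
    # Same marker the commit adds to CLIPVisionModel: the quantizer
    # sees this attribute and leaves the module untouched.
    no_quantization: bool = True

    def __init__(self) -> None:
        self.proj = Linear(768, 768)


def visit_module(node: Module) -> Module:
    # The opt-out check runs first, mirroring the group_quantization.py
    # hunk: flagged modules are returned unchanged.
    if getattr(node, "no_quantization", False):
        return node
    # Otherwise a Linear layer is rewritten into its quantized form.
    if isinstance(node, Linear):
        return QuantizedLinear(node)
    return node


assert isinstance(visit_module(Linear(16, 16)), QuantizedLinear)  # quantized
assert isinstance(visit_module(VisionEncoder()), VisionEncoder)   # skipped

Because the flag is read with getattr(node, "no_quantization", False), the default path is unchanged: only modules that explicitly opt out, such as the vision encoder here, bypass quantization.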
