LIM/ZD Score Computation Implementation? #2

@ubergarm

Greetings Razvan,

Thank you and your colleagues for the paper Layer-wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-levels.

I have two questions:

1

I'm curious whether you still plan to release any example implementations for computing the Layer Input Modification (LIM) score and Z-score Distribution (ZD) on LLMs such as the dense Llama-2-13B model presented in the paper. I'm also wondering whether there has been any more recent exploration around newer MoE models such as DeepSeek-R1/-V3-0324.

Bonus if the code works on existing q8_0 GGUF quants haha...

No pressure if you've already moved on. I have some interest in evaluating these methods, given that it is becoming easier to specify per-layer quantization schemes, e.g. via ik_llama.cpp's new `llama-quantize --custom-q` feature.

2

> Our first score, named layer input modification (LIM), is based on how much a layer changes its input representations into the output ones.

Just confirming that I understand correctly: both the LIM and ZD scores are calculated by taking all tensors of a given layer into account (e.g. `attn_(v|q|k|output)` and `ffn_(up|down|gate)`), and not just individual tensors?
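To make sure I'm asking the right question, here is a rough sketch of how I currently read the two definitions. This is my own interpretation, not code from the paper: I'm assuming LIM is the negative cosine similarity between a layer's input and output hidden states (averaged over tokens), and that ZD is the fraction of a layer's weights whose z-score exceeds some threshold in absolute value. Both function names and the threshold default are mine.

```python
import numpy as np

def lim_score(layer_input: np.ndarray, layer_output: np.ndarray) -> float:
    """Hypothetical LIM sketch: negative cosine similarity between the
    per-token hidden states entering and leaving a transformer layer,
    averaged over tokens. Inputs have shape (num_tokens, hidden_dim).
    A larger value would mean the layer modifies its input more."""
    num = np.sum(layer_input * layer_output, axis=-1)
    den = (np.linalg.norm(layer_input, axis=-1)
           * np.linalg.norm(layer_output, axis=-1))
    return float(np.mean(-num / den))

def zd_score(weights: np.ndarray, threshold: float = 1.0) -> float:
    """Hypothetical ZD sketch: fraction of a layer's (flattened) weights
    whose z-score exceeds `threshold` in absolute value."""
    z = (weights - weights.mean()) / weights.std()
    return float(np.mean(np.abs(z) > threshold))
```

Under this reading, `lim_score` returns -1.0 when a layer leaves its input direction unchanged and +1.0 when it flips it, which is where my question about aggregating over all of a layer's tensors comes from.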

Thanks!
