Greetings Razvan,
Thank you and your colleagues for the paper Layer-Wise Quantization: A Pragmatic and Effective Method for Quantizing LLMs Beyond Integer Bit-Levels.
I have two questions:
1. I'm curious whether you still plan to release any example implementations of computing the Layer Input Modification (LIM) score and Z-score Distribution (ZD) on LLMs such as the presented dense model `Llama-2-13B`? Also wondering whether there has been any more recent exploration of newer MoE models such as `DeepSeek-R1`/`-V3-0324`? Bonus if the code works on existing `q8_0` GGUF quants haha... No pressure if you've already moved on; I have some interest in evaluating these methods given that it is becoming easier to specify per-layer quantization schemes, e.g. ik_llama.cpp's new `llama-quantize --custom-q` feature.
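For concreteness, here's roughly what I had in mind for the LIM side on the dense model — a minimal sketch assuming LIM is the negated cosine similarity between each decoder layer's input and output hidden states, averaged over tokens (that's just my reading of the paper, please correct me if the actual formulation differs). The HF model id and the single prompt are placeholders; in practice I'd average over a small calibration set:

```python
import torch
import torch.nn.functional as F
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Llama-2-13b-hf"  # placeholder; any dense HF causal LM works
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

inputs = tok("The quick brown fox jumps over the lazy dog.", return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# hidden_states[i] is the input to decoder layer i, hidden_states[i+1] its output.
hs = out.hidden_states
for i in range(len(hs) - 1):
    x_in = hs[i][0].float()       # [seq_len, hidden_dim]
    x_out = hs[i + 1][0].float()
    # Assumed LIM: negative per-token cosine similarity between a layer's
    # input and output representations, averaged over the sequence
    # (the more a layer changes its input, the higher the score).
    lim = -F.cosine_similarity(x_in, x_out, dim=-1).mean()
    print(f"layer {i:2d}  LIM ≈ {lim.item():+.4f}")
```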
2.
> Our first score, named layer input modification (LIM), is based on how much a layer changes its input representations into the output ones.

Just confirming that I understand correctly that both the LIM and ZD scores are calculated taking all tensors of a given layer into account (e.g. `attn_(v|q|k|output)` and `ffn_(up|down|gate)`) and not just individual tensors?
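For question 2, this is how I'd currently interpret the per-layer ZD score — pooling all of a layer's weight tensors before computing z-scores, with a |z| > 1 threshold. Both the pooling (versus averaging per-tensor ZDs) and the threshold are my assumptions, which is exactly what I'm hoping you can confirm or correct:

```python
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-13b-hf", torch_dtype=torch.float16  # placeholder model id
)

def zd_score(tensors, threshold=1.0):
    """Fraction of weights whose |z-score| exceeds `threshold`, pooled over
    all tensors of one decoder layer (my per-layer interpretation)."""
    w = torch.cat([t.detach().float().flatten() for t in tensors])
    z = (w - w.mean()) / w.std()
    return (z.abs() > threshold).float().mean().item()

for i, layer in enumerate(model.model.layers):
    # All attention and FFN projection weights of this layer, pooled together.
    tensors = [
        layer.self_attn.q_proj.weight, layer.self_attn.k_proj.weight,
        layer.self_attn.v_proj.weight, layer.self_attn.o_proj.weight,
        layer.mlp.gate_proj.weight, layer.mlp.up_proj.weight,
        layer.mlp.down_proj.weight,
    ]
    print(f"layer {i:2d}  ZD ≈ {zd_score(tensors):.4f}")
```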
Thanks!