
Commit cc54abc
committed on 2024-11-04
1 parent bd810ba

2 files changed: +21 additions, -2 deletions

papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+    {
+        "title": "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models",
+        "author": "Guangxuan Xiao et al",
+        "year": "2023",
+        "topic": "llm, quantization, activations",
+        "venue": "ICML",
+        "description": "The key insight of SmoothQuant is that in large language models, while weights are relatively easy to quantize, activations are much harder due to outliers. They observed that these outliers persistently appear in specific channels across different tokens, suggesting that the difficulty could be redistributed. Their solution is to mathematically transform the model by scaling down problematic activation channels while scaling up the corresponding weight channels proportionally, which maintains mathematical equivalence while making both weights and activations easier to quantize. This \"difficulty migration\" approach allows them to balance the quantization challenges between weights and activations using a tunable parameter α, rather than having all the difficulty concentrated in the activation values.",
+        "link": "https://arxiv.org/pdf/2211.10438"
+    },
     {
         "title": "ESPACE: Dimensionality Reduction of Activations for Model Compression",
         "author": "Charbel Sakr et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
         I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
     </p>
     <p id="paperCount">
-        So far, we have read 151 papers. Let's keep it up!
+        So far, we have read 152 papers. Let's keep it up!
     </p>
     <small id="searchCount">
-        Your search returned 151 papers. Nice!
+        Your search returned 152 papers. Nice!
     </small>

     <div class="search-inputs">
@@ -105,6 +105,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
         </thead>
         <tbody>

+            <tr>
+                <td>SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models</td>
+                <td>Guangxuan Xiao et al</td>
+                <td>2023</td>
+                <td>llm, quantization, activations</td>
+                <td>ICML</td>
+                <td>The key insight of SmoothQuant is that in large language models, while weights are relatively easy to quantize, activations are much harder due to outliers. They observed that these outliers persistently appear in specific channels across different tokens, suggesting that the difficulty could be redistributed. Their solution is to mathematically transform the model by scaling down problematic activation channels while scaling up the corresponding weight channels proportionally, which maintains mathematical equivalence while making both weights and activations easier to quantize. This &quot;difficulty migration&quot; approach allows them to balance the quantization challenges between weights and activations using a tunable parameter α, rather than having all the difficulty concentrated in the activation values.</td>
+                <td><a href="https://arxiv.org/pdf/2211.10438" target="_blank">Link</a></td>
+            </tr>
+
             <tr>
                 <td>ESPACE: Dimensionality Reduction of Activations for Model Compression</td>
                 <td>Charbel Sakr et al</td>