
Commit b37bcdb

Committed on 2024-11-11
1 parent f745446 commit b37bcdb

File tree

2 files changed (+21, -2 lines)


papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+    {
+        "title": "SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion",
+        "author": "Muyang Li et al",
+        "year": "2024",
+        "topic": "quantization, diffusion",
+        "venue": "Arxiv",
+        "description": "SVDQuant introduces a novel approach to 4-bit quantization of diffusion models by using a low-rank branch to absorb outliers in both weights and activations, making quantization more feasible at such aggressive bit reduction. The method first consolidates outliers from activations to weights through smoothing, then decomposes the weights using Singular Value Decomposition (SVD) to separate the dominant components into a 16-bit low-rank branch while keeping the residual in 4 bits. To make this practical, they developed an inference engine called Nunchaku that fuses the low-rank and low-bit branch kernels together, eliminating redundant memory access that would otherwise negate the performance benefits. The approach is designed to work across different diffusion model architectures and can seamlessly integrate with existing low-rank adapters (LoRAs) without requiring re-quantization.",
+        "link": "https://arxiv.org/pdf/2411.05007"
+    },
     {
         "title": "One Weight Bitwidth to Rule Them All",
         "author": "Ting-Wu Chin et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
 </p>
 <p id="paperCount">
-    So far, we have read 164 papers. Let's keep it up!
+    So far, we have read 165 papers. Let's keep it up!
 </p>
 <small id="searchCount">
-    Your search returned 164 papers. Nice!
+    Your search returned 165 papers. Nice!
 </small>

 <div class="search-inputs">
@@ -105,6 +105,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
 </thead>
 <tbody>

+    <tr>
+        <td>SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion</td>
+        <td>Muyang Li et al</td>
+        <td>2024</td>
+        <td>quantization, diffusion</td>
+        <td>Arxiv</td>
+        <td>SVDQuant introduces a novel approach to 4-bit quantization of diffusion models by using a low-rank branch to absorb outliers in both weights and activations, making quantization more feasible at such aggressive bit reduction. The method first consolidates outliers from activations to weights through smoothing, then decomposes the weights using Singular Value Decomposition (SVD) to separate the dominant components into a 16-bit low-rank branch while keeping the residual in 4 bits. To make this practical, they developed an inference engine called Nunchaku that fuses the low-rank and low-bit branch kernels together, eliminating redundant memory access that would otherwise negate the performance benefits. The approach is designed to work across different diffusion model architectures and can seamlessly integrate with existing low-rank adapters (LoRAs) without requiring re-quantization.</td>
+        <td><a href="https://arxiv.org/pdf/2411.05007" target="_blank">Link</a></td>
+    </tr>
+
 <tr>
     <td>One Weight Bitwidth to Rule Them All</td>
     <td>Ting-Wu Chin et al</td>
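
The paper description added above names two of SVDQuant's steps: an SVD split that moves the dominant weight components into a 16-bit low-rank branch, and 4-bit quantization of the residual. As a rough NumPy sketch of just that decomposition (this is my own illustration, not the paper's implementation; the smoothing step and the Nunchaku fused kernels are omitted, and `rank` and `n_bits` are arbitrary choices):

```python
import numpy as np

def svd_lowrank_plus_quant_residual(W, rank=16, n_bits=4):
    """Split W into a full-precision low-rank branch plus a
    uniformly quantized low-bit residual (illustrative only)."""
    # Low-rank branch: top singular components, kept in high precision.
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :rank] * S[:rank]) @ Vt[:rank]

    # Residual after removing the dominant components.
    R = W - L

    # Naive symmetric uniform quantization of the residual to n_bits.
    qmax = 2 ** (n_bits - 1) - 1
    max_abs = np.abs(R).max()
    scale = max_abs / qmax if max_abs > 0 else 1.0
    R_q = np.clip(np.round(R / scale), -qmax - 1, qmax)

    # Dequantized approximation: low-rank branch + quantized residual.
    return L + R_q * scale

W = np.random.randn(64, 64)
W_hat = svd_lowrank_plus_quant_residual(W)
err = np.linalg.norm(W - W_hat) / np.linalg.norm(W)
```

Because the large singular components live in the full-precision branch, only the small-magnitude residual has to survive 4-bit rounding, which is why the relative reconstruction error `err` stays small even at this aggressive bitwidth.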

0 commit comments
