Commit b6aba37

Updated on 2024-11-04

committed (1 parent: cc54abc)

File tree

2 files changed: +21, -2 lines changed


papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+    {
+        "title": "LLM-Pruner: On the Structural Pruning of Large Language Models",
+        "author": "Xinyin Ma et al",
+        "year": "2023",
+        "topic": "llm, structural pruning",
+        "venue": "Arxiv",
+        "description": "The authors introduce LLM-Pruner, a novel approach for compressing large language models that operates in a task-agnostic manner while requiring minimal access to the original training data. Their key insight is to first automatically identify groups of interdependent neural structures within the LLM by analyzing dependency patterns, ensuring that coupled structures are pruned together to maintain model coherence. The method then estimates the importance of these structural groups using both first-order gradients and approximated Hessian information from a small set of calibration samples, allowing them to selectively remove less critical groups while preserving the model's core functionality. Finally, they employ a rapid recovery phase using low-rank adaptation (LoRA) to fine-tune the pruned model with a limited dataset in just a few hours, enabling efficient compression while maintaining the LLM's general-purpose capabilities.",
+        "link": "https://arxiv.org/pdf/2305.11627"
+    },
     {
         "title": "SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models",
         "author": "Guangxuan Xiao et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
       I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
     </p>
     <p id="paperCount">
-      So far, we have read 152 papers. Let's keep it up!
+      So far, we have read 153 papers. Let's keep it up!
     </p>
     <small id="searchCount">
-      Your search returned 152 papers. Nice!
+      Your search returned 153 papers. Nice!
     </small>

     <div class="search-inputs">
@@ -105,6 +105,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     </thead>
     <tbody>

+      <tr>
+        <td>LLM-Pruner: On the Structural Pruning of Large Language Models</td>
+        <td>Xinyin Ma et al</td>
+        <td>2023</td>
+        <td>llm, structural pruning</td>
+        <td>Arxiv</td>
+        <td>The authors introduce LLM-Pruner, a novel approach for compressing large language models that operates in a task-agnostic manner while requiring minimal access to the original training data. Their key insight is to first automatically identify groups of interdependent neural structures within the LLM by analyzing dependency patterns, ensuring that coupled structures are pruned together to maintain model coherence. The method then estimates the importance of these structural groups using both first-order gradients and approximated Hessian information from a small set of calibration samples, allowing them to selectively remove less critical groups while preserving the model&#x27;s core functionality. Finally, they employ a rapid recovery phase using low-rank adaptation (LoRA) to fine-tune the pruned model with a limited dataset in just a few hours, enabling efficient compression while maintaining the LLM&#x27;s general-purpose capabilities.</td>
+        <td><a href="https://arxiv.org/pdf/2305.11627" target="_blank">Link</a></td>
+      </tr>
+
       <tr>
         <td>SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models</td>
         <td>Guangxuan Xiao et al</td>
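The paper count and the new table row above mirror data that already lives in papers/list.json; this commit keeps the two files in sync by hand. Here is a small sketch of how the count line and the <tr> rows could instead be generated from the JSON; the generator is hypothetical, not part of this repo, and assumes the schema shown in the list.json diff.

# render_rows.py -- hypothetical sketch, not part of this repo.
# Emits the paperCount sentence and one <tr> per entry in papers/list.json,
# matching the row layout used in papers_read.html.
import html
import json

with open("papers/list.json", encoding="utf-8") as f:
    papers = json.load(f)

print(f"So far, we have read {len(papers)} papers. Let's keep it up!")

for p in papers:
    print("<tr>")
    for key in ("title", "author", "year", "topic", "venue", "description"):
        print(f"  <td>{html.escape(p[key])}</td>")
    print(f'  <td><a href="{p["link"]}" target="_blank">Link</a></td>')
    print("</tr>")

Note that html.escape encodes apostrophes as &#x27;, which is the same entity that appears in the description cell of the diff above.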

0 commit comments
