+ "description": "The authors introduce LLM-Pruner, a novel approach for compressing large language models that operates in a task-agnostic manner while requiring minimal access to the original training data. Their key insight is to first automatically identify groups of interdependent neural structures within the LLM by analyzing dependency patterns, ensuring that coupled structures are pruned together to maintain model coherence. The method then estimates the importance of these structural groups using both first-order gradients and approximated Hessian information from a small set of calibration samples, allowing them to selectively remove less critical groups while preserving the model's core functionality. Finally, they employ a rapid recovery phase using low-rank adaptation (LoRA) to fine-tune the pruned model with a limited dataset in just a few hours, enabling efficient compression while maintaining the LLM's general-purpose capabilities.",