
Commit 35c10c1

Updated on 2024-12-12
1 parent 3894659 commit 35c10c1

File tree

3 files changed, +22 -3 lines changed


index.html

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ <h1>Where?</h1>
 </p>
 <h1>When?</h1>
 <p>
-  Last time this was edited was 2024-12-07 (YYYY/MM/DD).
+  Last time this was edited was 2024-12-12 (YYYY/MM/DD).
 </p>
 <small><a href="misc.html">misc</a></small>
 </div>

papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+  {
+    "title": "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch",
+    "author": "Le Yu et al",
+    "year": "2024",
+    "topic": "model merging",
+    "venue": "ICML",
+    "description": "This paper shows that language models (LMs) can acquire new abilities by assimilating parameters from homologous models. The authors note that LMs have many redundant delta parameters after Supervised Fine-Tuning (SFT), i.e., the change in model parameters between the pre- and post-SFT models. They present DARE (Drop And REscale), which sets delta parameters to zero with a drop rate p and rescales the remaining ones by a factor of 1/(1-p). Applying DARE to each model before merging removes redundant delta parameters, which they find helps mitigate interference among the parameters of the models being merged; the models are then combined with standard model merging techniques.",
+    "link": "https://arxiv.org/pdf/2311.03099"
+  },
   {
     "title": "Training-Free Pretrained Model Merging",
     "author": "Zhengqi Xu et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
 I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
 </p>
 <p id="paperCount">
-  So far, we have read 186 papers. Let's keep it up!
+  So far, we have read 187 papers. Let's keep it up!
 </p>
 <small id="searchCount">
-  Your search returned 186 papers. Nice!
+  Your search returned 187 papers. Nice!
 </small>

 <div class="search-inputs">
@@ -105,6 +105,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
 </thead>
 <tbody>

+<tr>
+  <td>Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch</td>
+  <td>Le Yu et al</td>
+  <td>2024</td>
+  <td>model merging</td>
+  <td>ICML</td>
+  <td>This paper shows that language models (LMs) can acquire new abilities by assimilating parameters from homologous models. The authors note that LMs have many redundant delta parameters after Supervised Fine-Tuning (SFT), i.e., the change in model parameters between the pre- and post-SFT models. They present DARE (Drop And REscale), which sets delta parameters to zero with a drop rate p and rescales the remaining ones by a factor of 1/(1-p). Applying DARE to each model before merging removes redundant delta parameters, which they find helps mitigate interference among the parameters of the models being merged; the models are then combined with standard model merging techniques.</td>
+  <td><a href="https://arxiv.org/pdf/2311.03099" target="_blank">Link</a></td>
+</tr>
+
 <tr>
   <td>Training-Free Pretrained Model Merging</td>
   <td>Zhengqi Xu et al</td>
