
Commit 35c10c1

Updated on 2024-12-12
1 parent 3894659 commit 35c10c1

File tree

3 files changed, +22 -3 lines changed


index.html

Lines changed: 1 addition & 1 deletion
@@ -74,7 +74,7 @@ <h1>Where?</h1>
 </p>
 <h1>When?</h1>
 <p>
-  Last time this was edited was 2024-12-07 (YYYY/MM/DD).
+  Last time this was edited was 2024-12-12 (YYYY/MM/DD).
 </p>
 <small><a href="misc.html">misc</a></small>
 </div>

papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+  {
+    "title": "Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch",
+    "author": "Le Yu et al",
+    "year": "2024",
+    "topic": "model merging",
+    "venue": "ICML",
+    "description": "This paper shows that language models (LMs) can acquire new abilities by assimilating parameters from homologous models. The authors note that LMs have many redundant delta parameters after Supervised Fine-Tuning (SFT), i.e., the change in model parameters between the pre- and post-SFT models. They present DARE (Drop And REscale), which sets delta parameters to zero with a drop rate p and rescales the remaining ones by a factor of 1/(1-p). Applying DARE to each model before merging removes redundant delta parameters, which they find helps mitigate interference among the parameters of the models being merged; the models are then combined with standard model merging techniques.",
+    "link": "https://arxiv.org/pdf/2311.03099"
+  },
   {
     "title": "Training-Free Pretrained Model Merging",
     "author": "Zhengqi Xu et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
 I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
 </p>
 <p id="paperCount">
-  So far, we have read 186 papers. Let's keep it up!
+  So far, we have read 187 papers. Let's keep it up!
 </p>
 <small id="searchCount">
-  Your search returned 186 papers. Nice!
+  Your search returned 187 papers. Nice!
 </small>

 <div class="search-inputs">
@@ -105,6 +105,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
 </thead>
 <tbody>

+<tr>
+  <td>Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch</td>
+  <td>Le Yu et al</td>
+  <td>2024</td>
+  <td>model merging</td>
+  <td>ICML</td>
+  <td>This paper shows that language models (LMs) can acquire new abilities by assimilating parameters from homologous models. The authors note that LMs have many redundant delta parameters after Supervised Fine-Tuning (SFT), i.e., the change in model parameters between the pre- and post-SFT models. They present DARE (Drop And REscale), which sets delta parameters to zero with a drop rate p and rescales the remaining ones by a factor of 1/(1-p). Applying DARE to each model before merging removes redundant delta parameters, which they find helps mitigate interference among the parameters of the models being merged; the models are then combined with standard model merging techniques.</td>
+  <td><a href="https://arxiv.org/pdf/2311.03099" target="_blank">Link</a></td>
+</tr>
+
 <tr>
   <td>Training-Free Pretrained Model Merging</td>
   <td>Zhengqi Xu et al</td>
