
Commit fdd5e75: Updated on 2024-12-26

1 parent f64fa93

2 files changed (+21, -2 lines)

papers/list.json

Lines changed: 9 additions & 0 deletions
@@ -1,4 +1,13 @@
 [
+    {
+        "title": "Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising",
+        "author": "Gongfan Fang et al",
+        "year": "2024",
+        "topic": "dit, diffusion, moe",
+        "venue": "NeurIPS",
+        "description": "This paper introduces a method of mixing diffusion transformers for multi-expert denoising: each linear layer is widened by a factor of K and the forward pass is modified to match, giving K experts that are all initialized from the original pretrained weights.",
+        "link": "https://arxiv.org/pdf/2412.05628"
+    },
     {
         "title": "Hymba: A Hybrid-head Architecture for Small Language Models",
         "author": "Xin Dong et al",

papers_read.html

Lines changed: 12 additions & 2 deletions
@@ -16,10 +16,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
     </p>
     <p id="paperCount">
-    So far, we have read 194 papers. Let's keep it up!
+    So far, we have read 195 papers. Let's keep it up!
     </p>
     <small id="searchCount">
-    Your search returned 194 papers. Nice!
+    Your search returned 195 papers. Nice!
     </small>

     <div class="search-inputs">
@@ -46,6 +46,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     </thead>
     <tbody>

+    <tr>
+        <td>Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising</td>
+        <td>Gongfan Fang et al</td>
+        <td>2024</td>
+        <td>dit, diffusion, moe</td>
+        <td>NeurIPS</td>
+        <td>This paper introduces a method of mixing diffusion transformers for multi-expert denoising: each linear layer is widened by a factor of K and the forward pass is modified to match, giving K experts that are all initialized from the original pretrained weights.</td>
+        <td><a href="https://arxiv.org/pdf/2412.05628" target="_blank">Link</a></td>
+    </tr>
+
     <tr>
         <td>Hymba: A Hybrid-head Architecture for Small Language Models</td>
         <td>Xin Dong et al</td>
