Commit f10f7a8

Updated on 2024-12-28
1 parent fdd5e75 commit f10f7a8

3 files changed: +28 -28 lines changed

index.html

Lines changed: 1 addition & 1 deletion

@@ -35,7 +35,7 @@ <h1>Where?</h1>
       </p>
       <h1>When?</h1>
       <p>
-        Last time this was edited was 2024-12-26 (YYYY/MM/DD).
+        Last time this was edited was 2024-12-28 (YYYY/MM/DD).
       </p>
       <small><a href="misc.html">misc</a></small>
     </div>

papers/list.json

Lines changed: 13 additions & 13 deletions

@@ -1,4 +1,13 @@
 [
+    {
+        "title": "Hyper-Connections",
+        "author": "Defa Zhu et al",
+        "year": "2024",
+        "topic": "residual connections, hyper-connections",
+        "venue": "Arxiv",
+        "description": "This paper introduces hyper-connections, which is a novel alternative to residual connections. Basically, they introduce learnable depth and width connections.",
+        "link": "https://arxiv.org/pdf/2409.19606"
+    },
     {
         "title": "Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising",
         "author": "Gongfan Fang et al",
@@ -1050,7 +1059,7 @@
         "topic": "q-learning, reinforcement learning",
         "venue": "Arxiv",
         "description": "The authors present the first deep learning model that can learn complex control policies, and they teach it to play Atari 2600 games using Q-learning. Their goal was to create one net that can play as many games as possible.",
-        "link": "TODO"
+        "link": "https://arxiv.org/pdf/1312.5602"
     },
     {
         "title": "Deep Compression: Compressing Deep Neural Networks with Pruning, Trained Quantization and Huffman Encoding",
@@ -1059,7 +1068,7 @@
         "topic": "quantization, encoding, pruning",
         "venue": "ICML",
         "description": "A three-pronged approach to compressing nets. They prune networks, then quantize and share weights, and then apply Huffman encoding.",
-        "link": "TODO"
+        "link": "https://arxiv.org/pdf/1510.00149"
     },
     {
         "title": "Binarized Neural Networks: Training Neural Networks with Weights and Activations Constrained to +1 or -1",
@@ -1068,7 +1077,7 @@
         "topic": "quantization, efficiency, binary",
         "venue": "Arxiv",
         "description": "Introduction of training Binary Neural Networks, or nets with binary weights and activations. They also present experiments on deterministic vs stochastic binarization. They use the deterministic one for the most part, except for activations.",
-        "link": "TODO"
+        "link": "https://arxiv.org/pdf/1602.02830"
     },
     {
         "title": "EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks",
@@ -1077,16 +1086,7 @@
         "topic": "efficiency, scaling",
         "venue": "ICML",
         "description": "A study of model scaling is presented. They propose a novel scaling method to uniformly scale all dimensions of depth/width/resolution using a compound coefficient. This paper presents a method for scaling width/depth/resolution; for instance, if you want to use 2^{N} more compute resources, then you can scale by their coefficients to do so. They also quantify the relationship between width, depth, and resolution.",
-        "link": "TODO"
-    },
-    {
-        "title": "2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency",
-        "author": "Yonggan Fu et al",
-        "year": "2021",
-        "topic": "precision, adversarial, efficiency",
-        "venue": "ACM",
-        "description": "Introduction of a Random Precision Switch algorithm that has potential for defending against adversarial attacks while promoting efficiency.",
-        "link": "TODO"
+        "link": "https://arxiv.org/pdf/1905.11946"
     },
     {
         "title": "The wake-sleep algorithm for unsupervised neural networks",

papers_read.html

Lines changed: 14 additions & 14 deletions

@@ -46,6 +46,16 @@ <h1>Here's where I keep a list of papers I have read.</h1>
       </thead>
       <tbody>

+        <tr>
+          <td>Hyper-Connections</td>
+          <td>Defa Zhu et al</td>
+          <td>2024</td>
+          <td>residual connections, hyper-connections</td>
+          <td>Arxiv</td>
+          <td>This paper introduces hyper-connections, which is a novel alternative to residual connections. Basically, they introduce learnable depth and width connections.</td>
+          <td><a href="https://arxiv.org/pdf/2409.19606" target="_blank">Link</a></td>
+        </tr>
+
         <tr>
           <td>Remix-DiT: Mixing Diffusion Transformers for Multi-Expert Denoising</td>
           <td>Gongfan Fang et al</td>
@@ -1213,7 +1223,7 @@ <h1>Here's where I keep a list of papers I have read.</h1>
           <td>q-learning, reinforcement learning</td>
           <td>Arxiv</td>
           <td>The authors present the first deep learning model that can learn complex control policies, and they teach it to play Atari 2600 games using Q-learning. Their goal was to create one net that can play as many games as possible.</td>
-          <td><a href="TODO" target="_blank">Link</a></td>
+          <td><a href="https://arxiv.org/pdf/1312.5602" target="_blank">Link</a></td>
         </tr>

         <tr>
@@ -1223,7 +1233,7 @@ <h1>Here's where I keep a list of papers I have read.</h1>
          <td>quantization, encoding, pruning</td>
          <td>ICML</td>
          <td>A three-pronged approach to compressing nets. They prune networks, then quantize and share weights, and then apply Huffman encoding.</td>
-          <td><a href="TODO" target="_blank">Link</a></td>
+          <td><a href="https://arxiv.org/pdf/1510.00149" target="_blank">Link</a></td>
         </tr>

         <tr>
@@ -1233,7 +1243,7 @@ <h1>Here's where I keep a list of papers I have read.</h1>
          <td>quantization, efficiency, binary</td>
          <td>Arxiv</td>
          <td>Introduction of training Binary Neural Networks, or nets with binary weights and activations. They also present experiments on deterministic vs stochastic binarization. They use the deterministic one for the most part, except for activations.</td>
-          <td><a href="TODO" target="_blank">Link</a></td>
+          <td><a href="https://arxiv.org/pdf/1602.02830" target="_blank">Link</a></td>
         </tr>

         <tr>
@@ -1243,17 +1253,7 @@ <h1>Here's where I keep a list of papers I have read.</h1>
          <td>efficiency, scaling</td>
          <td>ICML</td>
          <td>A study of model scaling is presented. They propose a novel scaling method to uniformly scale all dimensions of depth/width/resolution using a compound coefficient. This paper presents a method for scaling width/depth/resolution; for instance, if you want to use 2^{N} more compute resources, then you can scale by their coefficients to do so. They also quantify the relationship between width, depth, and resolution.</td>
-          <td><a href="TODO" target="_blank">Link</a></td>
-        </tr>
-
-        <tr>
-          <td>2-in-1 Accelerator: Enabling Random Precision Switch for Winning Both Adversarial Robustness and Efficiency</td>
-          <td>Yonggan Fu et al</td>
-          <td>2021</td>
-          <td>precision, adversarial, efficiency</td>
-          <td>ACM</td>
-          <td>Introduction of a Random Precision Switch algorithm that has potential for defending against adversarial attacks while promoting efficiency.</td>
-          <td><a href="TODO" target="_blank">Link</a></td>
+          <td><a href="https://arxiv.org/pdf/1905.11946" target="_blank">Link</a></td>
         </tr>

         <tr>
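
One more aside, on the Atari / Q-learning entry whose link was filled in above: its description refers to learning control policies with Q-learning, whose core is the one-step TD target y = r + gamma * max_a' Q(s', a'). A minimal sketch of just that target follows (not the authors' full DQN training pipeline); the function names and the example numbers are mine.

import numpy as np

GAMMA = 0.99  # discount factor (illustrative value)

def q_target(reward, next_q_values, done):
    """One-step Q-learning target: y = r + gamma * max_a' Q(s', a'); no bootstrap at episode end."""
    return reward + (0.0 if done else GAMMA * float(np.max(next_q_values)))

def td_error(q_values, action, reward, next_q_values, done):
    """Gap between the target and the current estimate Q(s, a), which training drives toward zero."""
    return q_target(reward, next_q_values, done) - q_values[action]

# Toy example with 4 actions.
print(td_error(np.array([0.1, 0.5, 0.2, 0.0]), action=1,
               reward=1.0, next_q_values=np.array([0.3, 0.7, 0.1, 0.2]), done=False))
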
