"description": "The MaskGIT paper introduces a novel bidirectional transformer architecture for image generation that can predict multiple image tokens in parallel, rather than generating them sequentially like previous methods. They develop a new iterative decoding strategy where the model predicts all masked tokens simultaneously at each step, keeps the most confident predictions, and refines the remaining tokens over multiple iterations using a decreasing mask scheduling function. The approach significantly outperforms previous transformer-based methods in both generation quality and speed on ImageNet, while maintaining good diversity in the generated samples. The bidirectional nature of their model enables flexible image editing applications like inpainting, outpainting, and class-conditional object manipulation without requiring any architectural changes or task-specific training.",
9
+
"link": "https://arxiv.org/pdf/2202.04200"
10
+
},
2
11
{
3
12
"title": "Improved Precision and Recall Metric for Assessing Generative Models",
4
13
"author": "Tuomas Kynkaanniemi et al",
5
14
"year": "2019",
6
15
"topic": "generative models, precision, recall",
7
-
"venue": "NeurIPS 2019",
16
+
"venue": "NeurIPS",
8
17
"description": "This paper introduces an improved metric for evaluating generative models by separately measuring precision (quality of generated samples) and recall (coverage/diversity of generated distribution) using k-nearest neighbors to construct non-parametric manifold approximations of real and generated data distributions. The authors demonstrate their metric's effectiveness using StyleGAN and BigGAN, showing how it provides more nuanced insights than existing metrics like FID, particularly in revealing tradeoffs between image quality and variation that other metrics obscure. They use their metric to analyze and improve StyleGAN's architecture and training configurations, identifying new variants that achieve state-of-the-art results, and perform the first principled analysis of truncation methods. Finally, they extend their metric to evaluate individual sample quality, enabling quality assessment of interpolations and providing insights into the shape of the latent space that produces realistic images.",
papers_read.html (+14 -4)
@@ -75,10 +75,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
     </p>
     <p id="paperCount">
-        So far, we have read 147 papers. Let's keep it up!
+        So far, we have read 148 papers. Let's keep it up!
     </p>
     <small id="searchCount">
-        Your search returned 147 papers. Nice!
+        Your search returned 148 papers. Nice!
     </small>

     <div class="search-inputs">
@@ -105,14 +105,24 @@ <h1>Here's where I keep a list of papers I have read.</h1>
+    <td>The MaskGIT paper introduces a novel bidirectional transformer architecture for image generation that can predict multiple image tokens in parallel, rather than generating them sequentially like previous methods. They develop a new iterative decoding strategy where the model predicts all masked tokens simultaneously at each step, keeps the most confident predictions, and refines the remaining tokens over multiple iterations using a decreasing mask scheduling function. The approach significantly outperforms previous transformer-based methods in both generation quality and speed on ImageNet, while maintaining good diversity in the generated samples. The bidirectional nature of their model enables flexible image editing applications like inpainting, outpainting, and class-conditional object manipulation without requiring any architectural changes or task-specific training.</td>
     <td>Improved Precision and Recall Metric for Assessing Generative Models</td>
     <td>Tuomas Kynkaanniemi et al</td>
     <td>2019</td>
     <td>generative models, precision, recall</td>
-    <td>NeurIPS 2019</td>
+    <td>NeurIPS</td>
     <td>This paper introduces an improved metric for evaluating generative models by separately measuring precision (quality of generated samples) and recall (coverage/diversity of generated distribution) using k-nearest neighbors to construct non-parametric manifold approximations of real and generated data distributions. The authors demonstrate their metric's effectiveness using StyleGAN and BigGAN, showing how it provides more nuanced insights than existing metrics like FID, particularly in revealing tradeoffs between image quality and variation that other metrics obscure. They use their metric to analyze and improve StyleGAN's architecture and training configurations, identifying new variants that achieve state-of-the-art results, and perform the first principled analysis of truncation methods. Finally, they extend their metric to evaluate individual sample quality, enabling quality assessment of interpolations and providing insights into the shape of the latent space that produces realistic images.</td>
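The k-nearest-neighbor construction in that description can likewise be sketched compactly. In the following illustration the manifold of a sample set is approximated by hyperspheres around each point, with radius equal to the distance to the point's k-th nearest neighbor; precision asks how many generated samples land on the real manifold, recall the reverse. This is plain NumPy over raw feature vectors (the paper embeds images with a pre-trained classifier first), and `knn_radii`, `coverage`, and `precision_recall` are names invented for this sketch.

```python
import numpy as np

def knn_radii(feats: np.ndarray, k: int) -> np.ndarray:
    """Distance from each point to its k-th nearest neighbor within the same set."""
    d = np.linalg.norm(feats[:, None, :] - feats[None, :, :], axis=-1)
    return np.sort(d, axis=1)[:, k]   # column 0 is the point itself (distance 0)

def coverage(queries: np.ndarray, refs: np.ndarray, radii: np.ndarray) -> float:
    """Fraction of queries falling inside any reference hypersphere (the manifold)."""
    d = np.linalg.norm(queries[:, None, :] - refs[None, :, :], axis=-1)
    return float(np.mean((d <= radii[None, :]).any(axis=1)))

def precision_recall(real: np.ndarray, fake: np.ndarray, k: int = 3):
    precision = coverage(fake, real, knn_radii(real, k))  # generated samples on the real manifold
    recall = coverage(real, fake, knn_radii(fake, k))     # real samples on the generated manifold
    return precision, recall

# Two samples from the same distribution: both numbers should be near 1.
rng = np.random.default_rng(0)
print(precision_recall(rng.standard_normal((500, 16)), rng.standard_normal((500, 16))))
```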