
Commit a4548b0

committed on 2025-01-16
1 parent 20b7cd2 commit a4548b0

File tree

3 files changed, +41 -3 lines changed

index.html

Lines changed: 1 addition & 1 deletion

@@ -64,7 +64,7 @@ <h1>Where?</h1>
     </section>
     <section>
       <h1>When?</h1>
-      Last time this was edited was 2025-01-11 (YYYY/MM/DD).
+      Last time this was edited was 2025-01-16 (YYYY/MM/DD).
     </section>
     <footer>
       <small><a href="misc.html">misc</a></small>

papers/list.json

Lines changed: 18 additions & 0 deletions

@@ -1,4 +1,22 @@
 [
+    {
+        "title": "Think Before You Speak: Training Language Models with Pause Tokens",
+        "author": "Sachin Goyal et al",
+        "year": "2024",
+        "topic": "test-time compute, meta-tokens",
+        "venue": "Arxiv",
+        "description": "This paper introduces \"Pause Tokens\" which are a way of appending a sequence of tokens to the input prefix, and then delaying the output until the last pause token is seen.",
+        "link": "https://arxiv.org/pdf/2310.02226"
+    },
+    {
+        "title": "Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters",
+        "author": "Charlie Snell et al",
+        "year": "2024",
+        "topic": "test-time compute",
+        "venue": "Arxiv",
+        "description": "This paper explores the question of \"If an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt?\". Good for references on various test-time compute strategies.",
+        "link": "https://arxiv.org/pdf/2408.03314"
+    },
     {
         "title": "Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads",
         "author": "Tianle Cai et al",

papers_read.html

Lines changed: 22 additions & 2 deletions

@@ -16,10 +16,10 @@ <h1>Here's where I keep a list of papers I have read.</h1>
     I typically use this to organize papers I found interesting. Please feel free to do whatever you want with it. Note that this is not every single paper I have ever read, just a collection of ones that I remember to put down.
   </p>
   <p id="paperCount">
-    So far, we have read 206 papers. Let's keep it up!
+    So far, we have read 208 papers. Let's keep it up!
   </p>
   <small id="searchCount">
-    Your search returned 206 papers. Nice!
+    Your search returned 208 papers. Nice!
   </small>

   <div class="search-inputs">

@@ -46,6 +46,26 @@ <h1>Here's where I keep a list of papers I have read.</h1>
   </thead>
   <tbody>

+    <tr>
+        <td>Think Before You Speak: Training Language Models with Pause Tokens</td>
+        <td>Sachin Goyal et al</td>
+        <td>2024</td>
+        <td>test-time compute, meta-tokens</td>
+        <td>Arxiv</td>
+        <td>This paper introduces &quot;Pause Tokens&quot; which are a way of appending a sequence of tokens to the input prefix, and then delaying the output until the last pause token is seen.</td>
+        <td><a href="https://arxiv.org/pdf/2310.02226" target="_blank">Link</a></td>
+    </tr>
+
+    <tr>
+        <td>Scaling LLM Test-Time Compute Optimally can be More Effective than Scaling Model Parameters</td>
+        <td>Charlie Snell et al</td>
+        <td>2024</td>
+        <td>test-time compute</td>
+        <td>Arxiv</td>
+        <td>This paper explores the question of &quot;If an LLM is allowed to use a fixed but non-trivial amount of inference-time compute, how much can it improve its performance on a challenging prompt?&quot;. Good for references on various test-time compute strategies.</td>
+        <td><a href="https://arxiv.org/pdf/2408.03314" target="_blank">Link</a></td>
+    </tr>
+
     <tr>
         <td>Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads</td>
         <td>Tianle Cai et al</td>
