Commit e8f46dd

Fixed
1 parent 036c7dd commit e8f46dd

1 file changed

+3
-3
lines changed

include/nbl/builtin/hlsl/fft/README.md

Lines changed: 3 additions & 3 deletions
@@ -138,8 +138,8 @@ On the other column, just write all indices from 0 to $\text{FFTSize} - 1$ but b

 Now we want to match these to find a rule. Here's an idea: The first column maps an index $n$ (the position in the column, counting from the top) in the range $[0, \text{FFTSize} - 1]$ to an index in the output $\text{NFFT}$ array (which is in the same range), let's call this mapping $e$ for enumeration. This mapping depends on both $\log_2(\text{ElementsPerInvocation})$ and $\log_2(\text{WorkgroupSize})$.
 
-The second column is also a mapping of $n$ to an index in the (correctly ordered) $\text{DFT}$, and in fact we know this mapping to be $n \rightarrow \text{bitreverse}(n)$. We're almost done! Matching the columns like we have been, we now know that on line $n$ we will read "$e(n)$ holds $\text{bitreverse}(n)$", which means that $\text{NFFT}[e(n)] = \text{DFT}[\text{bitreverse}(n)]$.
-Then, line $e^{-1}(n)$ will read "$n$ holds $\text{bitreverse}(e^{-1}(n))$", which we interpret as $$\text{NFFT}[n] = \text{DFT}[\text{bitreverse}(e^{-1}(n))]$$
+The second column is also a mapping of $n$ to an index in the (correctly ordered) $\text{DFT}$, and in fact we know this mapping to be $n \rightarrow \text{bitreverse}(n)$. We're almost done! Matching the columns like we have been, we now know that on line $n$ we will read " $e(n)$ holds $\text{bitreverse}(n)$", which means that $\text{NFFT}[e(n)] = \text{DFT}[\text{bitreverse}(n)]$.
+Then, line $e^{-1}(n)$ will read " $n$ holds $\text{bitreverse}(e^{-1}(n))$", which we interpret as $$\text{NFFT}[n] = \text{DFT}[\text{bitreverse}(e^{-1}(n))]$$
 
 So now let's figure out $e$! It turns out $e(n)$ can be computed with a circular right shift by one position of the lower $N - E + 1$ bits of $n$, where $N = \log_2(\text{FFTSize})$ (which makes $N$ the total number of bits needed to represent indices) and $E = \log_2(\text{ElementsPerInvocation})$.
 We have that $N - E = W = \log_2(\text{WorkgroupSize})$ so it might be easier to think of it as the circular right shift by one position of the lower $W + 1$ bits.
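The rule described in this hunk can be sketched in a few lines of Python. This is only an illustration of the stated formulas, not the library's HLSL code: the function names (`bitreverse`, `e`, `e_inverse`, `nfft_to_dft_index`) and the parameters `total_bits` (standing for $N$) and `epi_log2` (standing for $E$) are hypothetical names chosen for this sketch.

```python
def bitreverse(n: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of n (higher bits are assumed zero)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


def e(n: int, total_bits: int, epi_log2: int) -> int:
    """Circular right shift by one position of the lower N - E + 1 bits of n."""
    width = total_bits - epi_log2 + 1          # W + 1 bits take part in the rotation
    mask = (1 << width) - 1
    low = n & mask
    rotated = (low >> 1) | ((low & 1) << (width - 1))
    return (n & ~mask) | rotated               # upper bits are left untouched


def e_inverse(n: int, total_bits: int, epi_log2: int) -> int:
    """Inverse of e: circular left shift by one position of the same bits."""
    width = total_bits - epi_log2 + 1
    mask = (1 << width) - 1
    low = n & mask
    rotated = ((low << 1) & mask) | (low >> (width - 1))
    return (n & ~mask) | rotated


def nfft_to_dft_index(n: int, total_bits: int, epi_log2: int) -> int:
    """Index into the DFT held at NFFT[n], i.e. bitreverse(e^{-1}(n))."""
    return bitreverse(e_inverse(n, total_bits, epi_log2), total_bits)
```

For instance, with $N = 4$ and $E = 1$ (so $W = 3$ and the rotation covers all four bits), `e(1, 4, 1)` rotates `0001` into `1000` and returns `8`, and `e_inverse(8, 4, 1)` returns `1` again.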
@@ -199,7 +199,7 @@ Here's an observation: each thread holds $\text{ElementsPerInvocation}$ elements
 $$\text{NFFT[threadID]}, \;\text{NFFT[threadID} + \text{WorkgroupSize}], \;\dots\;, \;\text{NFFT[threadID} + (\text{ElementsPerInvocation} - 1) * \text{WorkgroupSize}].$$
 
 So the positions in the output NFFT array are parameterized by $\text{threadID} + k * \text{WorkgroupSize}$, where $\;0 \le k < \text{ElementsPerInvocation}$. We call an element of a thread locally even if its index is obtained from an even value of $k$ in the previous parameterization. Enumerating the locally even elements in order produces a bitreversed lower half of the $\text{DFT}$. That is, the sequence $$\text{NFFT}[0]\;,\; \text{NFFT}[1]\;,\; ...\;,\; \text{NFFT}[\text{WorkgroupSize} - 1]\;,\; \\ \text{NFFT}[0 + 2 \cdot \text{WorkgroupSize}]\;,\;\text{NFFT}[1 + 2 \cdot \text{WorkgroupSize}]\;,\; \dots\;,\; \text{NFFT}[(\text{WorkgroupSize} - 1) + 2 \cdot \text{WorkgroupSize}]\;,\; \\ \text{NFFT}[0 + 4 \cdot \text{WorkgroupSize}]\;,\; \text{NFFT}[1 + 4 \cdot \text{WorkgroupSize}]\;,\; \dots\;,\; \text{NFFT}[(\text{WorkgroupSize} - 1) + 4 \cdot \text{WorkgroupSize}]\;,\; \\ \vdots \\ \text{NFFT}[0 + (\text{ElementsPerInvocation - 2}) \cdot \text{WorkgroupSize}]\;,\; \dots\;,\; \text{NFFT}[(\text{WorkgroupSize} - 1) + (\text{ElementsPerInvocation - 2}) \cdot \text{WorkgroupSize}]$$
-turns out to be exactly the lower half of the $\text{DFT}$ $($ elements $0$ through $\text{Nyquist} - 1 = \frac N 2 - 1)$ bitreversed *taking the indices as $N-1$ bit numbers* (so not taking into account the MSB of $0$ they would have as indices in the whole $\text{DFT}$).
+turns out to be exactly the lower half of the $\text{DFT}$ $($ elements $0$ through $\text{Nyquist} - 1 = \frac N 2 - 1)$ bitreversed *taking the indices as* $N-1$ *bit numbers* (so not taking into account the MSB of $0$ they would have as indices in the whole $\text{DFT}$).
 
 For a proof, consider the lower half of the $\text{DFT}$. These are all indexed by $0|0$ through $0|\text{Nyquist} - 1$ with the $0$ before the $|$ being a single bit. Then, we map $0|n$ to $0|\text{bitreverse}(n)$, where it's an $N-1$ bit bitreversal. We want to show that the enumeration $n \to 0|\text{bitreverse}(n)$ of the lower half of the $\text{DFT}$'s indices yields the same elements (in the same order of enumeration) as taking the locally even elements in increasing order.
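Before following the proof, the claim can be checked numerically for small sizes. The sketch below is an assumption-laden illustration in Python, not the library's code: it takes the rule $\text{NFFT}[n] = \text{DFT}[\text{bitreverse}(e^{-1}(n))]$ from the README as given, treats $e^{-1}$ as a circular left shift by one position of the lower $W + 1$ bits, and uses hypothetical helper names (`locally_even_dft_indices`, `bitreversed_lower_half`).

```python
def bitreverse(n: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of n (higher bits are assumed zero)."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (n & 1)
        n >>= 1
    return result


def e_inverse(n: int, total_bits: int, epi_log2: int) -> int:
    """Circular left shift by one position of the lower N - E + 1 bits of n."""
    width = total_bits - epi_log2 + 1
    mask = (1 << width) - 1
    low = n & mask
    rotated = ((low << 1) & mask) | (low >> (width - 1))
    return (n & ~mask) | rotated


def locally_even_dft_indices(total_bits: int, epi_log2: int) -> list:
    """DFT indices held by the locally even elements, in enumeration order."""
    workgroup_size = 1 << (total_bits - epi_log2)
    elements_per_invocation = 1 << epi_log2
    indices = []
    for k in range(0, elements_per_invocation, 2):      # even k only
        for thread_id in range(workgroup_size):
            position = thread_id + k * workgroup_size   # index into NFFT
            source = e_inverse(position, total_bits, epi_log2)
            indices.append(bitreverse(source, total_bits))
    return indices


def bitreversed_lower_half(total_bits: int) -> list:
    """Lower half of the DFT, with indices bitreversed as (N - 1)-bit numbers."""
    half = 1 << (total_bits - 1)
    return [bitreverse(i, total_bits - 1) for i in range(half)]
```

Comparing `locally_even_dft_indices(N, E)` with `bitreversed_lower_half(N)` for a few small values of $N$ and $E \ge 1$ is a quick sanity check of the statement; it does not replace the proof.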
