Wirtinger derivative using JVP in jax #26576
kcdodd asked this question in Show and tell
I am working to better understand how automatic differentiation relates to the more theoretical complex analysis of non-holomorphic functions, and am looking for feedback on the following way of thinking about and explaining it.
As a theoretical starting point, define the JVP ("Jacobian-vector product") as the directional derivative $J(v)$ of $f$ at a point $z_0$ along the tangent $v$.
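Written out, and assuming the usual limit definition with a real step size (the symbol $\epsilon$ is introduced here for illustration), the directional derivative would read:

$$
J(v) = \lim_{\epsilon \to 0} \frac{f(z_0 + \epsilon v) - f(z_0)}{\epsilon}, \qquad \epsilon \in \mathbb{R}.
$$

Because $\epsilon$ is real, $J$ is linear over the reals, $J(a v + b w) = a J(v) + b J(w)$ for real $a$ and $b$, but nothing in the definition forces it to be linear over the complex numbers.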
The JVP doesn't always give what might be expected, because $J(v)$ takes values in the same space as $f(z_0)$. For example, for $f(z) = |z|^2$ the value of $f$ is real, so the JVP has to be real regardless of the tangent $v$. Treating the mapping as a matrix-vector product is therefore problematic for complex numbers: for complex $c$, in general $J(c v) \ne c J(v)$. Complex numbers act like 2D transformations that can scale and rotate coordinates in the complex plane, in addition to being coordinates themselves. For example, in the matrix representation the complex number $a + ib$ corresponds to the scaled rotation $\begin{pmatrix} a & -b \\ b & a \end{pmatrix}$.
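The $|z|^2$ case can be checked numerically. The following is a minimal sketch, assuming `jax.jvp` is applied directly to a complex input with a complex tangent; the names `abs2`, `jvp_at`, `z0`, `v`, and `c` are illustrative and not from the original post.

```python
import jax
import jax.numpy as jnp


def abs2(z):
    # |z|^2 written as Re(z * conj(z)); the output is real.
    return jnp.real(z * jnp.conj(z))


def jvp_at(f, z, v):
    # Tangent output of the JVP of f at z along the tangent v.
    _, tangent_out = jax.jvp(f, (z,), (v,))
    return tangent_out


z0 = jnp.array(1.0 + 2.0j, dtype=jnp.complex64)
v = jnp.array(0.5 - 1.0j, dtype=jnp.complex64)
c = 1j  # a complex scale factor (rotation by pi/2)

j_v = jvp_at(abs2, z0, v)            # J(v), real-valued
j_cv = jvp_at(abs2, z0, c * v)       # J(c v), also real-valued
j_2v = jvp_at(abs2, z0, 2.0 * v)     # J(2 v), real scaling of the tangent

print("J(v)   =", j_v)
print("J(c v) =", j_cv, " vs  c J(v) =", c * j_v)
print("J(2 v) =", j_2v, " vs  2 J(v) =", 2.0 * j_v)
```

The JVP stays real for every tangent, so scaling the tangent by a real factor scales the result, while scaling by $i$ does not simply multiply the result by $i$.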
But multiplication by a complex number cannot represent things like conjugation (reflection), so the Jacobian does not always have a representation as a purely complex-valued matrix. So starting instead with the 2D (real) coordinate representation $z = x + iy \mapsto (x, y)^T$ and a general real-valued $2 \times 2$ matrix (transformation) $A$,
the question becomes how to reconstruct the result of the transformation using the JVP. The first instinct would be to compute the JVP separately along the real and imaginary directions, and that is essentially right; the remaining question is exactly how to combine the results. In the case of $f(z) = |z|^2$, for example, the JVP ends up aligned with the real axis no matter how the tangent is oriented. The approach taken here is to decompose $A$ as a linear combination of elementary transformations $\sigma_1, \ldots, \sigma_4$.
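One concrete basis consistent with the description in the text ($\sigma_1$ and $\sigma_2$ acting as $1$ and $i$, and $\sigma_4 = \sigma_2 \sigma_3$) is sketched below. The explicit matrices, the coefficient names $\alpha_k$, and the entry names $a_{jk}$ of $A$ are assumptions made for illustration:

$$
A = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}
  = \alpha_1 \sigma_1 + \alpha_2 \sigma_2 + \alpha_3 \sigma_3 + \alpha_4 \sigma_4,
$$

$$
\sigma_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad
\sigma_2 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad
\sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad
\sigma_4 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix},
$$

$$
\alpha_1 = \tfrac{1}{2}(a_{11} + a_{22}), \quad
\alpha_2 = \tfrac{1}{2}(a_{21} - a_{12}), \quad
\alpha_3 = \tfrac{1}{2}(a_{11} - a_{22}), \quad
\alpha_4 = \tfrac{1}{2}(a_{21} + a_{12}).
$$

With this choice every real $2 \times 2$ matrix has exactly one such decomposition, since the four $\sigma_k$ are linearly independent.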
The first two matrices are identified with orientation-preserving scaling and rotation ($1$ and $i$), while the second two involve reflection (conjugation).
These matrices also satisfy a few simple multiplication identities.
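Assuming a convention in which $\sigma_1$ is the identity, $\sigma_2$ is rotation by $\pi/2$, $\sigma_3$ is conjugation (reflection about the real axis), and $\sigma_4 = \sigma_2 \sigma_3$, the identities would include the following (the particular selection shown is an assumption):

$$
\sigma_2^2 = -\sigma_1, \qquad
\sigma_3^2 = \sigma_4^2 = \sigma_1, \qquad
\sigma_2 \sigma_3 = -\sigma_3 \sigma_2 = \sigma_4.
$$

The first mirrors $i^2 = -1$, the second says that reflecting twice does nothing, and the third says that rotation and reflection anticommute.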
Since $\sigma_4 = \sigma_2 \sigma_3$, the decomposition can be expressed with only $\sigma_1$, $\sigma_2$, and $\sigma_3$, which gives a nice picture of how to connect the action on complex values to a general real-valued matrix.
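Writing the coefficients as $\alpha_1, \ldots, \alpha_4$ (names assumed here) and taking $\sigma_3$ to act as conjugation (also an assumption), the regrouped decomposition and its action on a complex number would be:

$$
A = (\alpha_1 \sigma_1 + \alpha_2 \sigma_2) + (\alpha_3 \sigma_1 + \alpha_4 \sigma_2)\,\sigma_3
\quad \longleftrightarrow \quad
A z = (\alpha_1 + i \alpha_2)\, z + (\alpha_3 + i \alpha_4)\, \bar{z}.
$$

In other words, any real-linear map of the plane is a complex multiple of $z$ plus a complex multiple of $\bar{z}$.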
Now suppose a second matrix $A^\prime$ is defined by a similarity transformation using $\sigma_2$ (aka $i$), which rotates by $\pi/2$ before applying the matrix and then rotates back.
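Reading "rotate, apply $A$, rotate back" as $\sigma_2^{-1} A \sigma_2$, and assuming coefficient names $\alpha_k$ together with the fact that a reflection $\sigma_3$ anticommutes with the rotation $\sigma_2$, this would give:

$$
A^\prime = \sigma_2^{-1} A\, \sigma_2 = -\sigma_2 A\, \sigma_2
= (\alpha_1 \sigma_1 + \alpha_2 \sigma_2) - (\alpha_3 \sigma_1 + \alpha_4 \sigma_2)\,\sigma_3.
$$

The un-reflected part commutes with $\sigma_2$ and is unchanged, while the part multiplying $\sigma_3$ picks up a minus sign.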
The reflected part is reversed in sign because, after the reflection, the rotation acts in the opposite direction. The sum $A + A^\prime$ keeps only the un-reflected part, while the difference $A - A^\prime$ keeps only the reflected part. In the Wirtinger derivatives, $z$ and $\bar{z}$ are treated as though they were independent coordinates of an effective function $f(z, \bar{z})$. Identifying these with separate actions on a complex value $z$ and its conjugate $\bar{z}$, an equivalent of the Wirtinger derivatives can be defined for the JVP by substituting the matrix representations $A \to J(v)$ and $A^\prime \to -i J(i v)$ into the sum and the difference.
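With a factor of $1/2$ so that the two parts add back up to $J(v)$ (the normalization is an assumption, chosen to match the usual Wirtinger convention), the resulting definitions would be:

$$
J_z(v) = \tfrac{1}{2}\left[ J(v) - i\, J(i v) \right], \qquad
J_{\bar{z}}(v) = \tfrac{1}{2}\left[ J(v) + i\, J(i v) \right], \qquad
J(v) = J_z(v) + J_{\bar{z}}(v).
$$

In terms of the Wirtinger derivatives these correspond to $J_z(v) = \frac{\partial f}{\partial z}\, v$ and $J_{\bar{z}}(v) = \frac{\partial f}{\partial \bar{z}}\, \bar{v}$. As a worked case, $f(z) = |z|^2 = z \bar{z}$ has $J(v) = \bar{z}_0 v + z_0 \bar{v}$, so $J_z(v) = \bar{z}_0 v$ and $J_{\bar{z}}(v) = z_0 \bar{v}$.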
Something to consider when interpreting the conjugate derivative is that it is not quite the same thing as the derivative along the tangent's conjugate direction $\bar{v}$; it captures only the part where the original function depends on $\bar{z}$. For a holomorphic function $J_{\bar{z}}(v)$ is zero, while using the conjugate tangent in $J_{z}(\bar{v})$ just takes the derivative with respect to $z$ along a different direction.
An example implementation using `jax` is shown in the code below.
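A minimal sketch of such an implementation follows, assuming the definitions $J_z(v) = \tfrac{1}{2}[J(v) - i J(i v)]$ and $J_{\bar{z}}(v) = \tfrac{1}{2}[J(v) + i J(i v)]$ and using two calls to `jax.jvp`. The function name `wirtinger_jvp` and the test values are hypothetical, chosen only for illustration.

```python
import jax
import jax.numpy as jnp


def wirtinger_jvp(f, z, v):
    """Return (J_z(v), J_zbar(v)) for f at z along the tangent v."""
    z = jnp.asarray(z, dtype=jnp.complex64)
    v = jnp.asarray(v, dtype=jnp.complex64)
    # J(v): ordinary JVP along the tangent v.
    _, j_v = jax.jvp(f, (z,), (v,))
    # J(i v): JVP along the tangent rotated by pi/2.
    _, j_iv = jax.jvp(f, (z,), (1j * v,))
    # -i J(i v) plays the role of A'; half the sum keeps the z part,
    # half the difference keeps the conj(z) part.
    j_z = 0.5 * (j_v - 1j * j_iv)
    j_zbar = 0.5 * (j_v + 1j * j_iv)
    return j_z, j_zbar


z0 = 1.0 + 2.0j
v = 0.5 - 1.0j

# Holomorphic example, f(z) = z^2: the conjugate part should vanish.
holo_z, holo_zbar = wirtinger_jvp(lambda z: z**2, z0, v)

# Non-holomorphic example, f(z) = |z|^2 = z * conj(z).
abs2_z, abs2_zbar = wirtinger_jvp(lambda z: jnp.real(z * jnp.conj(z)), z0, v)

print("z^2  :", holo_z, holo_zbar)
print("|z|^2:", abs2_z, abs2_zbar)
```

For $f(z) = z^2$ the first component should match $2 z_0 v$ with a vanishing conjugate part, and for $f(z) = |z|^2$ the two components should match $\bar{z}_0 v$ and $z_0 \bar{v}$ respectively.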