Addition of generic / introductory glossary #222
Open
thomasgudjonwright wants to merge 6 commits into JuliaDiff:main from thomasgudjonwright:master
Changes from 3 commits
Commits
ac24ca7 Initial addition of glossary (thomasgudjonwright)
75d2d39 Add differential type defs (thomasgudjonwright)
0283a39 Typo fix (thomasgudjonwright)
7b98723 Adding Automatic Differentiation def (thomasgudjonwright)
3146ca3 Addressing comments (thomasgudjonwright)
d854bce Adding internal and external links (thomasgudjonwright)
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,81 @@ | ||
# ChainRules Glossary | ||
|
||
This glossary serves as a quick reference for common terms used in the field of Automatic Differentiation, as well as those used throughout the documentation relating specifically to ChainRules. | ||
|
||
## Definitions: | ||
|
||
### Adjoint: | ||
|
||
The conjugate transpose of the Jacobian for a given function `f`. | ||
|
||
### Derivative: | ||
|
||
The derivative of a function `y = f(x)` with respect to the independent variable `x`, denoted `f'(x)` or `dy/dx`, is the rate of change of the dependent variable `y` with respect to a change in the independent variable `x`. In multiple dimensions, we may refer to the gradient of a function. | ||
|
||
|
||
### Differential: | ||
|
||
The differential of a given function `y = f(x)`, denoted `dy`, is the product of the derivative function `f'(x)` and the increment of the independent variable `dx`. In multiple dimensions, it is the sum of these products across each dimension (using the partial derivative and the given independent variable's increment). | ||
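
Written out, the definition above reads as follows. This is only a restatement in symbols; the index `i`, the dimension `n`, and the example function are chosen here for illustration.

```math
dy = f'(x)\,dx
\qquad\text{and, for } y = f(x_1, \dots, x_n),\qquad
dy = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}\,dx_i
```

For example, with `f(x) = x^2` the differential is `dy = 2x dx`.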
|
||
In ChainRules, differentials are types ("differential types") and correspond to primal types. A differential should represent a difference between two primal values. | ||
|
||
|
||
#### Natural Differential: | ||
|
||
A natural differential type for a given primal type is the type people would intuitively associate with representing the difference between two values of the primal type. | ||
|
||
|
||
|
||
#### Structural Differential: | ||
|
||
If a given primal type `P` does not have a natural differential, we need to come up with one that makes sense. These are called structural differentials and are represented as `Composite{P, <:NamedTuple}`. | ||
|
||
|
||
#### Semi-Structural Differential: | ||
|
||
A structural differential that contains at least one natural differential field. | ||
|
||
|
||
#### Thunk: | ||
|
||
|
||
An "unnatural" differential type. If we wish to delay the computation of a derivative for whatever reason, we wrap it in a `Thunk` or `InplaceableThunk`. It holds off on computing the wrapped derivative until it is needed. | ||
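
The idea of delaying a computation can be sketched in plain Julia with a closure. The names `LazyValue`, `force`, and `expensive_derivative` below are hypothetical and chosen only for illustration; this is not ChainRules' `Thunk` type or its API, just a minimal sketch of the same deferral idea.

```julia
# Hypothetical stand-in for illustration; this is NOT ChainRules' `Thunk`.
struct LazyValue{F}
    compute::F  # zero-argument closure holding the deferred work
end

# Run the deferred computation only when the value is actually needed.
force(lv::LazyValue) = lv.compute()

calls = Ref(0)  # counts how many times the work has actually run

# Derivative of x^2, instrumented with the counter above.
expensive_derivative(x) = (calls[] += 1; 2x)

lazy = LazyValue(() -> expensive_derivative(3.0))
calls_before = calls[]  # still 0: wrapping did not compute anything

value = force(lazy)     # the wrapped derivative is computed here
```

In this sketch every call to `force` recomputes the value; nothing is cached.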
|
||
|
||
#### Zero: | ||
|
||
`Zero()` can also be a differential type. If you have trouble understanding the rules enforced upon differential types, consider this one first, as `Zero()` is the trivial vector space. | ||
this sentence is unclear
Updated with more fundamental info
||
|
||
### Directional Derivative: | ||
|
||
The directional derivative of a function `f` at a given point along a given unit direction is the dot product of the gradient of `f` at that point and the direction vector. It represents the rate of change of `f` in that direction. | ||
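
In symbols, with `u` a unit vector (the notation `D_u` and the example function are assumptions made here for illustration):

```math
D_{u} f(x) = \nabla f(x) \cdot u, \qquad \lVert u \rVert = 1
```

For instance, for `f(x_1, x_2) = x_1^2 + 3 x_2` at the point `(1, 0)` the gradient is `(2, 3)`, so along `u = (3/5, 4/5)`:

```math
D_{u} f(1, 0) = 2 \cdot \tfrac{3}{5} + 3 \cdot \tfrac{4}{5} = \tfrac{18}{5}
```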
|
||
|
||
### F-rule: | ||
|
||
A function used in forward-mode differentiation. For a given function `f`, it takes in the positional and keyword arguments of `f` and returns the primal result and the pushforward. | ||
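
A minimal sketch of the shape described above, for `sin`. The name `sin_frule_sketch` is hypothetical; it follows this glossary entry's wording and is not meant to reproduce the actual `frule` signature in ChainRules.

```julia
# Hypothetical name; mirrors the glossary's description only.
function sin_frule_sketch(x)
    y = sin(x)                     # primal result
    pushforward(dx) = cos(x) * dx  # maps an input perturbation to an output perturbation
    return y, pushforward
end

y, pushforward = sin_frule_sketch(0.0)
dy = pushforward(1.0)  # cos(0.0) * 1.0
```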
|
||
|
||
### Gradient: | ||
|
||
The gradient of a scalar function `f`, denoted `∇f`, is a vector-valued function whose components are the partial derivatives of `f` with respect to each dimension of the domain of `f`. | ||
|
||
|
||
### Jacobian: | ||
|
||
The Jacobian of a vector-valued function `f` is the matrix of `f`'s first-order partial derivatives. | ||
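
For a function `f` from `n` inputs to `m` outputs, this matrix can be written out as below. The dimensions `m` and `n` and the component names `f_i`, `x_j` are notation assumed here for illustration.

```math
J =
\begin{bmatrix}
\dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\
\vdots & \ddots & \vdots \\
\dfrac{\partial f_m}{\partial x_1} & \cdots & \dfrac{\partial f_m}{\partial x_n}
\end{bmatrix},
\qquad
J_{ij} = \frac{\partial f_i}{\partial x_j}
```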
|
||
### Jacobian Transpose Vector Product (j'vp): | ||
|
||
|
||
The product of the adjoint of the Jacobian and the vector in question. It is a description of the pullback in terms of its Jacobian. | ||
|
||
### Jacobian Vector Product (jvp): | ||
|
||
|
||
The product of the Jacobian and the vector in question. It is a description of the pushforward in terms of its Jacobian. | ||
|
||
|
||
### Primal: | ||
|
||
Something relating to the original problem, as opposed to relating to the derivative. In ChainRules, primals are types ("primal types"). | ||
|
||
|
||
### Pullback: | ||
|
||
`Pullback(f)` describes the sensitivity of a quantity to the input of `f` as a function of its sensitivity to the output of `f`. It can be represented as the product of the adjoint of the Jacobian (left) and a vector (right). | ||
|
||
|
||
### Pushforward: | ||
|
||
`Pushforward(f)` describes the sensitivity of the output of `f` as a function of the sensitivity of the input of `f`. It can be represented as the product of the Jacobian (left) and a vector (right). | ||
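
The Jacobian view of the pushforward and pullback can be sketched with a hand-written Jacobian. The function `f`, the helper `jacobian`, and the vectors `v` and `w` are assumptions made for this example; only Julia's standard `LinearAlgebra` library is used, not ChainRules itself.

```julia
using LinearAlgebra  # for `dot`

# Example function chosen only for illustration: f(x) = [x1 * x2, sin(x1)]
f(x) = [x[1] * x[2], sin(x[1])]

# Its Jacobian, written out by hand (hypothetical helper name).
jacobian(x) = [x[2] x[1]; cos(x[1]) 0.0]

x = [0.0, 2.0]
J = jacobian(x)  # [2.0 0.0; 1.0 0.0]

v = [1.0, 3.0]   # perturbation of the input
w = [5.0, 7.0]   # sensitivity with respect to the output

jvp = J * v      # pushforward applied to v: Jacobian (left), vector (right)
vjp = J' * w     # pullback applied to w: adjoint of the Jacobian (left), vector (right)

# The two maps are adjoint to each other: dot(w, J*v) equals dot(J'*w, v).
lhs = dot(w, jvp)
rhs = dot(vjp, v)
```

For real inputs `J'` is simply the transpose; for complex inputs it is the conjugate transpose.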
|
||
|
||
### R-rule: | ||
|
||
|
||
A function used in reverse-mode differentiation. For a given function `f`, it takes in the positional and keyword arguments of `f` and returns the primal result and the pullback. | ||
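
A minimal sketch of the shape described above, for scalar multiplication. The name `mul_rrule_sketch` is hypothetical; it follows this glossary entry's wording and is not meant to reproduce the actual `rrule` signature or return values in ChainRules.

```julia
# Hypothetical name; mirrors the glossary's description only.
function mul_rrule_sketch(a, b)
    y = a * b  # primal result
    # Maps the sensitivity of the output to the sensitivities of each input.
    pullback(ybar) = (ybar * b, ybar * a)
    return y, pullback
end

y, pullback = mul_rrule_sketch(3.0, 4.0)
abar, bbar = pullback(1.0)  # sensitivities with respect to a and b
```

The pullback closes over `a` and `b`, so the values from the primal computation stay available when the sensitivities are computed later.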
|
||
|
Is the adjoint always the conjugate transpose of the Jacobian specifically? In the below definitions which reference the adjoint, it's always the "adjoint of the Jacobian".
This makes me think the adjoint definition should be "The conjugate transpose of a matrix" and the subsequent definitions can refer to "the adjoint of the Jacobian".
Alternatively if, when we mention the adjoint, we're always talking about the adjoint of a Jacobian, this definition can stay as it is and in subsequent definitions we can just say "the adjoint".
Probably we want to title this "Adjoint of a function",
and start by saying that the adjoint of a matrix is another word for its conjugate transpose.
Then mention that it can also be applied to a linear operator, as every linear operator can be described as `y = Jx`, and that has an adjoint linear operator of `y' = x'J'`.
Then say that people say, as a shorthand/abuse of terminology, "the adjoint of a function",
when what they actually mean is a function which is the adjoint of the pushforward linear operator.
The pushforward linear operator is the linear operator that has the same Jacobian as the function at that point.
The pullback is its adjoint.
linearization of the function at a point, to get a linear operator (the pushforward), and then
Then say that people occasionally say the adjoint of a function,
when what they really mean is: the adjoint of the Jacobian of the function,
or they mean the pullback.
Sometimes people say adjoint of a function to mean pullback.
I totally agree this def isn't sufficient. The adjoint is super broad as a term, so I am having a hard time figuring out how much / how little to include