The following section, along with [[emmy.collection]] and [[emmy.differential]], rounds out the implementations of [[d/IPerturbed]] for native Clojure(script) data types. The function implementation is subtle, as described by Manzyuk et al. 2019. ([[emmy.calculus.derivative-test]], in the "Amazing Bug" sections, describes the pitfalls at length.)
[[emmy.differential]] describes how each in-progress perturbed variable in a derivative is assigned a "tag" that accumulates the variable's partial derivative.
How do we interpret the case where `((D f) x)` produces a function?
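Manzyuk et al. 2019 extends `D` to functions `f` of type $\mathbb{R}^n \rightarrow \alpha$, where

$$\alpha ::= \mathbb{R}^m \mid \alpha_1 \rightarrow \alpha_2$$

By viewing

- `f` as a (maybe curried) multivariable function that eventually must produce an $\mathbb{R}^m$
- the derivative `(D f)` as the partial derivative with respect to the first argument of `f`

A 3-level nest of functions will respond to `D` just like the flattened, non-higher-order version would respond to `(partial 0)`. In other words, for the curried product `f` defined as `(fn [x] (fn [y] (fn [z] (g/* x y z))))`, the forms `((((D f) 'x) 'y) 'z)` and `(((partial 0) g/*) 'x 'y 'z)` should evaluate to equivalent results, namely `(* y z)`.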
To `extract-tangent` from a function, we need to compose the `extract-tangent` operation with the returned function.
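As a rough illustration of that composition, here is a toy sketch. It is not Emmy's implementation: it models a single perturbation with a hypothetical `Dual` record and uses no tags at all. The names `Dual`, `lift`, `mul`, `D` and `partial0` are stand-ins defined here only for the example. Extracting the tangent from a function just composes `extract-tangent` with it, so a curried product responds to `D` the same way the flat product responds to a partial derivative in its first argument.

```clojure
;; Toy model (assumption, not Emmy's types): a value with one perturbation.
(defrecord Dual [primal tangent])

(defn lift
  "Promote a plain number to a Dual with zero tangent."
  [x]
  (if (instance? Dual x) x (->Dual x 0)))

(defn mul
  "Product rule on Dual values."
  [a b]
  (let [a (lift a), b (lift b)]
    (->Dual (* (:primal a) (:primal b))
            (+ (* (:primal a) (:tangent b))
               (* (:tangent a) (:primal b))))))

(defn extract-tangent
  "For a function, compose extraction with the function, so that extraction
  happens once the function finally returns a non-function value."
  [x]
  (cond (fn? x)            (comp extract-tangent x)
        (instance? Dual x) (:tangent x)
        :else              0))

(defn D
  "Derivative with respect to the (single) argument of `f`."
  [f]
  (fn [x] (extract-tangent (f (->Dual x 1)))))

(defn partial0
  "Partial derivative of a flat, multi-argument `g` in its first argument."
  [g]
  (fn [x & more]
    ((D (fn [x'] (apply g x' more))) x)))

(def curried (fn [x] (fn [y] (fn [z] (mul x (mul y z))))))
(def flat    (fn [x y z] (mul x (mul y z))))

(def curried-result ((((D curried) 3) 4) 5))
(def flat-result    ((partial0 flat) 3 4 5))
```

Both results equal `y * z`, here 20. Because this sketch has no tags, it cannot tell nested derivatives apart; that is the problem the rest of this section addresses.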
The returned function needs to capture an internal reference to the original [[d/Differential]] input. This is true for any Functor-shaped return value, like a structure or Map. However! There is a subtlety present with functions that's not present with vectors or other containers.
The difference with functions is that they take inputs. If you contrive a situation where you can feed the original captured [[d/Differential]] into the returned function, this can trigger "perturbation confusion", where two different layers try to extract the tangent corresponding to the SAME tag, and one is left with nothing.
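If you engineer an example (see [[emmy.calculus.derivative-test/amazing-bug]]) where:

- the returned function takes another function, which then receives the closed-over `x` as an argument
- you pass this function to itself, so that the closed-over `x` instances can both be multiplied

Then your program isn't going to make any distinction between the instances of `x`. They're both references to the same value.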
HOWEVER! `((D f) x)` returns a function which, when you eventually provide all arguments, will return the sensitivity of `f` to the first argument `x`.
If you perform the trick above, pass `((D f) x)` into itself, and the `x` instances meet (multiply, say), should the final return value treat them as the same instance?
Manzyuk et al. say NO! If `((D f) x)` returns a function, that function closes over:

- the value of `x`
- an intention to start the derivative-taking process on that isolated copy of `x` once the final argument is supplied.

How does the implementation keep the values separate?
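The key to the solution lives in [[extract-tangent-fn]], called on the result of `((D f) x)` when `((D f) x)` produces a function. We have to armor the returned function so that it extracts the originally injected tag when someone eventually calls it, and so that any tags in a new [[d/Differential]] instance passed in by a caller survive on their way back out, even if they happen to include the originally injected tag.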
We do this by:

- replacing any instance of the original `tag` in the returned function's arguments with a temporary tag (let's call it `fresh`)
- calling the function and extracting the tangent component associated with `tag`, as requested (note now that the only instances of `tag` that can appear in the result come from variables captured in the function's closure)
- remapping `fresh` back to `tag` inside the remaining [[d/Differential]] instance.

This last step ensures that any tangent tagged with `tag` in the input can make it back out without tangling with closure-captured `tag` instances that some higher level might want.
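The three steps above can be sketched with a toy model. Everything here is an assumption made for illustration and is not Emmy's code: a differential is a plain map from a set of tags to a coefficient, and `naive-extract`, `armored-extract`, `tangent`, `s` and `cube` are hypothetical names. Emmy's actual [[extract-tangent-fn]] may do more work than this sketch. The example reproduces the shape of the "Amazing Bug": `s-hat` behaves like a derivative operator, so applying it twice to `cube` should give the second derivative.

```clojure
(require '[clojure.set :as set])

(def tag-counter (atom 0))
(defn fresh-tag [] (swap! tag-counter inc))

(defn ->terms [x] (if (map? x) x {#{} x}))

(defn simplify
  "Drop zero terms; collapse a tag-free differential to a plain number."
  [terms]
  (let [terms (into {} (remove (fn [[_ c]] (zero? c))) terms)]
    (cond (empty? terms) 0
          (and (= 1 (count terms)) (contains? terms #{})) (terms #{})
          :else terms)))

(defn add [a b] (simplify (merge-with + (->terms a) (->terms b))))

(defn mul
  "Multiply term by term; a tag squared is zero, so overlapping terms vanish."
  [a b]
  (simplify
   (apply merge-with +
          (for [[ta ca] (->terms a)
                [tb cb] (->terms b)
                :when (empty? (set/intersection ta tb))]
            {(set/union ta tb) (* ca cb)}))))

(defn replace-tag
  "Rename `old` to `new`. For a function, rename in its output, after first
  hiding any `old` in its arguments behind a fresh tag."
  [x old new]
  (cond (fn? x)  (fn [& args]
                   (let [fresh (fresh-tag)
                         args  (map #(replace-tag % old fresh) args)]
                     (-> (apply x args)
                         (replace-tag old new)
                         (replace-tag fresh old))))
        (map? x) (simplify
                  (into {} (map (fn [[tags c]]
                                  [(if (tags old) (-> tags (disj old) (conj new)) tags) c]))
                        x))
        :else x))

(defn tangent
  "Tangent component of a number or differential with respect to `tag`."
  [x tag]
  (if (map? x)
    (simplify (into {} (keep (fn [[tags c]] (when (tags tag) [(disj tags tag) c]))) x))
    0))

(defn naive-extract [x tag]
  (if (fn? x)
    (fn [& args] (naive-extract (apply x args) tag))
    (tangent x tag)))

(defn armored-extract [x tag]
  (if (fn? x)
    (fn [& args]
      (let [fresh (fresh-tag)]
        (-> (apply x (map #(replace-tag % tag fresh) args)) ; step 1: tag -> fresh in args
            (armored-extract tag)                           ; step 2: extract tag
            (replace-tag fresh tag))))                      ; step 3: fresh -> tag
    (tangent x tag)))

(defn D [extract f]
  (fn [x]
    (let [tag (fresh-tag)]
      (extract (f (add x {#{tag} 1})) tag))))

(defn cube [x] (mul x (mul x x)))
(defn s [x] (fn [g] (fn [y] (g (add x y)))))

(def naive-s-hat   ((D naive-extract s) 0))
(def armored-s-hat ((D armored-extract s) 0))

(def naive-result   ((naive-s-hat (naive-s-hat cube)) 3))
(def armored-result ((armored-s-hat (armored-s-hat cube)) 3))
```

In this sketch the naive version returns 0, because the inner call consumes the tangent that the outer call also wanted. The armored version returns 18, the second derivative of `cube` at 3.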
NOTE: the tag-remapping that the docstring for `extract-tangent-fn` describes might also have to apply to a functional argument!
`replace-tag` on a function is meant to be a `replace-tag` call applied to the function's output. To prevent perturbation confusion inside the function, we perform a similar remapping of any occurrence of `tag` in the function's arguments.
The implementation for functions handles functions, multimethods, and, in ClojureScript, [[MetaFn]] instances. Metadata in the original function is preserved through tag replacement and extraction.
These functions put together the pieces laid out in [[emmy.differential]] and declare an interface for taking derivatives.
The result of applying the derivative `(D f)` of a multivariable function `f` to a sequence of `args` is a structure of the same shape as `args` with all orientations flipped. (For a partial derivative like `((partial 0 1) f)` the result has the same-but-flipped shape as `(get-in args [0 1])`.)
`args` is coerced into an `up` structure. The only special case where this does not happen is if `(= 1 (count args))`.
To generate the result:

- for a single non-structural argument, return `(derivative f)`
- otherwise, bind a structure of all arguments to `xs`
- make a new structure `xs'` by replacing each entry in `xs` with `((derivative f') entry)`, where `f'` is a function of ONLY that entry that calls `(f (assoc-in xs path entry))`. In other words, replace each entry with the result of the partial derivative of `f` at only that entry.
- return `(s/transpose xs')` (the same structure with all orientations flipped).

A multivariable derivative is a multiple-arity function that performs the above.
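The entry-by-entry replacement can be sketched for a flat vector input. This is a toy under stated assumptions, not Emmy's [[jacobian]]: `Dual`, `lift`, `mul`, `derivative` and `gradient` are hypothetical helpers defined only for this example, and plain Clojure vectors carry no up/down orientation, so the final transpose step is not modeled.

```clojure
;; Toy single-perturbation dual number (assumption, not Emmy's Differential).
(defrecord Dual [primal tangent])

(defn lift [x] (if (instance? Dual x) x (->Dual x 0)))

(defn mul [a b]
  (let [a (lift a), b (lift b)]
    (->Dual (* (:primal a) (:primal b))
            (+ (* (:primal a) (:tangent b))
               (* (:tangent a) (:primal b))))))

(defn derivative
  "Derivative of a one-argument, number-valued function."
  [f]
  (fn [x]
    (let [y (f (->Dual x 1))]
      (if (instance? Dual y) (:tangent y) 0))))

(defn gradient
  "Replace each entry of `xs` with the derivative of `f` taken with respect
  to ONLY that entry, holding the other entries fixed."
  [f]
  (fn [xs]
    (vec (map-indexed
          (fn [i entry]
            (let [f' (fn [e] (f (assoc-in xs [i] e)))]
              ((derivative f') entry)))
          xs))))

;; f(x, y) = x^2 * y, so the partials are 2xy and x^2.
(def f (fn [[x y]] (mul x (mul x y))))

(def result ((gradient f) [3 4]))
```

At `[3 4]` the partials are `2 * 3 * 4 = 24` and `3^2 = 9`, so `result` is `[24 9]`.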
[[jacobian]] handles this main logic. [[jacobian]] can only take a structural input. [[euclidean]] and [[multivariate]] below widen this to handle, respectively, optionally-structural and multivariable arguments.
[[g/partial-derivative]] is meant to produce either a full Jacobian or some entry specified by a `selectors` vector.
When called on a function `f`, [[g/partial-derivative]] returns a function wrapped in the machinery provided by [[multivariate]]; this allows the same operator to serve functions of a single numerical input, a single structural input, or multiple numerical or structural inputs.
NOTE: The reason that this implementation is also installed for [[emmy.structure/Structure]] is that structures act as functions that apply their args to every (functional) entry. Calling `(multivariate structure selectors)` allows all of the machinery that handles structure-walking and argument conversion to run a SINGLE time before getting passed to the structure of functions, instead of separately for every entry in the structure.
TODO: I think this is going to cause problems for, say, a Structure of PowerSeries, where there is actually a cheap `g/partial-derivative` implementation for the components. I vote to back out this `::s/structure` installation.
This section exposes various differential operators as [[o/Operator]] instances.
Functions that make use of the differential operators defined above in standard ways.