(pyadjoint-extension-docs)=

# Pyadjoint Utils

CRIKit has extended Pyadjoint to add some new features. [Pyadjoint][pyadjoint] has its own
set of documentation, so only these new features are documented here.

## Core additions

```{py:currentmodule} pyadjoint_utils

```

### Jacobian assembly

Pyadjoint currently supports computing the action of the Jacobian, adjoint
Jacobian, and Hessian. However, some problems can be solved much more
efficiently with access to the full Jacobian matrix. The
{meth}`Block.evaluate_tlm_matrix` method is set up to propagate the Jacobian
forward by multiplying Jacobian of its inputs with the Block's own Jacobian. The
driver {func}`compute_jacobian_matrix` runs this computation on a
{class}`Tape` for a given set of inputs and outputs.

### ReducedFunction

The {class}`ReducedFunction` class generalizes Pyadjoint's
{class}`~pyadjoint.ReducedFunctional` class to represent functions with multiple
outputs. Instead of having a {meth}`~pyadjoint.ReducedFunctional.derivative`
function, the ReducedFunction class has {meth}`~ReducedFunction.jac_action`,
{meth}`~ReducedFunction.adj_jac_action`, and {meth}`~ReducedFunction.jac_matrix`.

### Solving

A set of computations on a tape can be represented with a reduced function
{math}`J(x)`. This idea can be extended to equations. The
{class}`ReducedEquation` represents an equation {math}`J(x) = 0` possibly
subject to some Dirichlet boundary conditions on {math}`x`.

A {class}`ReducedEquation` is instantiated with a {class}`ReducedFunction`, a
set of Dirichlet boundary condition functions, and a corresponding set of
homogeneous Dirichlet boundary conditions. A Dirichlet boundary condition
function takes the input {math}`x` and sets some of the components to fixed
values. The corresponding homogeneous condition sets those components to zero.

The {class}`SNESSolver` is currently the only solver implemented for
ReducedEquations. It uses [PETSc][petsc]'s [System of Nonlinear Equations Solver](<https://www.mcs.anl.gov/petsc/petsc-current/docs/manualpages/SNES/SNES.html>) (SNES).

## Module additions

### FEniCS

```{py:currentmodule} pyadjoint_utils.fenics_adjoint

```

The {mod}`~pyadjoint_utils.fenics_adjoint` module adds annotated functions to
convert a FEniCS function to a NumPy array: {func}`function_get_local` and {func}`function_set_local`. These functions lets arbitrary
operations be easily applied to FEniCS functions by doing the operation on the
underlying array.

### JAX

```{py:currentmodule} pyadjoint_utils.jax_adjoint

```

The {mod}`pyadjoint_utils.jax_adjoint` module provides the {class}`ndarray`
class, which is a Pyadjoint-overloaded object that wraps a
{class}`jax.numpy.ndarray`, as well as the decorator {func}`overload_jax`,
which provides access to {mod}`jax` 's JIT compilation and differentiation
facilities. Usage looks like:

```
from jax import numpy as np
import numpy as onp
from pyadjoint_utils import array, overload_jax, ReducedFunction, taylor_test, Control
from jax.tree_util import Partial as partial # or, from functools import partial

#differentiate with respect to both non-self arguments.
#Also, we want this function to recompute its internal linearization points
#when differentiated, instead of storing them
@partial(overload_jax,argnums=(0,1),checkpoint=True)
def my_func(x, y):
    return np.exp(np.trace(x @ y))


inputs = (array(onp.random.randn(3,3)),array(onp.random.randn(3,3)))
controls = [Control(x) for x in inputs]
output = my_func(*inputs)
rf = ReducedFunction(output,controls)
h = [1.0e-1 * array(onp.random.randn(*x.shape) for x in inputs)]
assert taylor_test(rf,inputs,h) >= 1.9
```

#### Sharp Edges

JAX is still a relatively new library and has some sharp edges to be aware of. A common error is of the form

```
ValueError: linearized function called on tangent values inconsistent with the original primal values: got ConcreteArray([[-0.65073081 -0.0755682 ]
           [ 0.07366651 -1.6313327 ]]) for primal aval ConcreteArray([[ 1.2067288  -0.5176796 ]
                                                                      [ 0.4400338   0.39490446]])
```

This error is almost always either a result of the tangent values being of a different shape to the primal values, or,
if the shape is the same (as is the case in the example error message shown above), the `dtype` is different.
This can happen if, for example, you directly pass parameters obtained from an `init_params` function itself obtained
from {mod}`jax.experimental.stax` to a {class}`crikit.cr.cr.CR`, since those parameters will be of dtype
{class}`jax.numpy.float32`, whereas the default CRIKit arithmetic datatype is {class}`jax.numpy.float64`.

### TensorFlow

```{py:currentmodule} pyadjoint_utils.tensorflow_adjoint

```

The {mod}`~pyadjoint_utils.tensorflow_adjoint` module allows TensorFlow
graphs to be represented as Blocks on the tape. The function
{func}`run_tensorflow_graph` runs a given tensorflow graph and records the
operation on the Pyadjoint tape, so that recomputation and adjoint computations
can be computed with Pyadjoint

### NumPy

```{py:currentmodule} pyadjoint_utils.numpy_adjoint

```

The {mod}`~pyadjoint_utils.numpy_adjoint` module allows NumPy arrays to be
recorded on the tape, which is necessary for using the new features in the two
modules above. Additionally, Autograd functions can be overloaded using
{func}`~pyadjoint_utils.numpy_adjoint.autograd.overload_autograd` to allow
Pyadjoint to access Autograd derivatives.

## Developer Notes

```{py:currentmodule} pyadjoint_utils

```

### Jacobian Matrix Structure

For a multi-output function {math}`\mathbf{J}(\mathbf{x})`, the Jacobian {math}`\frac{d\mathbf{J}}{d\mathbf{x}}` is represented as a list of lists, where each lists has the Jacobians for one output of {math}`\mathbf{J}`. For example, for a function with two outputs and three inputs, the lists will look like the following:

```{math}

\frac{d\mathbf{J}}{d\mathbf{x}} = \begin{bmatrix}
[ & \frac{dJ_1}{dx_1} & \frac{dJ_1}{dx_2} & \frac{dJ_1}{dx_3} & ], \\
[ & \frac{dJ_2}{dx_1} & \frac{dJ_2}{dx_2} & \frac{dJ_2}{dx_3} & ],
\end{bmatrix}

```

Each {math}`\frac{dJ_i}{dx_j}` is a NumPy array with shape {code}`(*J_i.shape, *x_j.shape)`.

### Jacobian Matrix Accumulation

The computation of the Jacobian matrix on a tape has each block multiply an
incoming Jacobian with its own Jacobian. This is most easily understood with an
example. Let {math}`\mathbf{x}` be a set of inputs and
{math}`\mathbf{g}(\mathbf{y})` be a block on the tape. The {math}`\mathbf{g}`
block knows its own Jacobian {math}`\frac{d\mathbf{g}}{d\mathbf{y}}`, and it
must compute the Jacobian {math}`\frac{d\mathbf{g}}{d\mathbf{x}}` given the
Jacobian {math}`\frac{d\mathbf{y}}{d\mathbf{x}}`.

```{math}

\frac{d\mathbf{g}}{d\mathbf{x}} = \frac{d\mathbf{g}}{d\mathbf{y}} \frac{d\mathbf{y}}{d\mathbf{x}}

```

Each individual output Jacobian can be computed with the following formula:

```{math}

\frac{dg_i}{dx_j} = \sum_k \frac{dg_i}{dy_k} \frac{dy_k}{dx_j}

```

Here, {math}`\frac{dg_i}{dy_k}` is a NumPy array with shape {code}`(*g_i.shape,
*y_k.shape)`. and {math}`\frac{dy_k}{dx_j}` is a NumPy array with shape
{code}`(*y_k.shape, *x_j.shape)`. These are tensors that must be contracted over
the {}`y_k` dimensions. In NumPy, this contraction can be done with the following code:

```{code}

np.tensordot(dgi_dyk, dyk_dxj, axes=len(yk.shape))

```

The Jacobian {math}`\frac{dg_i}{d\mathbf{x}}` for each output variable
{math}`g_i` is a list recorded in the corresponding BlockVariable instance using
the {meth}`BlockVariable.add_tlm_matrix` function. The
{meth}`Block.evaluate_tlm_matrix_component` should return the Jacobian (as a
list) of one of the outputs of the block.

[pyadjoint]: http://www.dolfin-adjoint.org/en/release/

[petsc]: https://www.mcs.anl.gov/petsc/