(pyadjoint-extension-docs)= # Pyadjoint Utils CRIKit has extended Pyadjoint to add some new features. [Pyadjoint][pyadjoint] has its own set of documentation, so only these new features are documented here. ## Core additions ```{py:currentmodule} pyadjoint_utils ``` ### Jacobian assembly Pyadjoint currently supports computing the action of the Jacobian, adjoint Jacobian, and Hessian. However, some problems can be solved much more efficiently with access to the full Jacobian matrix. The {meth}`Block.evaluate_tlm_matrix` method is set up to propagate the Jacobian forward by multiplying Jacobian of its inputs with the Block's own Jacobian. The driver {func}`compute_jacobian_matrix` runs this computation on a {class}`Tape` for a given set of inputs and outputs. ### ReducedFunction The {class}`ReducedFunction` class generalizes Pyadjoint's {class}`~pyadjoint.ReducedFunctional` class to represent functions with multiple outputs. Instead of having a {meth}`~pyadjoint.ReducedFunctional.derivative` function, the ReducedFunction class has {meth}`~ReducedFunction.jac_action`, {meth}`~ReducedFunction.adj_jac_action`, and {meth}`~ReducedFunction.jac_matrix`. ### Solving A set of computations on a tape can be represented with a reduced function {math}`J(x)`. This idea can be extended to equations. The {class}`ReducedEquation` represents an equation {math}`J(x) = 0` possibly subject to some Dirichlet boundary conditions on {math}`x`. A {class}`ReducedEquation` is instantiated with a {class}`ReducedFunction`, a set of Dirichlet boundary condition functions, and a corresponding set of homogeneous Dirichlet boundary conditions. A Dirichlet boundary condition function takes the input {math}`x` and sets some of the components to fixed values. The corresponding homogeneous condition sets those components to zero. The {class}`SNESSolver` is currently the only solver implemented for ReducedEquations. It uses [PETSc][petsc]'s [System of Nonlinear Equations Solver]() (SNES). ## Module additions ### FEniCS ```{py:currentmodule} pyadjoint_utils.fenics_adjoint ``` The {mod}`~pyadjoint_utils.fenics_adjoint` module adds annotated functions to convert a FEniCS function to a NumPy array: {func}`function_get_local` and {func}`function_set_local`. These functions lets arbitrary operations be easily applied to FEniCS functions by doing the operation on the underlying array. ### JAX ```{py:currentmodule} pyadjoint_utils.jax_adjoint ``` The {mod}`pyadjoint_utils.jax_adjoint` module provides the {class}`ndarray` class, which is a Pyadjoint-overloaded object that wraps a {class}`jax.numpy.ndarray`, as well as the decorator {func}`overload_jax`, which provides access to {mod}`jax` 's JIT compilation and differentiation facilities. Usage looks like: ``` from jax import numpy as np import numpy as onp from pyadjoint_utils import array, overload_jax, ReducedFunction, taylor_test, Control from jax.tree_util import Partial as partial # or, from functools import partial #differentiate with respect to both non-self arguments. #Also, we want this function to recompute its internal linearization points #when differentiated, instead of storing them @partial(overload_jax,argnums=(0,1),checkpoint=True) def my_func(x, y): return np.exp(np.trace(x @ y)) inputs = (array(onp.random.randn(3,3)),array(onp.random.randn(3,3))) controls = [Control(x) for x in inputs] output = my_func(*inputs) rf = ReducedFunction(output,controls) h = [1.0e-1 * array(onp.random.randn(*x.shape) for x in inputs)] assert taylor_test(rf,inputs,h) >= 1.9 ``` #### Sharp Edges JAX is still a relatively new library and has some sharp edges to be aware of. A common error is of the form ``` ValueError: linearized function called on tangent values inconsistent with the original primal values: got ConcreteArray([[-0.65073081 -0.0755682 ] [ 0.07366651 -1.6313327 ]]) for primal aval ConcreteArray([[ 1.2067288 -0.5176796 ] [ 0.4400338 0.39490446]]) ``` This error is almost always either a result of the tangent values being of a different shape to the primal values, or, if the shape is the same (as is the case in the example error message shown above), the `dtype` is different. This can happen if, for example, you directly pass parameters obtained from an `init_params` function itself obtained from {mod}`jax.experimental.stax` to a {class}`crikit.cr.cr.CR`, since those parameters will be of dtype {class}`jax.numpy.float32`, whereas the default CRIKit arithmetic datatype is {class}`jax.numpy.float64`. ### TensorFlow ```{py:currentmodule} pyadjoint_utils.tensorflow_adjoint ``` The {mod}`~pyadjoint_utils.tensorflow_adjoint` module allows TensorFlow graphs to be represented as Blocks on the tape. The function {func}`run_tensorflow_graph` runs a given tensorflow graph and records the operation on the Pyadjoint tape, so that recomputation and adjoint computations can be computed with Pyadjoint ### NumPy ```{py:currentmodule} pyadjoint_utils.numpy_adjoint ``` The {mod}`~pyadjoint_utils.numpy_adjoint` module allows NumPy arrays to be recorded on the tape, which is necessary for using the new features in the two modules above. Additionally, Autograd functions can be overloaded using {func}`~pyadjoint_utils.numpy_adjoint.autograd.overload_autograd` to allow Pyadjoint to access Autograd derivatives. ## Developer Notes ```{py:currentmodule} pyadjoint_utils ``` ### Jacobian Matrix Structure For a multi-output function {math}`\mathbf{J}(\mathbf{x})`, the Jacobian {math}`\frac{d\mathbf{J}}{d\mathbf{x}}` is represented as a list of lists, where each lists has the Jacobians for one output of {math}`\mathbf{J}`. For example, for a function with two outputs and three inputs, the lists will look like the following: ```{math} \frac{d\mathbf{J}}{d\mathbf{x}} = \begin{bmatrix} [ & \frac{dJ_1}{dx_1} & \frac{dJ_1}{dx_2} & \frac{dJ_1}{dx_3} & ], \\ [ & \frac{dJ_2}{dx_1} & \frac{dJ_2}{dx_2} & \frac{dJ_2}{dx_3} & ], \end{bmatrix} ``` Each {math}`\frac{dJ_i}{dx_j}` is a NumPy array with shape {code}`(*J_i.shape, *x_j.shape)`. ### Jacobian Matrix Accumulation The computation of the Jacobian matrix on a tape has each block multiply an incoming Jacobian with its own Jacobian. This is most easily understood with an example. Let {math}`\mathbf{x}` be a set of inputs and {math}`\mathbf{g}(\mathbf{y})` be a block on the tape. The {math}`\mathbf{g}` block knows its own Jacobian {math}`\frac{d\mathbf{g}}{d\mathbf{y}}`, and it must compute the Jacobian {math}`\frac{d\mathbf{g}}{d\mathbf{x}}` given the Jacobian {math}`\frac{d\mathbf{y}}{d\mathbf{x}}`. ```{math} \frac{d\mathbf{g}}{d\mathbf{x}} = \frac{d\mathbf{g}}{d\mathbf{y}} \frac{d\mathbf{y}}{d\mathbf{x}} ``` Each individual output Jacobian can be computed with the following formula: ```{math} \frac{dg_i}{dx_j} = \sum_k \frac{dg_i}{dy_k} \frac{dy_k}{dx_j} ``` Here, {math}`\frac{dg_i}{dy_k}` is a NumPy array with shape {code}`(*g_i.shape, *y_k.shape)`. and {math}`\frac{dy_k}{dx_j}` is a NumPy array with shape {code}`(*y_k.shape, *x_j.shape)`. These are tensors that must be contracted over the {}`y_k` dimensions. In NumPy, this contraction can be done with the following code: ```{code} np.tensordot(dgi_dyk, dyk_dxj, axes=len(yk.shape)) ``` The Jacobian {math}`\frac{dg_i}{d\mathbf{x}}` for each output variable {math}`g_i` is a list recorded in the corresponding BlockVariable instance using the {meth}`BlockVariable.add_tlm_matrix` function. The {meth}`Block.evaluate_tlm_matrix_component` should return the Jacobian (as a list) of one of the outputs of the block. [pyadjoint]: http://www.dolfin-adjoint.org/en/release/ [petsc]: https://www.mcs.anl.gov/petsc/