This is the sixth installment of a series documenting my progress on porting the Ceres solver API to Scala. All the sources quoted below are available on GitHub.
Autodiffin'it with spire
One of the most useful features of the Ceres solver is its support for the automatic differentiation (in forward mode) of vector cost functions of multiple vector arguments. This is implemented entirely caller-side with respect to the optimizer, through a pair of templated adapters (a high-level and a low-level one). They work together to "convert" a templated functor, written by the user, into a Ceres cost function. The functor can be called with one of two different types of arguments: ordinary double-valued parameter blocks for computing just the values of the residuals, and parameter blocks whose elements are Jet-valued for computing both the residuals and their partial derivatives with respect to the parameters.
- When the optimizer calls the adapter with a NULL jacobian matrix, the adapter chooses the double-only specialization of the user-provided cost functor and returns the plain residuals to the optimizer.
- Otherwise, the adapter packs the parameter values into Jets, calls the Jet-typed specialization of the functor, and then unpacks the result into residuals and jacobians.
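To make the two branches concrete, here is a minimal sketch in plain Scala, with no spire dependency. It is not the Ceres or spire implementation: `Dual` is a simplified, hypothetical stand-in for a Jet, `Num` is a made-up typeclass with only the operations this toy functor needs, and `functor` and `evaluate` are illustrative names.

```scala
object DualSketch {
  // A value plus its gradient with respect to n independent variables
  // (a simplified stand-in for a Jet).
  final case class Dual(value: Double, grad: Vector[Double])

  // Hypothetical typeclass: just the operations the functor below needs.
  trait Num[T] {
    def fromDouble(d: Double): T
    def plus(a: T, b: T): T
    def times(a: T, b: T): T
  }

  implicit val doubleNum: Num[Double] = new Num[Double] {
    def fromDouble(d: Double): Double = d
    def plus(a: Double, b: Double): Double = a + b
    def times(a: Double, b: Double): Double = a * b
  }

  // Dual arithmetic for n variables: sum rule and product rule.
  def dualNum(n: Int): Num[Dual] = new Num[Dual] {
    def fromDouble(d: Double): Dual = Dual(d, Vector.fill(n)(0.0))
    def plus(a: Dual, b: Dual): Dual =
      Dual(a.value + b.value, a.grad.zip(b.grad).map { case (x, y) => x + y })
    def times(a: Dual, b: Dual): Dual =
      Dual(a.value * b.value,
        a.grad.zip(b.grad).map { case (da, db) => da * b.value + a.value * db })
  }

  // User-style functor, written once and generic in T: f(x, y) = x * y + 3 * x
  def functor[T](p: Vector[T])(implicit N: Num[T]): T =
    N.plus(N.times(p(0), p(1)), N.times(N.fromDouble(3.0), p(0)))

  // Adapter: plain evaluation when no jacobian is requested,
  // otherwise seed the duals and unpack value and gradient.
  def evaluate(p: Vector[Double], wantJacobian: Boolean): (Double, Option[Vector[Double]]) =
    if (!wantJacobian) (functor[Double](p), None)
    else {
      val n = p.length
      // Variable i is seeded with the i-th unit vector as its gradient.
      val seeded = p.zipWithIndex.map { case (v, i) =>
        Dual(v, Vector.tabulate(n)(j => if (j == i) 1.0 else 0.0))
      }
      val out = functor[Dual](seeded)(dualNum(n))
      (out.value, Some(out.grad))
    }
}
```

Calling `DualSketch.evaluate(Vector(2.0, 5.0), wantJacobian = true)` evaluates the same functor body on seeded duals, so the gradient comes out of the arithmetic itself rather than from a separately written derivative.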
Unfortunately, none of these caller-side goodies can be used directly in our SWIG-based port, since SWIG cannot specialize the C++ templates for as-yet-undefined implementations of the cost functors, which are to be provided by the users (or, to repeat the usual mantra, Java generics are not the same as C++ templates). Sad!
However, we can take advantage of Ceres's clean separation of the autodiff implementation from the core optimizer code. Autodiff is math, and there are plenty of excellent numerical math libraries for the JVM. A particularly good one in the Scala world is spire, whose authors have gone to extreme lengths to provide powerful type-safe mathematical abstractions while sacrificing little or nothing in the way of execution performance. This article and this YouTube video provide good introductions to the library.
Some time ago, in preparation for this port, I contributed to spire an implementation of the Jet type that integrates seamlessly with the rest of spire's algebraic types as well as its generic implementations of the standard math libraries. Now we get to use it.
The basic idea is to replicate in a bit of Scala code the with/without jacobian code switch described above. Let's first define a CostFunctor base class, parametrized on the argument type, which the user can later override with their particular implementation of a residual.
This defines a function representing a cost term of dimension "kNumResiduals", dependent on a variadic number of parameter blocks of sizes N(0), N(1), etc., so that the parameter space is composed of N.length parameter blocks. The parameter blocks themselves are passed to the cost evaluation method as Arrays parametrized on a type T which, through the given typeclasses, allows for algebraic calculations on their components.
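For illustration only, here is a minimal sketch of what a base class of this shape might look like. It is an assumption about the general form, not the port's actual code: the `Num` typeclass is a tiny hypothetical stand-in for spire's algebraic typeclasses, and `Example` and `demo` are made-up names.

```scala
import scala.reflect.ClassTag

object CostFunctorSketch {
  // Hypothetical stand-in for the spire typeclasses mentioned in the text.
  trait Num[T] {
    def plus(a: T, b: T): T
    def times(a: T, b: T): T
  }

  implicit val doubleNum: Num[Double] = new Num[Double] {
    def plus(a: Double, b: Double): Double = a + b
    def times(a: Double, b: Double): Double = a * b
  }

  // kNumResiduals: dimension of the cost term.
  // N: sizes of the parameter blocks, one entry per block.
  abstract class CostFunctor(val kNumResiduals: Int, val N: Int*) {
    require(kNumResiduals > 0, "need at least one residual")
    require(N.nonEmpty && N.forall(_ > 0), "every parameter block needs a positive size")

    // One Array[T] per parameter block in, kNumResiduals values out.
    def apply[T](params: Array[T]*)(implicit num: Num[T], ct: ClassTag[T]): Array[T]
  }

  // Two residuals over two blocks of sizes 2 and 1:
  //   r0 = x0 + x1,  r1 = x0 * y0
  object Example extends CostFunctor(2, 2, 1) {
    def apply[T](params: Array[T]*)(implicit num: Num[T], ct: ClassTag[T]): Array[T] = {
      val x = params(0)
      val y = params(1)
      Array(num.plus(x(0), x(1)), num.times(x(0), y(0)))
    }
  }

  // Evaluate the example on plain doubles.
  def demo: Seq[Double] = Example(Array(1.0, 2.0), Array(4.0)).toSeq
}
```

Because `apply` is generic in `T`, the same body can be evaluated on plain doubles or on any other type that has a `Num` instance, which is what the double/Jet switch relies on.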
The conversion method "toAutodiffCostFunction" produces the actual auto-differentiable cost function. It is defined as follows:
There is quite a bit of code there, but it should be fairly easy to parse.
First I define, following Ceres, a SizedCostFunction that makes explicit the input and output dimensions of a CostFunction. Then I extend it into an AutodiffCostFunction, which is constructed from a given CostFunctor (whose dimensions are therefore known).
The evaluate method of AutodiffCostFunction implements the CostFunction interface. Note that it consists of two main branches, one for pure cost evaluation (no jacobians) and one for cost with partial derivatives. In the second case I "convert" the input parameter blocks into Jets set up for computing the associated derivatives, and then map the functor's output back to residuals and jacobians.
A bit of extra complication, present in the Ceres C++ code as well, is due to the fact that we want to differentiate a function of several separate blocks, whereas the Jet framework is designed for computing partial derivatives with respect to variables represented in a single container. Therefore we must "unroll" the blocks prior to calling the cost functor, and manage the variable indices accordingly.
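The index bookkeeping can be sketched in a few lines of plain Scala. This is an illustrative sketch rather than the port's code: `offsets`, `flatIndex` and `splitJacobian` are hypothetical helper names, and each residual's gradient is assumed to be a vector over all the unrolled variables. The per-block output follows Ceres's convention of one row-major kNumResiduals x N(b) jacobian per parameter block.

```scala
object UnrollSketch {
  // Starting index of each block in the flattened variable vector,
  // e.g. sizes (2, 3, 1) give offsets (0, 2, 5).
  def offsets(sizes: Seq[Int]): Vector[Int] =
    sizes.scanLeft(0)(_ + _).init.toVector

  // Flat variable index of element i of block b; this is the slot
  // that gets the unit seed when that element is packed.
  def flatIndex(sizes: Seq[Int], b: Int, i: Int): Int =
    offsets(sizes)(b) + i

  // grads(r) is the gradient of residual r over all flattened variables.
  // The result holds one row-major (numResiduals x N(b)) jacobian per block.
  def splitJacobian(sizes: Seq[Int], grads: Seq[Vector[Double]]): Vector[Vector[Double]] =
    offsets(sizes).zip(sizes).map { case (off, n) =>
      grads.toVector.flatMap(g => g.slice(off, off + n))
    }
}
```

For block sizes (2, 1) and two residuals with gradients (1, 2, 3) and (4, 5, 6), the first block's jacobian is (1, 2, 4, 5) and the second's is (3, 6).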
Testing it
We can test all the above by re-implementing in Scala the Powell function example from the Ceres distribution:
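As a reminder of what that example computes, here is a plain-double sketch of the four Powell residuals and the cost at the example's starting point. It is not the test from the port and involves neither the solver nor autodiff; `PowellSketch`, `residuals`, `cost` and `start` are illustrative names.

```scala
object PowellSketch {
  // Residuals of Powell's function, as in the Ceres example:
  //   f1 = x1 + 10 x2
  //   f2 = sqrt(5) (x3 - x4)
  //   f3 = (x2 - 2 x3)^2
  //   f4 = sqrt(10) (x1 - x4)^2
  def residuals(x1: Double, x2: Double, x3: Double, x4: Double): Vector[Double] = {
    val d23 = x2 - 2.0 * x3
    val d14 = x1 - x4
    Vector(
      x1 + 10.0 * x2,
      math.sqrt(5.0) * (x3 - x4),
      d23 * d23,
      math.sqrt(10.0) * d14 * d14
    )
  }

  // Ceres minimizes half the sum of the squared residuals.
  def cost(r: Seq[Double]): Double = 0.5 * r.map(v => v * v).sum

  // Starting point used by the Ceres example.
  val start: Vector[Double] = Vector(3.0, -1.0, 0.0, 1.0)

  def initialCost: Double = cost(residuals(start(0), start(1), start(2), start(3)))
}
```

At the starting point the squared residuals are 49, 5, 1 and 160, so the initial cost is 107.5; the minimum, with zero cost, is at the origin.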
Running it we get the same results as the C++ version:
As before, the timing does not reflect JIT optimizations.
Next steps
Getting auto-differentiation was the last nontrivial step of this port. From now on it's just "plumbing". My next post will point to a first version of the entire thing.