Sunday, January 31, 2016

On calling Ceres from Scala - 5.0: Autodiffin'it with spire

This is the sixth installment of a series documenting my progress on porting the Ceres solver API to Scala. All my sources quoted below are available on github.

Autodiffin'it with spire


One of the most useful features of the Ceres solver is its support for automatic differentiation (in forward mode) of vector cost functions of multiple vector arguments. This is implemented entirely caller-side with respect to the optimizer, through a pair of templated adaptors (one high-level, one low-level). They work together to "convert" a Ceres cost function into a user-written templated functor that can be called with one of two kinds of arguments: ordinary double-valued parameter blocks, for computing just the values of the residuals, or parameter blocks whose elements are Jet-valued, for computing both the residuals and their partial derivatives with respect to the parameters. 

So, the optimizer only "sees" an ordinary cost function, but:
  • When the optimizer calls it with a NULL jacobian matrix, the adaptor chooses the double-only specialization of the user-provided cost functor and returns the plain residuals to the optimizer. 
  • Otherwise, it packs the parameter values into Jets, calls the Jet-typed specialization of the functor, and then unpacks the result into residuals and jacobians.
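The Jet machinery is just forward-mode differentiation with dual numbers, so the pack/unpack trick above can be illustrated without any of the Ceres or spire types. Below is a minimal, self-contained sketch (a toy `Dual` type, not spire's actual Jet, which generalizes the derivative slot to a vector of partials): the same residual code is written once and evaluated twice, once for the value only and once with the derivative seeded.

```scala
// Minimal dual number: a value plus one derivative, propagated through
// arithmetic by the product and sum rules. A spire Jet is the same idea
// with a whole vector of partial derivatives in place of `d`.
case class Dual(v: Double, d: Double) {
  def +(that: Dual) = Dual(v + that.v, d + that.d)
  def -(that: Dual) = Dual(v - that.v, d - that.d)
  def *(that: Dual) = Dual(v * that.v, d * that.v + v * that.d)
}

object DualDemo {
  // A toy residual, f(x) = (x - 2)^2, written once for both call modes.
  def f(x: Dual): Dual = { val e = x - Dual(2.0, 0.0); e * e }

  def main(args: Array[String]): Unit = {
    val valueOnly = f(Dual(5.0, 0.0)) // derivative seed 0: value only
    val withDeriv = f(Dual(5.0, 1.0)) // derivative seed 1: value and f'(x)
    println(s"f(5) = ${valueOnly.v}, f'(5) = ${withDeriv.d}") // f(5) = 9.0, f'(5) = 6.0
  }
}
```

Seeding the derivative slot with 1 for the variable being differentiated (and 0 for everything else) is exactly what the adaptor does when it packs parameter values into Jets.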

Unfortunately, none of these caller-side goodies can be used directly in our SWIG-based port, since SWIG cannot specialize the C++ templates for the as-yet-undefined implementations of the cost functors to be provided by users (or, to repeat the usual mantra, Java generics are not the same as C++ templates). Sad!

However, we can take advantage of Ceres's clean separation of the autodiff implementation from the core optimizer code. Autodiff is math, and there are plenty of excellent numerical math libraries for the JVM. A particularly good one in the Scala world is spire, whose authors have gone to great lengths to provide powerful, type-safe mathematical abstractions while sacrificing little or nothing in execution performance. This article and this YouTube video provide good introductions to the library.

Some time ago, in preparation for this port, I contributed to spire an implementation of the Jet type that seamlessly integrates with the rest of spire's algebraic types as well as its generic implementations of the standard math libraries. Now we get to use it. 

The basic idea is to replicate in a bit of Scala code the with/without-jacobian code switch described above. Let's first define a CostFunctor base class, parametrized on the argument type, which the user can later extend with their particular implementation of a residual.

package org.somelightprojections.skeres

import scala.reflect.ClassTag
import spire.algebra.{Field, Trig}

// Generic auto-differentiable cost functor.
// kNumResiduals: output dimension.
// N: sequence of integers, respectively equal to the
//    dimensions of the parameter blocks.
abstract class CostFunctor(val kNumResiduals: Int, val N: Int*) {
  require(kNumResiduals > 0, s"Nonpositive number of residuals specified: $kNumResiduals")
  require(N.forall(_ > 0), s"Nonpositive parameter block sizes specified: ${N.mkString(", ")}")

  // Returns an auto-differentiable cost function computed through this functor.
  def toAutodiffCostFunction = AutodiffCostFunction(this)

  // Evaluates the functor on a sequence of N.length parameter blocks, respectively
  // of size N(k), for k = 0, 1, ..., N.length - 1.
  def apply[@specialized(Double) T: Field: Trig: ClassTag](x: Array[T]*): Array[T]
}

This defines a function representing a cost term of dimension kNumResiduals, dependent on a variadic number of parameter blocks of sizes N(0), N(1), etc., so that the parameter space is composed of N.length parameter blocks. The parameter blocks themselves are passed to the cost-evaluation method as Arrays parametrized on a type T which, through the given typeclasses, allows for algebraic calculations on their components.

The conversion method "toAutodiffCostFunction" produces the actual auto-differentiable cost function. It is defined as follows:

package org.somelightprojections.skeres

import com.google.ceres.CostFunction
import spire.implicits._
import spire.math.{Jet, JetDim}

// Sized cost function.
abstract class SizedCostFunction(val kNumResiduals: Int, val N: Int*) extends CostFunction {
  require(N.forall(_ >= 0), s"Negative block size detected. Block sizes are: ${N.mkString(", ")}")
  require(
    N.indices.tail.forall(i => N(i) == 0 || N(i - 1) > 0),
    "Zero block cannot precede a non-zero block. Block sizes are (ignore trailing 0's): " +
      N.mkString(", ")
  )
  setNumResiduals(kNumResiduals)
  N.foreach(mutableParameterBlockSizes.add)
}

// Auto-differentiable cost function.
case class AutodiffCostFunction(costFunctor: CostFunctor)
  extends SizedCostFunction(costFunctor.kNumResiduals, costFunctor.N: _*) {

  implicit val jetDimension = JetDim(costFunctor.N.sum)

  override def evaluate(
    parameters: DoublePointerPointer,
    residuals: DoublePointer,
    jacobians: DoublePointerPointer
  ): Boolean = {
    val numParameterBlocks = costFunctor.N.length
    if (jacobians.isNull) {
      val x = Array.ofDim[Array[Double]](numParameterBlocks)
      cforRange(0 until numParameterBlocks) { i =>
        x(i) = parameters.getRow(i).toArray(costFunctor.N(i))
      }
      val y: Array[Double] = costFunctor(x: _*)
      if (y.isEmpty) {
        false
      } else {
        residuals.copyFrom(y)
        true
      }
    } else {
      val jx = Array.fill[Array[Jet[Double]]](numParameterBlocks)(null)
      var k = 0
      cforRange(0 until numParameterBlocks) { i =>
        val n = costFunctor.N(i)
        val xi = parameters.getRow(i)
        val jxi = Array.ofDim[Jet[Double]](n)
        cforRange(0 until n) { j =>
          jxi(j) = Jet[Double](xi.get(j), k)
          k += 1
        }
        jx(i) = jxi
      }
      val jy: Array[Jet[Double]] = costFunctor(jx: _*)
      if (jy.isEmpty) {
        false
      } else {
        cforRange(0 until costFunctor.kNumResiduals) { i =>
          val jyi = jy(i)
          residuals.set(i, jyi.real)
          if (!jacobians.isNull && jacobians.hasRow(i)) jacobians.copyRowFrom(i, jyi.infinitesimal)
        }
        true
      }
    }
  }
}

There is quite a bit of code there, but it should be fairly easy to parse.

First I define, following Ceres, a SizedCostFunction that makes explicit the input and output dimensions of a CostFunction. Then I extend it into an AutodiffCostFunction, which is constructed from a given CostFunctor (and hence has known such dimensions).

The evaluate method of AutodiffCostFunction implements the CostFunction interface. Note that it consists of two main branches: one for pure cost evaluation (no jacobians), and one for cost with partial derivatives. In the second case I "convert" the input parameter blocks into Jets set up for computing the associated derivatives, and then map the functor's output back to residuals and jacobians.

A bit of extra complication, present in the Ceres C++ code as well, is due to the fact that we want to differentiate a function of several separate blocks, whereas the Jet framework is designed for computing partial derivatives with respect to variables represented in a single container. Therefore we must "unroll" the blocks prior to calling the cost functor, and manage the variable indices accordingly.
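To make that unrolling concrete, here is a self-contained sketch of just the packing step, with plain Scala arrays standing in for spire Jets (the `MiniJet` type and `seeded` helper are illustrative names, not part of the port): with blocks of sizes 2 and 1, the total derivative dimension is 3, and the j-th entry of block i gets a seed vector with a 1 at the running index k.

```scala
object BlockUnrollDemo {
  // A "jet" here is just a value plus a gradient seed of the total dimension.
  case class MiniJet(real: Double, infinitesimal: Array[Double])

  // Seeds the k-th of `dim` partials to 1: the identity-row trick used when
  // packing parameter values for forward-mode differentiation.
  def seeded(value: Double, k: Int, dim: Int): MiniJet = {
    val inf = Array.fill(dim)(0.0)
    inf(k) = 1.0
    MiniJet(value, inf)
  }

  // Unrolls several parameter blocks into jets, advancing a running index k
  // across block boundaries so each scalar variable owns one partial slot.
  def pack(blocks: Array[Array[Double]]): Array[Array[MiniJet]] = {
    val dim = blocks.map(_.length).sum
    var k = 0
    blocks.map { block =>
      block.map { v =>
        val jet = seeded(v, k, dim)
        k += 1
        jet
      }
    }
  }
}
```

After the functor runs on jets packed this way, the k-th entry of each output's infinitesimal vector is the partial derivative with respect to the k-th unrolled variable, which is why evaluate can copy `jyi.infinitesimal` straight into row i of the jacobians.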

Testing it


We can test all the above by re-implementing in Scala the Powell function example from the Ceres distribution:

package org.somelightprojections.skeres

import com.google.ceres._
import scala.reflect.ClassTag
import scala.{specialized => sp}
import spire.algebra._
import spire.implicits._
import spire.math._

object Powell {
  // f1 = x1 + 10 * x2
  object F1 extends CostFunctor(1, 1, 1) {
    override def apply[@sp(Double) T: Field: Trig: ClassTag](x: Array[T]*): Array[T] = {
      val x1 = x(0)
      val x2 = x(1)
      val y = x1(0) + 10.0 * x2(0)
      Array(y)
    }
  }

  // f2 = sqrt(5) * (x3 - x4)
  object F2 extends CostFunctor(1, 1, 1) {
    override def apply[@sp(Double) T: Field: Trig: ClassTag](x: Array[T]*): Array[T] = {
      val x3 = x(0)
      val x4 = x(1)
      val y = sqrt(5.0) * (x3(0) - x4(0))
      Array(y)
    }
  }

  // f3 = (x2 - 2 * x3)^2
  object F3 extends CostFunctor(1, 1, 1) {
    override def apply[@sp(Double) T: Field: Trig: ClassTag](x: Array[T]*): Array[T] = {
      val x2 = x(0)
      val x3 = x(1)
      val d = x2(0) - 2.0 * x3(0)
      val y = d * d
      Array(y)
    }
  }

  // f4 = sqrt(10) * (x1 - x4)^2
  object F4 extends CostFunctor(1, 1, 1) {
    override def apply[@sp(Double) T: Field: Trig: ClassTag](x: Array[T]*): Array[T] = {
      val x1 = x(0)
      val x4 = x(1)
      val d = x1(0) - x4(0)
      val y = sqrt(10.0) * d * d
      Array(y)
    }
  }

  def main(args: Array[String]): Unit = {
    ceres.initGoogleLogging("Powell")
    val initialX = Vector(3.0, -1.0, 0.0, 1.0)
    val x1 = new DoubleArray(1)
    x1.set(0, initialX(0))
    val x2 = new DoubleArray(1)
    x2.set(0, initialX(1))
    val x3 = new DoubleArray(1)
    x3.set(0, initialX(2))
    val x4 = new DoubleArray(1)
    x4.set(0, initialX(3))
    val loss = PredefinedLossFunctions.trivialLoss
    val problemOptions = new Problem.Options
    problemOptions.setCostFunctionOwnership(Ownership.DO_NOT_TAKE_OWNERSHIP)
    problemOptions.setLossFunctionOwnership(Ownership.DO_NOT_TAKE_OWNERSHIP)
    val problem = new Problem(problemOptions)
    problem.addResidualBlock(F1.toAutodiffCostFunction, loss, x1, x2)
    problem.addResidualBlock(F2.toAutodiffCostFunction, loss, x3, x4)
    problem.addResidualBlock(F3.toAutodiffCostFunction, loss, x2, x3)
    problem.addResidualBlock(F4.toAutodiffCostFunction, loss, x1, x4)
    val options = new Solver.Options
    options.setMinimizerType(MinimizerType.TRUST_REGION)
    options.setMinimizerProgressToStdout(true)
    options.setMaxNumIterations(100)
    options.setLinearSolverType(LinearSolverType.DENSE_QR)
    println(s"Initial: ${initialX.mkString(", ")}")
    val summary = new Solver.Summary
    ceres.solve(options, problem, summary)
    val xout = Vector(x1, x2, x3, x4).flatMap(_.toArray(1))
    println(summary.briefReport)
    println(s"Final: ${xout.map(x => "%.3g".format(x)).mkString(", ")}")
  }
}

Running it we get the same results as the C++ version:
Initial: 3.0, -1.0, 0.0, 1.0
iter cost cost_change |gradient| |step| tr_ratio tr_radius ls_iter iter_time total_time
0 1.055000e+02 0.00e+00 1.59e+02 0.00e+00 0.00e+00 1.00e+04 0 8.03e-02 8.90e-02
1 5.036813e+00 1.00e+02 2.00e+01 1.87e+00 9.52e-01 3.00e+04 1 5.32e-03 9.44e-02
2 3.148695e-01 4.72e+00 2.50e+00 7.76e-01 9.37e-01 9.00e+04 1 1.56e-03 9.60e-02
3 1.968157e-02 2.95e-01 3.13e-01 3.74e-01 9.37e-01 2.70e+05 1 1.17e-03 9.72e-02
4 1.230169e-03 1.85e-02 3.91e-02 1.87e-01 9.37e-01 8.10e+05 1 1.16e-03 9.84e-02
5 7.688901e-05 1.15e-03 4.89e-03 9.33e-02 9.37e-01 2.43e+06 1 1.18e-03 9.96e-02
6 4.805799e-06 7.21e-05 6.11e-04 4.67e-02 9.37e-01 7.29e+06 1 1.20e-03 1.01e-01
7 3.003810e-07 4.51e-06 7.63e-05 2.33e-02 9.37e-01 2.19e+07 1 1.23e-03 1.02e-01
8 1.877534e-08 2.82e-07 9.54e-06 1.17e-02 9.37e-01 6.56e+07 1 1.16e-03 1.03e-01
9 1.173586e-09 1.76e-08 1.19e-06 5.84e-03 9.37e-01 1.97e+08 1 1.09e-03 1.04e-01
10 7.335973e-11 1.10e-09 1.49e-07 2.92e-03 9.37e-01 5.90e+08 1 6.44e-04 1.05e-01
11 4.585868e-12 6.88e-11 1.86e-08 1.46e-03 9.37e-01 1.77e+09 1 6.46e-04 1.06e-01
12 2.866907e-13 4.30e-12 2.33e-09 7.31e-04 9.37e-01 5.31e+09 1 6.39e-04 1.06e-01
13 1.792435e-14 2.69e-13 2.91e-10 3.66e-04 9.37e-01 1.59e+10 1 6.61e-04 1.07e-01
Ceres Solver Report: Iterations: 14, Initial cost: 1.055000e+02, Final cost: 1.120790e-15, Termination: CONVERGENCE
Final: 0.000174, -1.74e-05, 2.34e-05, 5.24e-05
Process finished with exit code 0

As before, the timing does not reflect JIT optimizations.

Next steps


Getting auto-differentiation was the last nontrivial step of this port. From now on it's just "plumbing". My next post will point to a first version of the entire thing.
