Saturday, January 9, 2016

On calling Ceres from Scala - 3.0: Who owns what?

This is the third installment of a series documenting my progress on porting the Ceres solver API to Scala. All my sources quoted below are available on GitHub.


Who owns what?


The pesky issue of memory management rears its head whenever code must cross the boundary between the JVM and native code. Items to be dealt with include:
  • Are copies of the same data kept on both sides? If yes, are we wasting valuable RAM (thus handicapping our ability to scale to large problems)? If no, are we wasting computing time making repeated expensive calls across the JNI boundary?
  • If memory is heap-allocated on the native side to hold the data, is it properly reclaimed to avoid leaks - including in the case of failures on the native side that leave the JVM side running?
  • If the native side accesses memory allocated by the JVM, are the Java-side references reachable throughout the runtime of the native code? Otherwise the latter may try to access memory that's already been garbage-collected.
  • Is the JVM properly initialized? Depending on the expected data split between the Java and native sides of the program, the max heap size will need to be properly sized with respect to the overall max process size.
SWIG is actually pretty good at keeping track of C++ allocations caused by Java, in that Java classes wrapping C++ ones are generated with a finalize method that calls operator delete on the pointer to the wrapped C++ instance. Therefore garbage collection will help avoid leaks (with the usual caveat that there are no guarantees on when finalizers are run). There are some subtleties to take into account; see the SWIG documentation for more.
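To make the ownership idea concrete, here is a minimal sketch of the shape such a generated wrapper takes: a raw pointer, an ownership flag, an explicit delete(), and a finalizer as a safety net. The field names follow what SWIG-generated Java proxies typically use, but FakeNativeHeap and NativeProxy are hypothetical stand-ins so the example runs without any native code; a real proxy would call into the generated JNI class instead.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the C++ heap, so the sketch runs without native code.
final class FakeNativeHeap {
    private static final Set<Long> live = new HashSet<>();
    private static long nextHandle = 1;

    private FakeNativeHeap() {}

    // Plays the role of `new T` on the C++ side; returns an opaque handle.
    static synchronized long allocate() {
        long handle = nextHandle++;
        live.add(handle);
        return handle;
    }

    // Plays the role of the generated native call that deletes the C++ object.
    static synchronized void free(long handle) {
        if (!live.remove(handle)) {
            throw new IllegalStateException("double free of handle " + handle);
        }
    }

    static synchronized int liveCount() {
        return live.size();
    }
}

// Mirrors the shape of a SWIG-style proxy: a raw pointer plus an ownership flag.
class NativeProxy {
    private long swigCPtr;        // address of the wrapped C++ object, 0 once released
    private boolean swigCMemOwn;  // true if this proxy is responsible for deleting it

    NativeProxy() {
        this(FakeNativeHeap.allocate(), true);
    }

    NativeProxy(long cPtr, boolean cMemoryOwn) {
        swigCPtr = cPtr;
        swigCMemOwn = cMemoryOwn;
    }

    // Deterministic release; safe to call more than once.
    public synchronized void delete() {
        if (swigCPtr != 0) {
            if (swigCMemOwn) {
                swigCMemOwn = false;
                FakeNativeHeap.free(swigCPtr);
            }
            swigCPtr = 0;
        }
    }

    // Safety net only: the JVM gives no guarantee on when, or whether, this runs.
    @SuppressWarnings({"deprecation", "removal"})
    @Override
    protected void finalize() {
        delete();
    }
}
```

Calling delete() explicitly as soon as a wrapper is no longer needed sidesteps the finalizer timing caveat. A proxy built with the ownership flag set to false leaves the native object alone, which is the case to watch for when C++ hands out pointers it still owns.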

In our case, the chunks of memory of interest are:
  1. The parameter blocks, altogether comprising the state vector optimized by Ceres.
  2. Data used to compute the cost function for each residual block.
  3. The Jacobian matrices.
We really have no control over (3), short of rewriting big chunks of Ceres code, so storage for the Jacobians will be C++ owned, and we'll need to copy their contents when required for the computation of the cost function derivatives JVM-side.

For (2) the design choice is a no-brainer: these data are only "felt" by Ceres indirectly through the CostFunction interface, but they are not actually accessed by the library itself. Therefore there is really no reason for shifting their ownership to C++: the JVM can keep them just fine, thank you very much and see y'all later.

We have some design freedom for (1), but it is rather easy to see that having that memory held by Java leads to significant code complexity for little gain. The parameter blocks need to be manipulated at a low level by the Ceres solver, directly through pointers. If their memory were held by Java, we'd need a shim of C code, including JNI env calls to GetArrayLength and GetDoubleArrayElements, to get the pointers to the Java arrays in C++. In some JVM implementations these calls actually cause the memory to be duplicated, but the saner ones prefer to "pin" the memory buffers to prevent the garbage collector from touching them. The logical place for such shims would be in the wrappers for Ceres's Problem::AddParameterBlock and Problem::AddResidualBlock. So we'd need to either explicitly subclass Problem in C++ for the purposes of this port, or wrap it in a Decorator, or take advantage of SWIG's %extend mechanism (which amounts to the same thing).

However, we'd also need to release the Java memory from the C side every time we call the cost function, as well as when the solver is done, through a JNI env call to ReleaseDoubleArrayElements. If we didn't, Java would either never "see" the result of the solver updates to the state (if the JVM copied the buffers), or never be able to GC the parameter blocks (if they were pinned), thus leaking memory. Therefore we'd need another C shim, to be called during the optimization and at its end. The shim would need to be wrapped in Java, and there really isn't an obvious place for it. For the end-of-optimization part, we could perhaps place it in the destructor of our Problem subclass (to be called by the finalizer of its Java wrapper), but for this to work the subclass instance would need to keep track of all the parameter blocks it has handled, and this bookkeeping would further complicate the implementation of the AddParameterBlock/AddResidualBlock overloads.

All in all, it seems much saner to let C++ own the parameter block memory, and just copy it out to Java when we are done. SWIG offers a nice helper for this approach in its carrays.i library: it allows one to allocate an array C-side and get an opaque pointer-like Java object for it, plus methods for reading from and writing to it. These methods cross the JNI boundary, so they are a not-so-cheap addition to the cost function evaluation, but this seems an acceptable price to pay at least for now.
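The copy-in/copy-out pattern this implies can be sketched as follows. The four helper names are the ones carrays.i generates for %array_functions(double, doubleArray), as described in the SWIG library documentation. Everything else is an assumption made for illustration: FakeCArrays simulates those helpers in pure Java, DoublePtr stands in for SWIG's opaque pointer type, and ParameterBlockCopy is a hypothetical helper, not code from this port. The crossings counter is there only to make the per-element JNI cost visible.

```java
// Hypothetical stand-in for the helpers generated by
// %array_functions(double, doubleArray) from carrays.i.
final class FakeCArrays {
    // Opaque handle, playing the role of SWIG's pointer wrapper type.
    static final class DoublePtr {
        private double[] storage;

        private DoublePtr(int n) {
            storage = new double[n];
        }
    }

    // Counts simulated JNI crossings; every helper call below is one crossing.
    static int crossings = 0;

    private FakeCArrays() {}

    static DoublePtr new_doubleArray(int nelements) {
        crossings++;
        return new DoublePtr(nelements);
    }

    static void delete_doubleArray(DoublePtr ary) {
        crossings++;
        ary.storage = null;
    }

    static double doubleArray_getitem(DoublePtr ary, int index) {
        crossings++;
        return ary.storage[index];
    }

    static void doubleArray_setitem(DoublePtr ary, int index, double value) {
        crossings++;
        ary.storage[index] = value;
    }
}

// Hypothetical helper: moves a parameter block between the JVM and the
// (simulated) C++-owned array, one element per call.
final class ParameterBlockCopy {
    private ParameterBlockCopy() {}

    // Copy a JVM array into the native block, e.g. to set the initial guess.
    static void copyIn(double[] src, FakeCArrays.DoublePtr block) {
        for (int i = 0; i < src.length; i++) {
            FakeCArrays.doubleArray_setitem(block, i, src[i]);
        }
    }

    // Copy the native block into a fresh JVM array. The length is tracked
    // JVM-side because a bare C pointer does not carry it.
    static double[] copyOut(FakeCArrays.DoublePtr block, int length) {
        double[] dst = new double[length];
        for (int i = 0; i < length; i++) {
            dst[i] = FakeCArrays.doubleArray_getitem(block, i);
        }
        return dst;
    }
}
```

Note that a block of n doubles costs n crossings per copy in each direction, which is the not-so-cheap part. It also means the JVM side has to remember each block's size itself.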

So our design will have:
  1. Parameter blocks: C++ owned, copied to JVM upon cost function evaluation and after a solver's run.
  2. Cost function data: JVM-owned.
  3. Jacobians: C++ owned.
That's it for today. With this decision we are ready to attempt our first Ceres wrap.
