Groovy 1.8 introduces two new closure functions: memoization and trampolining. Memoization is the automatic caching of closure results. Trampolining permits a form of declarative tail-call optimization. This article introduces the two concepts and demonstrates how to combine them in order to create cached, tail-recursive closures.
The example code from this article is available on github.
Creating a closure that caches the result of some calculation is as easy as appending .memoize() or one of the alternate memoize…(…) methods that allow for more fine-grained control over cache sizes to a closure declaration. One benefit of memoization includes the caching of results of long-running calculations that have no side effects. Memory leaks are a potential pitfall, which is why a maximum cache size should generally be preferred.
The specification below contains a closure with a side effect. This side effect happens just once, despite the closure being invoked twice.
Suppose we want to memoize the results of a recursive closure call. For example, we can unroll the call tree of this naive implementation of the Fibonacci function.
The call trace for the fourth Fibonacci number looks like this.
NB: The closure here is entered 9 times, but we can see that it only needs to be entered 5 times because 4 calls are repeated. If we cache the results of the Fibonacci calls, the complexity of even a naive implementation such as this will increase linearly, rather than exponentially.
Unfortunately, because in Groovy 1.8 a closure cannot invoke another closure directly, memoizing this closure is not entirely straightforward. The example below does not work.
The stack trace when a closure calls itself.
Since methods can invoke memoized closures, the solution is to invoke the call method on the closure.
NB: The un-memoized version enters the closure 177 times, but the memoized version enters just 11.
Declarative tail-call optimization is as simple as adding .trampoline() to a closure declaration and ensuring that recursive calls to the closure invoke the trampoline method on the closure instead of the closure itself. Some problems are more clearly solved with recursive solutions, but without automatic tail-call optimization in the JVM, recursion can quickly lead to an explosion in the size of the call stack. Trampolining is one work-around for this design tradeoff (or defect).
A tail-recursive fibonacci closure.
By tracing the call stack, we can see its linear growth without memoization.
In order to avoid a java.lang.StackOverflowError for sufficiently large inputs, the tail-recursive closure must be explicitly trampolined.
In this example, the trampolined closure is called 1001 times. If a method in Java were to call itself 1001 times, the stack would be overwhelmed. By trampolining, Groovy turns this recursion into iteration.
Memoizing a Trampolined Closure
Suppose we want to cache the results of a computationally expensive tail-recursive closure with no side effects. A trampolined closure that calls itself can easily be memoized in a separate closure declaration.
NB: This solution caches the top-level trampolined closure, not the results of the intermediate calls.
Trampolining a Memoized Closure
Suppose we want to cache the intermediate results of a trampolined function call. For example, in our trace above, suppose we want to cache the results of fib 4 and fib 3, 1, 1 and fib 2, 1, 2 and fib 1, 2, 3 and fib 0, 3, 5. Again, this situation is not straightforward because no closure can call a memoized closure directly, so in this case, the trampolined closure cannot call a memoized version of itself directly. Again, we can use the call function on the memoized closure.
Trampolining and memoization are two powerful new features in Groovy 1.8, but the combination of the two is not always straightforward. This tutorial has presented strategies for combining the two and for working around some of the limitations of their combination.