I made some minor changes to the details of expression evaluation a few days ago, and found that this broke the code for optimizing tail calls in recursive functions. Since that code has a nasty habit of breaking when I make small changes in this area, I decided that it was time for an overhaul.
The basic idea of tail call optimization is that when a function calls itself as the last thing that it does, you don't actually need to create a new stack frame, because the old one can be reused. The main benefit is that you can recurse indefinitely deep without blowing the stack. You can tell products that don't include this optimization because they typically run out of stack space after 500 recursive calls or so.
In the old design, Saxon wasn't actually reusing the original stack frame (it was just deleting the old stack frame before creating the new one). The tail-call was recognized and marked as such at compile time, and when it was executed, instead of calling the function, Saxon would create an object that encapsulated details of the function to be called, its parameters, and other bits of context. This would be passed down to the original caller of the recursive function, which would then execute the displaced function call, and carry on doing so until a "real" result was returned. The fragility of the mechanism came because there was a tendency either to intercept the function call package too early on its way down the stack, or to fail to intercept it when the time came for it to be caught by the original caller.
Another recognized problem with the approach was that although Saxon was conserving space on the Java stack, it wasn't conserving space on the heap. Each function call created a new XPathContext object, which was linked to the XPathContext of its caller, leading to a chain of XPathContext objects in the heap as long as the depth of recursion.
In the new design, the tail calls are still recognized and marked at compile time. In addition, if the function contains any tail calls, then its body is wrapped in a new TailCallLoop object. When the tail call is executed, the parameters of the function call are evaluated and written to the appropriate slots on the local stack frame. A marker is set in the XPathContext object, and the function call returns an empty sequence. This return value finds its way down to the TailCallLoop object, which executes the "real" body of the function repeatedly so long as it finds the marker in the XPathContext set to true.
The new design appears to save significantly on creation of objects during the tail-call cycle, in particular the context objects and the function call package itself. It also seems a lot more robust in that the number of places in the code that need to understand the mechanism is much reduced. There is one downside: the old mechanism handled tail calls to any function, not just directly-recursive tail calls. This was sometimes useful in situations where two functions are mutually recursive.
I think the new design can be extended to handle functions that are "almost tail-recursive", in that they perform a simple operation (typically a comparison, an addition, or a list append) when the tail call returns. An example of such a function is:
declare function product($seq) {
if (empty($seq)) then 1 else $seq[1]*product($seq[position()!=1])
}
(which can be useful in calculating compound interest)
There is a known technique to rewrite such functions as tail recursive functions by adding an extra parameter:
declare function product2($seq, $n) {
if (empty($seq)) then $n else product($seq[position()!=1], $n*$seq[1])
}
and this rewrite is now looking like a feasible proposition.
Functions of course work exactly the same way in XSLT and XQuery. In XSLT there are also named templates and match templates. Saxon implements tail-call optimization on these, using a similar but separate technique to the old design for functions (in fact, this is where it came from). It's harder to see how these can be redesigned in the same way. Named templates could perhaps be turned into functions, though the need to maintain full context information, tunnel parameters and the like makes this difficult. Match templates are more difficult still because it is impossible to tell statically whether a tail call will be recursive or not. In any case, the implementation for templates seems to be less fragile, perhaps because templates are always invoked in "push" mode. In push mode, instructions don't actually return a result, instead they write to the current output destination. This means that the Java return value can be used to pass information about tail calls down the stack without the kind of overloading that was taking place for function calls.
|
|
||||
|
Tail recursive functions
Comments
Re: Tail recursive functions
by
Dimitre Novatchev
on Wed 02 Aug 2006 02:37 BST | Permanent Link
I think it would be very useful to extend this approach to handle not only a function calling itself as the last thing done but also a function calling any other function as the last thing done.
The only change from a normal call is to provide as return address to the function we are calling our own return address (that is the address of the point after we were called by our caller). This, of course, solves also the problem of indirect (circular) recursion. Dimitre Novatchev Re: Re: Tail recursive functions
Thanks for the comment, Dimitre. Yes, I'll look into this. I need to think carefully about the implications for type-checking. It might just fall out, I'm not sure. (I think that if any run-time checking of the result of a function body is needed, then the function isn't tail-recursive by definition; so perhaps it's OK.) The only other difference is that the stack frame may need to be resized, and the stack frame map used by debuggers needs to be switched.
Re: Re: Tail recursive functions
Further thoughts on tail calls to a different function. It's easy enough to reuse the XPathContext object and the stackframe that it contains when calling a different function. But these objects are on the Java heap, which is rarely a cause of problems. It's not so easy to do a tail call to a different function without consuming Java stack space. Saxon was doing this by returning to the original caller of the first function with a request to call the second one, and this is where the design became fragile.
However, I think I can reduce the amount of Java stack space used by such calls (as well as reducing heap space) by returning all the way down to the outermost level of the function before doing the tail call. This is probably still worthwhile. Re: Tail recursive functions
by
Bas de Bakker
on Thu 03 Aug 2006 08:30 BST | Permanent Link
Note that the transformation you mention near the end turns evaluation from (x1 * (x2 * x3)) into ((x1 * x2) * x3). So it only works if the simple operation is associative. And multiplication and addition on computer floating point numbers is not associative, unlike the respective operations on real numbers.
Re: Re: Tail recursive functions
Yes: edge cases are always a pain when doing optimizations. The XQuery spec is a bit ambivalent about this: it gives a lot of license for rewrites to change the error behavior, but is a bit ambivalent about the effect on precision: the note in 3.5.1 about transitivity of value comparisons is essentially saying "we know there's a problem because value comparisons aren't precisely transitive [and therefore rewrites might change the result], and we're not telling you what to do about it". This one seems a similar case.
|
Search
Recent Comments
Recent Articles
Month Archive
|
|||