User:Hakerh400/JavaScript: Unlimited recursion and TCO
In this tutorial we explain how to achieve unlimited recursion in JavaScript.
Introduction
At the time of writing this, all popular JavaScript engines allow only about ten thousand recursive calls. This is not a big issue for most developers, but it prevents using JavaScript as a fully functional language and makes implementing loops using recursive calls intractable. The EcmaScript standard requires the stack trace to be kept in case an exception occurs in order to allow the stack trace to be restored, so JavaScript engines that want to follow the standard can't really do any optimization in this regard.
We decided to introduce an algorithm that uses generator functions in order to achieve unlimited recursion depth. We do not use any exploit or specification violation, because the state of a generator function is stored in the heap and we do not allocate any extra space on the stack, so this is technically not a recursion according to the EcmaScript standard.
Unlimited recursion can be achieved even in older JavaScript versions that do not support generator functions, but it requires simulating a stack using pure JavaScript objects and usually does not have a good JavaSCript syntax structure. In this article, we focus only on modern JavaScript engines.
Generator functions
In JavaScript, a generator function is a function that is able to produce some value before actually returning the final result. When a generator function produces a value, its state is copied into a generator object and removed from the stack. The generator function becomes paused, so we can save the generator object and continue its execution later. Except from being able to produce a result, a generator is also able to take a value as argument when it resumes. This allows us to communicate with a generator object without returning from the generator function or invoking a new function while the generator object's state is still on the stack.
Tail call optimization
When the return value of a function contains a single call to another function, there is no need to keep the current function on the stack. We can simply rewire the new call to the previous call and remove the current call from the stack.
Implementation
const O = { kTco: Symbol('tco'), tco(...args){ return [O.kTco, ...args]; }, rec(f, ...args){ const {kTco} = O; const makeStackFrame = (f, args) => { const func = typeof f === 'function' ? f : f[0][f[1]]; const gen = func === f ? func.apply(null, args) : func.apply(f[0], args) return [gen, null]; }; const stack = [makeStackFrame(f, args)]; while(1){ const frame = O.last(stack); const [gen, val] = frame; const result = gen.next(val); const {done, value} = result; if(done){ if(stack.length === 1) return value; stack.pop(); O.last(stack)[1] = value; continue; } if(value[0] === kTco){ stack.pop(); value.shift(); } stack.push(makeStackFrame(value[0], value.slice(1))); } }, last(arr, defaultVal=null){ if(arr.length === 0) return defaultVal; return arr[arr.length - 1]; }, };
Also, a minified version:
const O = {a:[],tco:(...a)=>[O.a,...a],rec:(a,...b)=>((c=(a,b,c=a[0],d=a.map?c[a[1]]:a,e=d.apply(c,b ))=>[e],d=[c(a,b)],e=a=>d[d.length-1],g,h,i=(a,b)=>new Set([a]).forEach((a,c,d)=>a(d.delete(a))&&b(d .add(a))),k,l)=>(i(a=>([g,h]=e(),{done:k,value:l}=g.next(h),!k|d.length>1),a=>k?e(d.pop())[1]=l:(l[0 ]==O.a&&l.shift(d.pop()),d.push(c(l[0],l.slice(1))))),l))()};
Examples
Example 1
Suppose we want to create a function that adds two numbers recursively (by "numbers" we mean non-negative integers). We can do it using a simple function:
const f = (x, y) => { if(x === 0) return y; return 1 + f(x - 1, y); }; console.log(f(123, 45));
It works for small integers (less that few thousands), but if x
is few millions, it triggers a stack overflow exception.
However, if we transform it into the following:
const f = function*(x, y){ if(x === 0) return y; return 1 + (yield [f, x - 1, y]); }; console.log(O.rec(f, 123, 45));
then it works for any number (up to JavaScript's max integer value, but for even larger integers we can use the BigInt type).
Basically, instead of direct call f(x - 1, y)
, we delegate the call to the O.rec
method using yield [f, x - 1, y]
syntax.
Example 2
In this example we also want to add two numbers, but this time we will use a bit different approach:
const f = function*(x, y){ if(x === 0) return y; return yield [f, x - 1, y + 1]; }; console.log(O.rec(f, 123, 45));
It gives the correct result, but it is unnecessarily slow and takes too much memory, because it keeps all generator objects until the very end. We can introduce the tail call optimization by transforming it into the following form:
const f = function*(x, y){ if(x === 0) return y; yield O.tco(f, x - 1, y + 1); }; console.log(O.rec(f, 123, 45));
Now it is probably the most efficient way to add two numbers recursively. This can be used in the implementation of some esoteric languages in which natural numbers are represented as linked list (empty list is zero, a single element is one, two elements are two, etc).
Example 3
In this example we want to stringify an expression. We define expression to be either an integer, or a pair of two expressions (for example (expr1, expr2)
, where expr1
and expr2
are some other expressions). We can define corresponding classes:
class Expression{} class Integer extends Expression{ constructor(value){ super(); this.value = value; } *stringify(){ return String(this.value); } } class Pair extends Expression{ constructor(left, right){ super(); this.left = left; this.right = right; } *stringify(){ const left = yield [[this.left, 'stringify']]; const right = yield [[this.right, 'stringify']]; return `(${left}, ${right})`; } }
Now we can construct a new expression (5, (7, 34))
and stringify it:
const expr = new Pair( new Integer(5), new Pair(new Integer(7), new Integer(34)) ); console.log(O.rec([expr, 'stringify']));
The key point to notice here is that methods that are bound to a specific object must be encapsulated into an array when passing to O.rec
or yield
. So, instead of yield [obj.method, ...args]
we use yield [[obj, 'method'], ...args]
. Alternatively, you can use yield [obj.method.bind(obj), ...args]
, but it is a bit slower.