Blindfolded Arithmetic
Blindfolded Arithmetic is an esoteric language that User:ais523 has published in 201804 on Code Golf Stack Exchange. It is similar to a language called Babbage's Analytical Engine that was published in 1992 for IOCCC. Blindfolded Arithmetic is Turingcomplete, in fact it can efficiently simulate an unrestricted state machine with two stacks.
Contents
Definition
Blindfolded Arithmetic operates on integers of unlimited size. The machine state has six registers each storing an integer, called a, b, c, d, e, i, plus a program counter.
Each instruction does a simple arithmetic operation, with two registers as input and one register as output:

v_0 = v_1 + v_2

v_0 = v_1  v_2

v_0 = v_1 * v_2

v_0 = v_1 / v_2
Division rounds the quotient towards zero. There are no flow control instructions. The machine executes the list of instructions cyclically forever, until it encounters a division instruction that tries to divide by zero. That instruction is not executed, and the content of the register i before that instruction is the output.
Execution takes one input argument, which must be a positive integer. At the start, the register i is initialized to the input, and the other five integer registers are initialized to 0.
Proof of Turingcompleteness
Blindfolded Arithmetic is Turingcomplete and can handle arbitrary computable functions from positive integers to positive integers.
To write a program to Blindfolded Arithmetic, write the program to a state machine with two stacks with two symbols (0 and 1) each. You can use arbitrary control flow and as many states as you wish. (Note that a Turingmachine with one tape and two symbols can be easily transformed to a state machine with two stacks: the first stack stores the part of the tape to the left of the head, and the second one stores the part of the tape under and to the right of the head.)
The stacks are padded with infinitely many zeroes. The first stack starts empty, but the second stack starts with the input minus one stored in it in binary, with the least significant bit on the stack top. You must end with the output minus one in the second stack in the same format. You are allowed to test not for the stack top, but also whether the second stack is empty, i.e., contains all zeroes. This format is unusual, but it doesn't restrict the power of the program, because you can start the program by transfering the input from the second stack to the first stack, converting it to a format with multiple symbols per bit, and with special combinations of symbols marking its ends, and do the opposite conversion at the end of the program to produce output.
General structure of compilation
We will now show a simple way to compile the state machine program to an Blindfolded Arithmetic program, using five of the six registers. Assign an integer identifier to each state of the state machine, with the restriction that the starting state is numbered 0. The register usage of the Blindflolded arithmetic will be the following.
 e is unused.
 d stores the first stack in binary, least significant bit is the stack top
 i1 stores the second stack in binary, least significant bit is the stack top. i must thus be positive.
 c usually stores 4*p, where p is the current state of state machine.
 a is a scratch register (accumulator) on which we do various arithmetic.
 b is a constant scratch register that usually contains a positive powers of two known at compile time, used to get around the difficulty that Blindfolded Arithmetic doesn't allow literals as operands; but can occasionally be used as a second scratch register.
We sometimes break these register conventions for short times within tight sequences of the program, but they must hold at the start of the program and between major steps.
An operation that we want to do very often in the Blindfolded Arithmetic program is to add an arbitrary constant integer B to a register x (one of a, c, d or i). We can do this easily any time when i is positive and we're allowed to clobber b. For this, write the magnitude of B in binary. Initialize b to 1 with the instruction b = i / i
. Then iterate each power of two from 1 to the value of the highest bit set in B. In each iteration, add the instruction x = x + b
if the bit is set in B and B is positive, x = x  b
if the bit is set in B and B is negative, or nothing if the bit is clear in abs(B), then change to the next power of two with the instruction b = b + b
. (It would be possible to only add 1 B times, but that would be inefficient.)
Another important operation is to decide if the program state counter p is equal to a particular state identifier P known at compile time. As precondition to this, we require that a is either 0 or 1, not necessarily known at compile time. Sometimes we always want to set a=1, which we do with the instruction a = i / i
. Other times we leave another 0 or 1 condition code in a. The output of this operation shall go to a, which becomes 1 if a was 1 at the input and P==p, but 0 in any other case. For this, first add 14*P to c, thus after the sequence, c=1+4*p4*P. Then a = a / c
. This can't be a division by zero, because c is now congrument to 1 modulo 4, and it resets a to 0 if P!=p. Finally restore the original value of c by adding 1+4*P to it.
Compiling individual steps
Now for each possible state P of the state machine, we have to emit instructions that do the intended effect on the simulated state counter and stacks if P==p, but leave the state of the simulated machine unchanged if P!=p. The instruction listing of the Blindfolded Arithmetic program has instructions for handling each state, and the instructions for one state have to be emitted together contiguously, but the order the states appear doesn't matter.
Almost all states transfer control to one or two different states, so let's see how to do that first.
If state P goes to state P', then we need to emit instructions that change c accordingly, but only if the current state p is P. So first set a to a condition code that is 1 if P=</i>p</i>, but 0 otherwise, in the way explained above. We then want to add 4*(P'P)*a to c. We can do this by iterating a through powers of two times the original condition code, doubling with a = a + a
, and adding or subtracting it to b according to the bits of 4*abs(P'P) with c = c + a
or c = c  a
. You have to emit these instructions for the control transfer immediately after any instructions that do the side effects.
Now consider a conditional branch. Suppose that the state P branches to P_0 if the condition is false, or to P_1 if the condition is true. To perform such a conditional branch, first check the condition in a way that it leaves 1 or 0 in a according to whether it is true or false. Then do the sequence that checks if the current state is P. Then add 4*(P_1P)*a to c, which branches to the landing pad if the condition is true. Finally emit jumps from P to P_1.
Now let's see the code for testing specific conditions.
 Checking if the second stack is empty is the simplest:
a = i / i; a = a / i
.  To check if the top element of the first stack is 1, do
b = i / i; b = b + b; a = d / b; a = a + a; a = d  a
. (If you want to avoid instructions that don't work in Babbage, writea = i / i; a = a + a; b = d / a; b = b + b; a = d  b
instead, even though that breaks our usual register usage conventions.)  Similarly, to check if the top element of the second stack is 0, do
b = i / i; b = b + b; a = i / b; a = a + a; a = d  a
.
All these sequences leave 1 in a if the condition is true, and 0 if it's false. All of them execute regardless of the state counter p, but this is not a problem since they have no side effects on the simulated machine.
Now for the states that do actions with side effects. The instructions for these all start by checking if the machine is executing that specific state, setting a to 1 if it is, and to 0 otherwise.
 To push 0 to the first stack, do
b = i / i; a = a + b; d = d * a
.  To push 1 to the first stack, do
b = i / i; a = a + b; d = d * a; a = a  b; d = d + a
.  To push 1 to the second stack, do
b = i / i; a = a + b; i = i * a
.  To push 1 to the second stack, do
b = i / i; a = a + b; i = i * a; a = a  b; i = i  a
.  To pop the first stack, do
b = i / i; a = a + b; d = d / a
.  To pop the second stack, do
i = i + a; b = i / i; a = a + b; i = i / a
.  To halt, assuming you've already put the correct output minus one to the second stack,
b = i / i; a = a  b; a = a / a
.
Four variables
It's probably possible to forego using one more variable if you accept somewhat more complicated Blindfolded Arithmetic code, but I haven't documented the details yet. TODO
Spare registers
When the above constructions leave registers unused, you can modify the constructions to use them to speed up the program. The easiest idea is to use a register as an extra stack. Three stacks let you efficiently copy, reverse, merge or sort vectors, any of which is slow with just two stacks. Alternately you can use a register as a pointer for an extra read head into existing stacks, which is still enough to speed up vector operations, but also easily lets you write an interpreter that reads code that is stored far away on a stack while the data is near the top of the stack.
Ais523's Turingcompleteness proof
We can prove this language Turingcomplete by translating Minsky machines into it. We'll use a formalization of Minsky machines with three counters (of which counter C contains the initial input), and the following set of commands, each of which is formed from a condition and an action:
 Conditions
 run unconditionally
 run if counter is 1; do nothing if the value is 2 or greater;
 undefined behaviour if the value is 0
 Actions
 increment counter
 decrement counter; decrementing 0 or 1 is undefined behaviour
 goto location X
 halt, outputting the value in counter C minus the value in counter A
It's fairly obvious that any Minsky machine can be written in this form (you need to start off by unconditionally incrementing counters A and B, then you simply treat a value of 1 as your "zero point" and never let the counters go lower). The limited conditionals can be made into arbitrary conditionals by conditionally running a goto command. Two counters is enough for Turingcompleteness, but for noninteractiveIOcompleteness (which is what the question wants), we need either three counters or a more complex I/O encoding. It's simplest to use three counters so that we don't have to pack and unpack the I/O encoding.
e
is used as the instruction pointer, pointing to the active command
in the Minsky Machine program. This is initially 0, so we need to
zeroindex our commands (i.e. the first command is IP 0). We use only
even numbers for instructions (with oddnumbered instruction pointers
conceptually pointing to implicit "noop" commands). Additionally, we
make it undefined behaviour to run off the end of the program (this is
not a problem as you could just put an unconditional goto there).
The counter values are stored in a
(A), b
(B), and i
(C).
Counter C is never allowed to reach zero (except temporarily for very
brief periods). c
is used as a flag to determine if the command
should run. d
is used as a temporary.
The basic idea is to write the implementation of each Minsky command
in sequence, where each command is implemented such that if e
has a
particular value, it will mimic the effect of that command, and if e
does not have that value, the implementation will do nothing. Each
command starts off the same way, by setting c
to 1 if e
has a
given value N and 0 otherwise:

d = i / i
(nowd
is 1)  Do
e = e  d
N times (nowe
is 0 iffe
= N) 
c = e + e
(nowc
is 0 iffe
= N and even) 
c = c + c
(nowc
is 0 iffe
= N and a multiple of 4) 
c = c + d
(nowc
is 1 iffe
= N, else ≥5 or ≤3) 
c = d / c
(nowc
is 1 iffe
= N, else 0)  Do
e = e + d
N times (restoring it to its original value)
Next, we determine if the condition matches. If the condition is "run
unconditionally", we have nothing to do. Otherwise, we want to set
c
to 0 (regardless of its previous value) if the counter in question
has a value greater than 1. The below is shown for counter A, but you
can consistently replace a
with b
or i
to get the implementation
of the other two nontrivial conditions:

d = d / a
(nowd
is 1 ifa
= 1, otherwise 0) 
c = c * d
(nowc
is 1 if bothe
= N anda
= 1, otherwise 0)
Most of the commands now have an obvious implementation. To increment
a counter (if the right command is running and the condition if any is
met), we add c
to a
, b
, or i
as appropriate. To decrement a
counter, we subtract c
from the relevant variable. To do a goto
command, we repeatedly do e = e  c
or e = e + c
to adjust the
instruction pointer by the required offset. None of these commands
will do anything if c
is 0, of course. (Note that for goto, we jump
to the noop just before the target command, so that we never end up
with two commands running in the same program iteration; this isn't
technically necessary but is the simplest implementation to prove
correct.)
To implement the fairly complex conditional halt, we do this:

d = i / i
(nowd
is 1) 
d = d  c
(nowd
is 0 if a halt is required, 1 otherwise) 
i = i  a
(nowi
is the value we'd want to output) 
d = d / d
(divisionby0 if a halt is required, otherwise noop) 
i = i + a
(restores the value ofi
)
Note that the complexity is required so that we can output the entire range of integers, not just the positive integers, as required by the problem specification and the choice of the integers as the output set.
At the end of the Blindfolded Arithmetic program, we need to move onto the next Minsky Machine command for the next iteration. That looks like this:

d = i / i
(nowd
is 1) 
e = e + d
Note that we're only adding 1 at a time to e
, in case we just ran a
goto instruction. This means that while we aren't doing gotos, only
every second iteration will do anything interesting. That isn't a
problem; the next command in the program will still run eventually,
and Turingcompleteness proofs don't care about performance.
Babbage's analytical engine
Sean Barrett described a computational model very similar to Blindfolded Arithmetic in 1992. I call this Babbage's analytic engine. He implemented a particular program in C to show off this computational model. His program uses machine integers instead of arbitrary size integers, the halting condition differs, and the second operand may be a literal. His program is also restricted to doing only instructions where the output operand is the same register as the first input operand. This restriction is significant only if you want to do instructions like v_0 = v_1  v_0
or v_0 = v_1 / v_0
where the two registers differ, and even those can be simulated with a clobbered spare register. The hint file claims that Babbage's actual analytic machine wouldn't have this restriction, it would allow operations on three arbitrary registers. Finally, the program has assignment instructions like v_0 = v_1
, but these could be readily written as v_0 = v_0  v_0; v_0 = v_0 + v_1
.