Branchless Algorithms
Branchless programming is the process of removing branches from code using only mathematical operators. This page will list multiple branchless algorithms as well as the operators needed for each. Normally branchless programming is used to speed up something by avoiding branching, but this page is not focused on performance. All code on this page will be written in Python.
All of these algorithms are total and so forth can be used in complex enough total programming languages. Most of these examples will consist of purely integer programming as it is more esoterically friendly, however some examples can use array and string operations if necessary.
Loops in Branchless Programming
In branchless programming, loops can often be "unrolled" that is to say, replaced with equivalent arithmetic expressions. This eliminates the branching part of looping constructs. Infinite loops can never be replaced with a total function but if you know the upper bound of a loop, and it is finite then it can always be replaced by a total function. For some of these longer algorithms a list of operators used will be shown.
Simple Example
Consider a basic loop:
a = 0 b = 10 for x in range(5): a += 2 b -= 3
Instead of iterating, we can compute the total change directly:
a = 0 b = 10 a += 2 * 5 <!-- 2 added 5 times --> b -= 3 * 5 <!-- 3 subtracted 5 times -->
This produces the same result without a loop.
Cumulative Example
Branchless programming becomes more powerful when the loop contains cumulative operations:
# Original loop a = 0 b = 1 for x in range(1, 6): a += x # cumulative sum b *= 2 # multiply by 2 each iteration
Unrolled using branchless arithmetic:
a = 0 b = 1 a += 5 * (5 + 1) // 2 # sum of first 5 numbers b *= 2 ** 5 # multiplying by 2 five times
This replaces the loop entirely with mathematical formulas, making the code fully branchless.
Summary
Loop unrolling in branchless programming works by analyzing the cumulative effect of each iteration and expressing it as arithmetic expressions. This technique can be applied to addition, subtraction, multiplication, powers, and more complex cumulative operations.
Algorithms
Boolean functions
This will show you how to do Boolean functions, to replace branching conditionals with branchless ones.
NOT (Negation)
Not !
If A is either 0 or 1, this operation flips its value: 0 becomes 1, and 1 becomes 0.
A = 1 - A # Flips the value of A
IF
If statements in branchless programming languages are best depicted as being like ternary operators.
variable = condition ? value_if_true : value_if_false;
Where the "value_if_false" is replaced with the original value of the variable so there is no change when the condition is ran.
Now you can't use that in branchless programming because that is using conditionals which is branching. So you can convert it into this instead
variable = ((condition==1)*(value_if_true))+((condition!=1)*(value_if_false))
AND
This will show two different versions of the AND &&
operator. One used for branchless conditionals and one used for simply getting the AND of a 2 Boolean integer values.
For the if statement we will use condition_a
and condition_b
it looks like
variable = (((condition_a+condition_b)==2)*(value_if_true))+(((condition_a+condition_b)!=2)*(value_if_false))
Now to simply get the AND value of 2 conditions you can just simplify it down to.
AND = ((condition_a+condition_b)==2)
OR
Now to finally demonstrate OR ||
it will follow the same format as showcasing AND.
The OR branchless conditional statement looks like this
variable = (((condition_a+condition_b)!=0)*(value_if_true))+(((condition_a+condition_b)==0)*(value_if_false))
Now to simply get the OR value of 2 conditions you can preform this.
OR = ((condition_a+condition_b)!=0)
Replacement of the ==
operator
Well what if you don't have ==
or !=
. That is fine because you can recreate the == operator using simple math operators. The operators needed are *
-
//
+
.
The algorithm goes like this
diff = a - b is_equal = 1 - (2 * diff * diff) // (diff * diff + 1)
and to get !=
you can simply drop the 1 -
at the start.
diff = a - b is_not_equal = (2 * diff * diff) // (diff * diff + 1)
Print integer as a series of ASCII numbers
+ | - | * | / | // | % | & | | | ^ | ~ | << | >> | && | || | ! | == | != | < | > | <= | >= | min | max | abs |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
This program is useful in programming languages that allow you to print integer values as ASCII characters but does not directly allow for integer printing. This program can print up to 5 digits of a number. N is the number being printed
n = 1542 # precompute divisions n1 = n // 10 n2 = n1 // 10 n3 = n2 // 10 n4 = n3 // 10 # digits to ASCII a = (n4 % 10) + 48 b = (n3 % 10) + 48 c = (n2 % 10) + 48 d = (n1 % 10) + 48 e = (n % 10) + 48 # length detection L = (n != 0) + (n1 != 0) + (n2 != 0) + (n3 != 0) + (n4 != 0) # mask leading zeros a = a*((L == 5)) b = b*((L == 5) + (L == 4)) c = c*((L == 5) + (L == 4) + (L == 3)) d = d*((L == 5) + (L == 4) + (L == 3) + (L == 2)) print(chr(a) + chr(b) + chr(c) + chr(d) + chr(e))
Another form of branchless programming (Base Manipulation)
Another form of branchless programming is direct manipulation of values in order to gain an output. This is most easily done in binary but isn't restricted to any base. This form of branchless programming is called Base Manipulation.
For binary we can use a new assembly instructions CPB
(Copy Bit.) and STB
(Set Bit.)
Copy bit copies a bit from one memory address and puts it into another bit in memory somewhere else.
Set bit takes a input or a constant value in the form of a single bit and puts it into memory.
ALU instructions are also allowed.
Another attempt to explain this form of branchless programming can be seen at the page Bunnicula.
AND
This will show you how to make an AND instruction with a branchless architecture.
# Vars A = Input A (0 or 1) B = Input B (0 or 1) # Memory C: 0000 D: 0000 STB C 0 A // Set LSB (ADDR A0) in A to C STB C 1 B // Set ADDR A1 to C INC C // Increment A CPB C 3 D 0 // Copy 3rd bit in A and store it into the LSB of B
A | B | STB C 0 A | STB C 1 B | INC C | CPB C 3 D 0 |
---|---|---|---|---|---|
0 | 0 | C:0000 D:0000 |
C:0000 D:0000 |
C:0000 D:0000 |
C:0000 D:0000
|
0 | 1 | C:0000 D:0000 |
C:0010 D:0000 |
C:0010 D:0000 |
C:0010 D:0000
|
1 | 0 | C:0001 D:0000 |
C:0001 D:0000 |
C:0011 D:0000 |
C:0011 D:0000
|
1 | 1 | C:0001 D:0000 |
C:0011 D:0000 |
C:0100 D:0000 |
C:0100 D:0001 <
|
This works because when you store A and B in two bits of C and then increment C, a carry occurs into the next bit only if both A and B are 1. That carry is then copied to D using CPB, which produces the same result as an AND gate. It is a branchless way to compute A AND B
using simple bit manipulation. Now, this programming heuristic can be extended to more complex total functions. However extending it to more complex problems requires either a smart devoted esoteric programmer, a detailed guide of tips and tricks (this has not been written yet), or a evolutionary/searching program to search for branchless ways to compute functions (easiest option).
Multiplying trick
Okay cool, lets say you have 0001
or alternatively 0000
in memory now. That represents 0 and 1 in base 10. As you know
0*x = 0 1*x = x
So you can use that like a ternary operation where 1 represents true and 0 represents false. This is one way to branchlessly set values based on a condition. This is most similar in terms of the originally stated form of branchless programming to this example below.
A = 5 B = 6 C = A == B Value = 5 D = C * Value print(D) # If A = B then this will be 5, otherwise it is 0.