Higher Subleq
Higher Subleq (HSQ) is a typeless simplified C-like language by Oleg which compiles into Subleq.
Basics
Unlike C, Higher Subleq does not have preprocessor, does not support structs, bitfields, abstract declarations, and bit operations. It has only one underlying type int, but keywords int, char, void, and their pointer derivatives can be used as synonyms. A few additional features are taken from C++:
- local variables can be declared at any point inside the function, not only at the beginning of the block
- comments // are allowed
- any name used must be declared, including library functions
- for accepts C++ style declaration inside: for( int i=0; i<3; i++ )
The following C library functions are implemented:
- int putchar(int);
- int getchar();
- int puts(int);
- int printf(char*,...); // supporting only %%, %d, %c, and %s
The memory is organized in the following order: 1) User code; 2) Library code (e.g. printf) which is added automatically; 3) Internal assembly variables; 4) Stack. Stack can grow infinitely, since the memory is unbound from one side, which allows recursion. For example, below is a program calculating factorial:
int printf(char * s); int fact(int a) { if( a<2 ) return 1; return a*fact(a-1); } int main() { printf( "%d", fact(12) ); }
Multiplication, division, and modular division are implemented with O(log(n)) algorithm. Since it is expensive operation the code is replaced internally with a function call to a library function.
All library functions are written in Higher Subleq and embedded in the compiler. They are appended (if necessary) and compiled along with the user code.
Compilation
Consider a simple program
int main(){}
This program compiles into Subleq assembly
top:top top sqmain _main: dec sp; ?+11; sp ?+7; ?+6; sp ?+2; 0 ?+6; sp ?+2; bp 0 bp; sp bp sp; bp sp ?+8; sp ?+4; bp; 0 bp; inc sp ?+8; sp ?+4; ?+7; 0 ?+3; Z Z 0 sqmain: dec sp; ?+11; sp ?+7; ?+6; sp ?+2; 0 ?+6; sp ?+2; ?+2 0 ?+1; . ?+3; Z Z _main; inc sp Z Z (-1) . inc:-1 Z:0 dec:1 ax:0 bp:0 sp:-sp
The first line jumps the execution to sqmain. The purpose of this section is to call function main, and then halt (Z Z (-1)). The function main must be a real function because it is legal to call main inside the program. The return address is pushed into the stack, then execution jumps to the label _main. Inside the function stack pointer sp pushed into the stack, base pointer bp is set to sp, then the body of the function goes, which is empty and represented by an empty line. The last three lines in the function do 1) restore stack pointer; 2) restore bp from the stack; and 3) pop the address from the stack and jump. The last line defines internal variables which are used by the compiler for in-house work.
In another program
int a=1; int main() { int i; ++a; }
a few additional statements will change the result into
top:top top sqmain . _a:1 _main: dec sp; ?+11; sp ?+7; ?+6; sp ?+2; 0 ?+6; sp ?+2; bp 0 bp; sp bp dec sp inc _a sp; bp sp ?+8; sp ?+4; bp; 0 bp; inc sp ?+8; sp ?+4; ?+7; 0 ?+3; Z Z 0 sqmain: dec sp; ?+11; sp ?+7; ?+6; sp ?+2; 0 ?+6; sp ?+2; ?+2 0 ?+1; . ?+3; Z Z _main; inc sp Z Z (-1) . inc:-1 Z:0 dec:1 ax:0 bp:0 sp:-sp
Three additional lines appear: declaration of a global variable _a; reservation in the stack a memory cell for _i; and increment _a.
Some features
Declarations
int i=+2, j=&j-&i+1; int a='a', b="abc"; int i[2], j['F'-10*2]; int k[+2]; int i[] = "abc"; char *p = "abc";
goto
int k; void main() { goto a2; a1: __out 'a'; return; a2: k = a1; __out 'b'; goto k; }
Conditional
int *p,a,b,k; int putchar(); int f() { putchar(a); putchar(b); } int main() { k = 1; *( k? &a: &b) = 'a'; *(!k? &a: &b) = 'b'; f(); k ? putchar('c') : putchar('d'); !k? putchar('e') : putchar('f'); a = k ? 'g' : 'h'; b = !k? 'i' : 'j'; f(); ( k? a : b ) = 'k'; ( !k? a : b ) = 'l'; f(); p = &( k? a : b ); putchar(*p); p = &( !k? a : b ); putchar(p[0]); int i; (k? a : i ) = 'm'; (!k? a : i ) = 'n'; b = i; f(); }
Pointers
void putchar(int c); int r; void main() { int q,p="\0\nHello, World!\n"; (++p)++; while(*p) putchar(*p++); p--; r = q = p; while( *--p ) putchar(*p); while( *--r ) putchar(*r); q--; while( *q ) putchar(*q--); }
int k, p, q; void main() { __out ++(k=49); p = &k; ++*p; __out k; ++++k; __out k; q = &p; __out ++**q; ++ +k; __out k; -- +*p; __out k; }
int a[10], p; void main() { a[2] = &*50; p = &*&*&a[2]; __out **&p; }
void main() { __out *("xyz"+2); __out "abc"[1]; }
Argument list
int f(int a, int b, int c) { __out a; __out b; __out c; } int a=30; int g() { a = a+1; return a; } void main() { f(g(g())+g(g()),g(g())+g(g()),g(g())+g(g())); }
extern
extern int k; void main() { int x; { __out k; k = 51; int k=50; __out k; } __out k; } int k=49;
Function calls
void g(); void f(); int k; void main() { k=f; k(); k=g; k(); } void g(){__out 49;} void f(){__out 50;}
Labels
void main() { aaa:; { bbb:ccc:; bb2:{cc2:;} } }
Logical operations
int x=0, y=1; void main() { if( y && x ) __out 48; if( y && y && y ) __out 49; if( y || x ) __out 50; if( x || y || x ) __out 51; while(x); while(x<0); for(;x;); for(;x<0;); if(x) __out 52; if(x<0) __out 53; if(y) __out 54; if( !x ) __out 55; if( !(x<0) ) __out 56; if( !(x==0) ) __out 57; if( !(x&&x) ) __out 48; if( !(x||x) ) __out 49; }
Escape sequences
int a="abc", b[]="\'\"\?\\\a\b\f\n\r\t\v"; int c[]="\xax\x011\78\0171\0";
Keywords
extern int i; void main() { for(;;){ break; } while(0); goto lab; lab: for( i=50; i<55; i++ ) { if( i>52 ) break; else continue; i = __in; } __out i; return; } int i;
Operator __in inputs a character from the standard input and __out outputs (See Subleq for IO operations).
Examples
Hello, world!
This program prints "Hello, World!" 4 times
void putchar(int a); int printf(char * a); int puts(char * a); char *a = "Hello, World!\n"; void main() { puts(a); printf(a); for( int i=0; a[i]; i++ ) putchar(a[i]); while(*a) putchar(*a++); }
Library functions
printf function can be defined as
int puts(int s); int _putint(int i); int printf(int s) { for( int i=0; *s; s++ ) { if( *s != '%' ){ __out *s; continue; } ++s; if( *s == '%' ) __out '%'; else if( *s == 'c' ) __out *(&s-++i); else if( *s == 's' ) puts(*(&s-++i)); else if( *s == 'd' || *s == 'i' ) _putint(*(&s-++i)); } } int main() { printf("%% %c %s %d\n",'A',"hi",123); }
__out operator is a low level Suleq instruction to output a memory cell as ASCII character.
puts is simple
int puts(int s) { int p=s; while(*s) __out *s++; return s-p; };
_putint must be defined using recursive calls and division, for example
int _putintr(int x) { if( x>0 ) { int i,j; i = _divMod(x,10,&j); i = _putintr(i); __out 48+j; return i+1; } return 0; } int _putint(int x) { if( x<0 ) { __out 45; return _putintr(-x)+1; } if( x>0 ) return _putintr(x); __out 48; return 1; }
And the division can be implemented simply as
int _divMod(int a, int b, int j) { if( a<b ) { *j=a; return 0; } int b1=b, i=1, bp, ip; next: bp = b1; ip = i; b1 = b1 + b1; i=i + i; if( a<b1 ) return ip + _divMod(a-bp,b,j); goto next; }
Compiler
The compiler is written in standard C++. It does not depend on any ancillary libraries or external tools. Download hsq.cpp (archive).