~-Hash

The title of this article is incorrect because of technical limitations. The correct title is ~#.

~# (pronounced "tilde-hash") is a language modelled on Brainfuck with some major modifications, most notably a memory cell inside the pointer itself, which gives the pointer read/write capabilities. To keep the number of instructions down, I/O is mapped to reads and writes on cell 0. Programs must also be ended explicitly; otherwise they loop forever.

Overview

~# operates on a right-infinite tape with every cell initialised to 0. The pointer has a single-cell memory, also initialised to 0, and starts over cell 0, which handles I/O. Programs are written with the 12 characters ~#<>+-=[]{}!, but "+" and "-" are not valid instructions on their own: each must either be doubled (as "++" or "--") or followed by an "=" (as "+=" or "-="). Any "+", "-" or "=" that is not part of such a digraph is ignored. If execution steps past the final instruction, the program repeats from the beginning.
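The digraph rule can be sketched as a small tokenizer. This is only an illustration, not a reference implementation: the function name `tokenize` is made up here, and it assumes that digraphs are paired greedily from left to right and that the two characters of a digraph must be adjacent (so "+ +" with a space between is treated as two stray characters).

```python
def tokenize(source: str) -> list[str]:
    """Split ~# source into instructions (hypothetical helper).

    Assumption: digraphs are matched greedily, left to right, and only
    when their two characters are directly adjacent.
    """
    single = set("~#<>[]{}!")
    digraphs = {"++", "--", "+=", "-="}
    tokens = []
    i = 0
    while i < len(source):
        pair = source[i:i + 2]
        if pair in digraphs:
            tokens.append(pair)
            i += 2
        else:
            if source[i] in single:
                tokens.append(source[i])
            # Stray "+", "-", "=" and any other character are skipped.
            i += 1
    return tokens


print(tokenize("> ++ # +="))  # ['>', '++', '#', '+=']
print(tokenize("+ = - +++"))  # ['++']  (only the adjacent pair counts)
```

Under these assumptions "+++" yields one "++" followed by an ignored "+".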

Instruction set

~# has 13 instructions, including two sets of loop brackets, and four operations on the data stored in the pointer memory.

Instruction Description
~ Read the cell at the pointer into the memory
# Write the memory to the cell at the pointer
> Move the pointer one cell to the right
< Move the pointer one cell to the left
++ Increment the number in the memory
-- Decrement the number in the memory
+= Add the contents of the cell at the pointer to the memory
-= Subtract the contents of the cell at the pointer from the memory
[ Skip to the matching "]" if the cell at the pointer is zero
] Jump back to the matching "[" if the cell at the pointer is non-zero
{ Skip to the matching "}" if the memory is zero
} Jump back to the matching "{" if the memory is non-zero
! Terminate the program

Every other character is ignored, so ordinary text can be used as comments as long as it contains none of the instruction characters.
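To make the table concrete, here is a minimal interpreter sketch in Python. It is not an official implementation, and it fills several gaps with assumptions: only ~ and # on cell 0 perform I/O (as described in the I/O section), while +=, -=, [ and ] on cell 0 see the value 0; moving left from cell 0 is an error; exhausted input reads as 0; and a `max_steps` limit is added because ~# programs loop forever unless they reach "!". All function and parameter names are made up for this sketch.

```python
from collections import defaultdict


def tokenize(src):
    """Keep only valid instructions; digraphs are matched greedily (assumption)."""
    out, i = [], 0
    while i < len(src):
        if src[i:i + 2] in ("++", "--", "+=", "-="):
            out.append(src[i:i + 2])
            i += 2
        else:
            if src[i] in "~#<>[]{}!":
                out.append(src[i])
            i += 1
    return out


def match_brackets(prog):
    """Map each bracket position to its partner; [] and {} are tracked separately."""
    jump, open_square, open_curly = {}, [], []
    for pos, op in enumerate(prog):
        if op == "[":
            open_square.append(pos)
        elif op == "{":
            open_curly.append(pos)
        elif op in ("]", "}"):
            start = (open_square if op == "]" else open_curly).pop()
            jump[start], jump[pos] = pos, start
    return jump


def run(source, inputs=(), max_steps=100_000):
    """Run a ~# program and return its output (so far, if max_steps runs out)."""
    prog = tokenize(source)
    if not prog:
        return ""
    jump = match_brackets(prog)
    tape = defaultdict(int)      # right-infinite tape, every cell starts at 0
    mem = ptr = pc = 0
    pending, out = list(inputs), []
    for _ in range(max_steps):
        op = prog[pc]
        if op == "!":
            break
        elif op == "~":
            if ptr == 0:         # cell 0: input is the sum of code points
                mem = sum(map(ord, pending.pop(0))) if pending else 0
            else:
                mem = tape[ptr]
        elif op == "#":
            if ptr == 0:         # cell 0: output one character
                out.append(chr(mem))  # raises ValueError if mem is negative
            else:
                tape[ptr] = mem
        elif op == ">":
            ptr += 1
        elif op == "<":
            if ptr == 0:
                raise IndexError("moved left of cell 0 (assumed to be an error)")
            ptr -= 1
        elif op == "++":
            mem += 1
        elif op == "--":
            mem -= 1
        elif op == "+=":
            mem += tape[ptr]
        elif op == "-=":
            mem -= tape[ptr]
        elif op == "[" and tape[ptr] == 0:
            pc = jump[pc]
        elif op == "]" and tape[ptr] != 0:
            pc = jump[pc]
        elif op == "{" and mem == 0:
            pc = jump[pc]
        elif op == "}" and mem != 0:
            pc = jump[pc]
        pc = (pc + 1) % len(prog)  # stepping past the end wraps to the start
    return "".join(out)


print(run("~ # !", ["A"]))  # A
```

The tape is a `defaultdict` so that cells spring into existence at 0 on first use, which is one simple way to model an infinite tape.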

I/O

I/O in ~# is mapped to read/write on cell 0. Input can be any number of characters long: when the characters are converted to numbers for storage, all their values are added together. Thus entering "Hello World!" (without the quote marks) inputs the value 1085, which could also be obtained from the single character н, since ~# uses Unicode rather than ASCII. Output is always a single Unicode character.
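The input rule is easy to check in Python, whose `ord` and `chr` work on Unicode code points. The helper name `input_value` is made up for this illustration.

```python
def input_value(text: str) -> int:
    """Sum of the Unicode code points of the input, as ~# would store it."""
    return sum(ord(ch) for ch in text)


print(input_value("Hello World!"))  # 1085
print(chr(1085))                    # н (U+043D)
```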

Sample Programs

Cat

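A cat program in ~# is very simple: it is the language's name itself, ~#. The ~ reads input from cell 0 into the memory, the # writes the memory back to cell 0 as output, and the program then repeats from the beginning.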

Copying

Because of the memory in the pointer, copying is very simple to implement, unlike in Brainfuck.

  ...~>#...

If this is the entire program, however, it sets the entire infinite tape to the value the user inputs, because the program repeats from the beginning after each pass while the pointer keeps moving right.

Hello World!

"Hello World!" in ~# can be made fully loopless because of the += instruction, which can be used for doubling and hard-coded multiplications. Both applications are used extensively in this program - doubling in setting the Unicode values for the letters "H", "e", and the space, and hard-coded multiplication in adjusting the values to obtain the other letters.

 > ++ ++ ++ ++ # ; Place 4 in c1
 > # += ++ # += # += # += # -= ; Calculate 72 in c2
 > ++ ++ ++ # += # += # += ++ # += # += ++ # -= ; Calculate 101 in c3
 > ++ ++ ++ ++ # += # += # += # -= ; Calculate 32 in c4
 << ~ << # ; Write H
 >>> ~ <<< # ; Write e
 > += += -- < # # ; Write ll
 ++ ++ ++ # >>> # ; Write o and store it in c3
 > ~ <<<< # ; Write a space
 >> ~ < += += += += -- < # ; Write W
 >>> ~ <<< # ; Write o
 ++ ++ ++ # ; Write r
 > -= -- -- < # ; Write l
 > -= -= < # ; Write d
 >>>> ~ <<<< ++ # ; Write an exclamation mark
 ! ; End
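The arithmetic in the program above can be checked with a small single-cell trace. The key idiom is "# +=": # copies the memory into the cell, and += adds the cell back, doubling the memory. The helper `trace` is a made-up name for this sketch; it models only the memory and one cell, with no pointer movement or I/O.

```python
def trace(ops: str, mem: int = 0, cell: int = 0) -> tuple[int, int]:
    """Apply space-separated ~# arithmetic ops to one cell; return (mem, cell)."""
    for op in ops.split():
        if op == "#":
            cell = mem       # write the memory to the cell
        elif op == "++":
            mem += 1
        elif op == "--":
            mem -= 1
        elif op == "+=":
            mem += cell      # right after "#", this doubles the memory
        elif op == "-=":
            mem -= cell      # right after "#", this clears the memory
    return mem, cell


# c2: start with 4 in the memory, end with 72 in the cell and 0 in the memory.
print(trace("# += ++ # += # += # += # -=", mem=4))              # (0, 72)
# c3: 3 -> 6 -> 12 -> 24 -> 25 -> 50 -> 100 -> 101
print(trace("++ ++ ++ # += # += # += ++ # += # += ++ # -="))    # (0, 101)
# c4: 4 -> 8 -> 16 -> 32
print(trace("++ ++ ++ ++ # += # += # += # -="))                 # (0, 32)
# "W": 72 plus four times the 4 stored in c1, minus 1.
print(trace("+= += += += --", mem=72, cell=4))                  # (87, 4)
```

The last line shows the hard-coded multiplication: repeating += against a cell that holds 4 adds a multiple of 4 without any loop.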

Compiled code

Compiled ~# is composed of 8-bit Unicode characters with no obvious connection to the source code, owing to the way ~# compiles. First, each instruction is converted into a binary number according to a coding I haven't worked out yet, designed to make the more frequent instructions shorter than the less frequent ones. The numbers are then concatenated and divided into bytes. If there aren't enough bits to fill the last byte, trailing "0"s are added. Finally, each byte is converted into a Unicode character. The net effect is that any given instruction is quite likely to be split across several characters, or embedded in the middle of one.
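The packing steps can be sketched as follows. Since the real coding has not been worked out, the table `HYPOTHETICAL_CODES` below is entirely invented for illustration: it is just one prefix-free code that gives some instructions shorter bit strings than others. The sketch also assumes that "8-bit Unicode characters" means each byte is mapped to the code point with the same number (0 to 255).

```python
# Invented prefix-free code; NOT the actual ~# coding, which is undefined.
HYPOTHETICAL_CODES = {
    "#": "00", "~": "010", ">": "011", "<": "100",
    "++": "1010", "+=": "1011", "--": "1100", "-=": "1101",
    "[": "11100", "]": "11101", "{": "11110",
    "}": "111110", "!": "111111",
}


def compile_tokens(tokens: list[str]) -> str:
    """Concatenate the codes, pad with trailing zeros, emit one char per byte."""
    bits = "".join(HYPOTHETICAL_CODES[t] for t in tokens)
    bits += "0" * (-len(bits) % 8)           # fill the last byte with "0"s
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))


# "~ # !" -> 010 00 111111 -> 01000111 11100000 -> bytes 71 and 224
print(compile_tokens(["~", "#", "!"]))  # Gà
```

Even in this tiny example the "!" instruction straddles the byte boundary, which is the effect described above.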