~-~!
Paradigm(s) | declarative |
---|---|
Designed by | user:jfb and user:tb10 |
Appeared in | 2014 |
Memory system | numerical |
Dimensions | one-dimensional |
Computational class | Turing complete |
Major implementations | None so far |
File extension(s) | .ncmnt |
~-~!, or No Comment, is an esoteric programming language created by user:jfb and user:tb10 in January, 2014 based on the philosophy that everything is a number.
Numbers are tildes, - is subtraction, and ! is comment syntax, so ~-~ is 0 and therefore the language is called 'No Comment'. Files containing ~-~! scripts have the extension .ncmnt.
Arithmetic
Tildes (~) are numbers: ~ is one, ~~ is two, and so on. + adds, - subtracts, / divides (integer division, truncating toward 0), ; is modulo (the remainder on division), and , multiplies. Negative numbers must be written as a subtraction, e.g. ~-~~ is negative one. Zero is %. This is library syntax, explained later. Division by % isn't an error; it yields infinity, written 8. %,8 is defined to be %. All the operators are left associative. Precedence is AMDAS: Angle brackets <...> (everything inside is evaluated first), then Multiplication and Division (and modulo), then Addition and Subtraction. This is similar to PEMDAS, except that there are no parentheses and no exponentiation. Numbers are unbounded: implementations must provide arbitrary-precision bignums, as in Scheme.
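As no interpreter exists, the arithmetic rules above can be tried out with a minimal Python sketch. It is not a reference implementation: the names `evaluate`, `apply` and `INF` are made up for illustration, % and 8 are treated as built in rather than imported, and the sign of a modulo result is an assumption (it follows the truncating division). Any use of infinity other than x/% and %,8 is rejected, because the text does not define it.

```python
import math
import re

INF = math.inf  # stands in for 8, the infinity symbol of the standard library

TOKEN = re.compile(r"~+|[%8+\-,/;<>]")


def apply(op, a, b):
    """Apply one binary operator to two already-evaluated operands."""
    if op == "," and 0 in (a, b):
        return 0  # covers %,8, which the text defines to be %
    if INF in (a, b):
        raise ValueError("arithmetic on infinity is not specified beyond %,8")
    if op == "+":
        return a + b
    if op == "-":
        return a - b
    if op == ",":
        return a * b
    if op == "/":
        if b == 0:
            return INF  # division by zero is infinity, not an error
        quotient = abs(a) // abs(b)  # truncate toward zero
        return quotient if (a < 0) == (b < 0) else -quotient
    if b == 0:
        raise ValueError("modulo by zero is not specified")
    return a - b * apply("/", a, b)  # assumed: remainder of truncating division


def evaluate(source):
    """Evaluate an expression built from ~, %, 8, + - , / ; and <...>."""
    text = "".join(source.split())
    tokens = TOKEN.findall(text)
    if "".join(tokens) != text:
        raise SyntaxError("unexpected character")
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        pos += 1
        return tokens[pos - 1]

    def atom():
        tok = take()
        if tok[0] == "~":
            return len(tok)  # a run of n tildes is the number n
        if tok == "%":
            return 0
        if tok == "8":
            return INF
        if tok == "<":
            value = expr()
            if take() != ">":
                raise SyntaxError("expected >")
            return value
        raise SyntaxError("unexpected token " + tok)

    def term():  # multiplication, division, modulo: left associative
        value = atom()
        while peek() in (",", "/", ";"):
            op = take()
            value = apply(op, value, atom())
        return value

    def expr():  # addition and subtraction: left associative, bind loosest
        value = term()
        while peek() in ("+", "-"):
            op = take()
            value = apply(op, value, term())
        return value

    result = expr()
    if pos != len(tokens):
        raise SyntaxError("trailing input")
    return result
```

With this sketch, `evaluate("~-~")` gives 0, `evaluate("~-~~")` gives -1 and `evaluate("~/%")` gives `INF`.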
Comments
(code) !|this is ignored|
is a comment: the code is evaluated and its value returned, and the part between the pipes is completely ignored. Additionally, the first line of a file is ignored if it begins with #!, which allows ~-~! scripts to be run from the command line. The part after the ! can actually be any value; |...| is just the usual choice because of its meaning, which is explained later. Comment is actually its own operator, with the highest precedence below brackets.
Comparison
==
is comparison: it returns ~ (1) if its operands are equal and % (0) if they aren't. === is a synonym for ==, as are ====, =====, and so on. == has lower precedence than addition and subtraction.
We should note that, as everything is a number, these symbols are their own ASCII values when used in a number-requiring context, with any number of ==s being the ASCII value of '='.
Variables
A variable is a number of apostrophes ('), such that ' and '' and ''' are separate variables. = assigns: '=~ sets ' to one. Almost ANYTHING can be assigned; even ~ = ~~ is legal. From that point on, every tilde has the effect that two tildes would have had before, and unless some variable already holds an odd value, you have no way to get odd numbers back.
= = +
will change the assignment operator to addition, and comparison to ++. Of course then you can't assign anything else! ' = = puts you in DEEP trouble - it will turn any variable from that point on into a comparison with no way to get it back! Statements should be separated by :. When there is a comment, put the : AFTER it. The precedence of = is lower than that of comparison.
: is the lowest-precedence operator: it evaluates its first operand, ignores the result, and returns its second. This is like ; in other languages, as it separates statements. However, it can't appear at the end of a program, because then it would have only one operand, which is an error.
IO
@(something)
prints the Unicode (UTF-8) character corresponding to the input number to the output stream. If it is 8 (infinity, not ~~~~,~~), the program hangs and must be killed with Ctrl+C. If the number is negative or not a valid Unicode character, the program will signal an error. If it is an incomplete character, it is buffered until completed. Its precedence is higher than that of multiplication.
^
will input a Unicode character from the input stream. On end of file, ^ returns -1 (%-~). It is a special symbol that won't (always) return its own ASCII value; nevertheless, it can still be assigned.
We now know enough to write a hello world program:
#!/usr/bin/env ncmnt
~!|An interpreter does not exist yet, but this script assumes its filename will be ncmnt|:
?|..| !|Import the standard library - explained later|:
@~~,~~~~,~~~,~~~ !|2*4*3*3 = 72 = H|:
@~~~~~~~~~~,~~~~~~~~~~+~ !|101 = e|:
@~~~~,~~~,~~~,~~~ !|108 = l|:
@~~~~,~~~,~~~,~~~:
@~+~~,~~~~~+~~~~,~~~~~,~~~~~ !|111 = o|:
@~~~~,~~~~~~~~~~~ !|44 = ,|:
@~~~~,~~~~~~~~ !|32 = spaaace|:
@~,~~,~~~,~~~~,~~~~~-~ !|5!-1 = 119 = w|:
@~+~~,~~~~~+~~~~,~~~~~,~~~~~ !|o again|:
@~~~~~,<~+~~,~~~~~~~~~~~>-~ !|114 = r|:
@~~~~,~~~,~~~,~~~ !|l again|:
@~~~~~~~~~~,~~~~~~~~~~ !|100 = d|:
@~~~,~~~~~~~~~~~ !|33 = !|:
@~~~~~~~~~~ !|10 = newline|
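The character codes claimed in the comments of the program above can be checked with ordinary arithmetic. The following Python sketch only re-computes the products and sums named in those comments; it does not parse ~-~! and says nothing about how an interpreter would resolve the precedence of @.

```python
# Each entry repeats the arithmetic from one comment of the hello world program.
codes = [
    2 * 4 * 3 * 3,          # H
    10 * 10 + 1,            # e
    4 * 3 * 3 * 3,          # l
    4 * 3 * 3 * 3,          # l
    1 + 2 * 5 + 4 * 5 * 5,  # o
    4 * 11,                 # ,
    4 * 8,                  # space
    1 * 2 * 3 * 4 * 5 - 1,  # w
    1 + 2 * 5 + 4 * 5 * 5,  # o
    5 * (1 + 2 * 11) - 1,   # r
    4 * 3 * 3 * 3,          # l
    10 * 10,                # d
    3 * 11,                 # !
    10,                     # newline
]
message = "".join(chr(code) for code in codes)
print(message, end="")
```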
Conditionals
(condition)[(then)](else)
is a special form. If the condition is ~, the (then) part is evaluated and returned; if it is %, the (else) part is. It associates left to right; that is, ...[...]...[...]... acts like a multi-way conditional, similar to Lisp's and Scheme's cond. If the condition is anything other than ~ or %, the interpreter will signal an error. Unlike other symbols, [...] is a true special form and can't be assigned. The same goes for <...>.
Strings
|...|
is a special form that takes the string between the pipes and interprets its binary representation, that is, its bytes strung together, as a number (EVERYTHING is a number!). For example, |hi| becomes hex 6869 = decimal 26729. Integers are unbounded, so strings can be as long as you want. The UTF-8 character encoding should be used. Pipes can't nest, so there's no shortcut for getting the | character other than calculating 124. With a single character, the form is very useful, as it allows the hello world program to be greatly simplified:
#!/usr/bin/env ncmnt
@|H|:@|e|:@|l|:@|l|:@|o|:@|,|:@| |:@|w|:@|o|:@|r|:@|l|:@|d|:@|!|
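The mapping between strings and numbers described above is the big-endian reading of the UTF-8 bytes. A small Python sketch, with the hypothetical helper names `string_to_number` and `number_to_string`, shows both directions:

```python
def string_to_number(text: str) -> int:
    """Read the UTF-8 bytes of text as one big-endian integer, like |...|."""
    return int.from_bytes(text.encode("utf-8"), "big")


def number_to_string(number: int) -> str:
    """Inverse direction: split a non-negative integer back into UTF-8 bytes."""
    length = (number.bit_length() + 7) // 8
    return number.to_bytes(length, "big").decode("utf-8")


print(string_to_number("hi"))   # the |hi| example from the text
print(number_to_string(26729))
```

One consequence of this encoding is that leading NUL bytes vanish in a round trip, since they do not change the value of the number.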
Functions
The & operator is for function application. Its left operand is interpreted as a string (in the reverse of the way strings are interpreted as numbers above), which is then executed as a ncmnt program. That subprogram can access (and modify) all the same variables as the main one, and it also has access to an additional symbol * whose value is the right operand of &. & has higher precedence than @ (which is a unary operator) and associates to the left. The value of & is the value of the last statement in the subprogram, which should NOT have a : after it. This allows us to write functions with recursion, for example:
'=|*==%[%]~~+'&<*-~>|
allows ' to be used as a multiply-by-two function, in that '&~~~ is equivalent to ~~~~~~. I think , (multiply) can be written as a library form this way, but I'm not sure because it might require string concatenation (defined shortly) which relies on multiplication.
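The recursion in the doubling function may be easier to follow in a conventional language. This Python sketch mirrors its structure; the name `double` is only illustrative, and Python's recursion limit is a restriction that ~-~! itself does not state.

```python
def double(n: int) -> int:
    """Mirror of *==%[%]~~+'&<*-~> : zero gives zero, otherwise 2 + double(n - 1)."""
    return 0 if n == 0 else 2 + double(n - 1)


print(double(3))
```

As in the original, a negative argument never reaches the base case.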
We can write a simple cat program with functions:
#!/usr/bin/env ncmnt
?|..|:
'=|''=^: ''==%-~[%]<@'': '&%>|:
'&%
Functions of two or more arguments are achieved through currying; in that way ~-~! is highly similar to other 'pure' functional languages. The pipe form can be viewed as constructing the Gödel numbering for a program, and the & operator as executing it. Here is a string concatenation function, because a lot of curried functions are likely to need it:
' = |*==%[%]~+'&<*/<~~~~,~~~~,~~~~,~~~~>>| !|number of bytes in a number (log256)|:
'' = |*==%[~]~~,~~,~~,~~,~~,~~,~~,~~,<''&<*-~>>| !|256^x|:
''' = ~~~~,<~+~~~~~,~~~~~~> !|pipe character|:
'''' = ''&~, |''&<'&*> , <| + ''' !|auxiliary string used in definition of cat|:
''''' = ''&~~~ ,''' + |>+*| !|another aux|:
'''''' = |''& <'&''''' + '&*> , '''' +''& <'&''''' > , * + '''''| !|string concatenation|:
With those definitions, ''''''&|a|&|b| should equal |ab|, but this is untested due to the lack of an interpreter. The function will fail, however, if its first argument contains the pipe character. It is possible to design a better algorithm that escapes its argument; I might define one later when I have time.
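The arithmetic behind the concatenation can be sketched in Python. This models only the intended result (shift the first string left by the byte length of the second, then add), not the way the ~-~! version builds a new program string; `encode`, `byte_length`, `pow256` and `concat` are hypothetical names.

```python
def encode(text: str) -> int:
    """The |...| form: UTF-8 bytes read as one big-endian integer."""
    return int.from_bytes(text.encode("utf-8"), "big")


def byte_length(n: int) -> int:
    """Number of bytes in n, computed recursively as in the log256 helper."""
    return 0 if n == 0 else 1 + byte_length(n // 256)


def pow256(x: int) -> int:
    """256 to the power x, computed recursively as in the 256^x helper."""
    return 1 if x == 0 else 256 * pow256(x - 1)


def concat(a: int):
    """Curried concatenation: concat(a)(b) is the number for a followed by b."""
    return lambda b: a * pow256(byte_length(b)) + b


print(concat(encode("a"))(encode("b")) == encode("ab"))
```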
Libraries
A library is stored in a file that starts with 'lib'. To import a library from a program, the ? operator is used. The standard library is called .., so to import the standard library you write
?|..|:
This will look for a file named lib...ncmnt and load it. If a library could not be found, it is an error. A library can export symbols using the $ operator. Its left operand must be a string telling it what to export, and the right a value. The first (UTF8) character of the string is the symbol to export. If it is the only character, it will be exported as an ordinary value. This is how % and 8 are exported from the standard library: in fact, it is simply:
|%|$~-~: |8|$~/<~-~>:
More things might be added to the standard library later. It is an error to export a digit as its logical value: |1|$~ will be an error. This is why %, not 0, represents zero. It is also an error to export the symbol ∞ as infinity, so 8 is used instead.
The $ operator will return the value that was exported.
If the second character of the string is <, it will be exported as a unary prefix operator. If it is >, it is a unary postfix operator. For example, if the following code was in a library:
|l<|$|*+~|: |r>|$|*+~|:
that was loaded, then the following expressions are equivalent:
l~
~r
|*+~|&~
~+~
~~
If the second character in an export string is | (which is tricky, because you have to calculate it; you can't just quote it), the exported function will be a binary operator, which is left associative (as all binary operators are). It is treated as a curried function, which takes the left argument and returns a function which will take the right argument and return the desired result. So if b is exported as a binary operator, then |a| b |c| is equivalent to (b) & |a| & |c|, where (b) is the value that b was exported with.
The rest of the string is again interpreted as a number, denoting the precedence (for both unary AND binary operators). The precedences of the built-in operators are listed below:
- + and - have a precedence of %-~
- , and / and ; have precedence %
- & has precedence ~~
- !, ?, and $ have precedence 8 (infinity)
- @ and # have precedence ~
- == has precedence %-~~ (Special because it's the only operator with 2 characters)
- = has precedence %-~~~
- : has precedence %-8 (-infinity)
- ...[...]... (Ternary operator) has precedence %-~~~~,~~~~,~~~~,~~~~ (-256)
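The layout of an export string (symbol, optional kind marker, then precedence) can be illustrated with a Python sketch. The function `parse_export` and its return shape are hypothetical, and two points are assumptions that the text leaves open: an empty remainder is read as precedence 0, and any other second character is rejected.

```python
def parse_export(export_string: int):
    """Split the left operand of $ into (symbol, kind, precedence)."""
    length = (export_string.bit_length() + 7) // 8
    text = export_string.to_bytes(length, "big").decode("utf-8")
    if not text:
        raise ValueError("empty export string")
    symbol, rest = text[0], text[1:]
    if not rest:
        return symbol, "value", None  # a lone character exports an ordinary value
    kinds = {"<": "prefix", ">": "postfix", "|": "binary"}
    if rest[0] not in kinds:
        raise ValueError("unknown export kind (assumed to be an error)")
    # The remaining bytes are read as a number again, giving the precedence.
    precedence = int.from_bytes(rest[1:].encode("utf-8"), "big")
    return symbol, kinds[rest[0]], precedence


# Export strings built the way |...| would build them.
print(parse_export(int.from_bytes(b"%", "big")))
print(parse_export(int.from_bytes(b"l<", "big")))
```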
The symbols exported by $ are added to a table, and after the library has finished loading, the symbols saved on the table to be exported are assigned their proper values.
One of the problems with exporting a function from a library is that it will reference variables which may conflict with those in the main program. The solution is #, the library index operator, a unary postfix operator. Every library needs its own variables and its own set of symbols imported from other libraries, so whenever (something)# is run, the library index (a special, spherical variable) is set to the argument and its previous value is returned. The set of ' variables that were assigned the last time the library index had this value is then loaded; if the library index has never had this value before, they are all empty. Every time a library is loaded by ?, including recursive loads, the library index is assigned a new value it has never had before, thus clearing all variables (but storing them under the old library index), and the library is loaded with this new library index. When the library finishes loading, the index is set back to the value it had before the load, and the exported symbols are added to the symbol table of this old library index, allowing them to be used in the code that imported the library.
When a program starts, the library index is %.
The library index is set as soon as the # function returns, so '=~# would set the ' variable within library index 1 to whatever the old library index was before it was set to 1.
It is then the responsibility of the library to set the index to what it was when the library was being loaded whenever an exported function is called that relies on variables defined in the library. This could be done, for example, by wrapping it in | characters, and concatenating that with the # operator to set it.
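The behaviour of the library index can be modelled as a dictionary of variable sets keyed by index. The Python sketch below is one possible reading, not a specification: the class `LibraryState` and its methods are invented for illustration, only the ' variables are modelled (not imported symbols), and picking the smallest unused number as the fresh index is an assumption, since the text only requires a value never used before.

```python
class LibraryState:
    """Toy model of the library index and the per-index variable sets."""

    def __init__(self):
        self.index = 0         # the library index is % when a program starts
        self.stores = {0: {}}  # library index -> that index's ' variables

    @property
    def variables(self):
        """The variable set belonging to the current library index."""
        return self.stores[self.index]

    def set_index(self, new_index):
        """Model of (something)#: switch variable sets, return the old index."""
        old_index = self.index
        self.stores.setdefault(new_index, {})  # never-used indices start empty
        self.index = new_index
        return old_index

    def fresh_index(self):
        """Assumed policy for ?: the smallest index with no variable set yet."""
        candidate = 0
        while candidate in self.stores:
            candidate += 1
        return candidate


state = LibraryState()
state.variables["'"] = 5         # ' is 5 under library index 0
old = state.set_index(1)         # the ~# part of '=~#
state.variables["'"] = old       # the assignment lands in library index 1
state.set_index(0)               # back in the main program
print(state.variables["'"], state.stores[1]["'"])
```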
Computational class
~-~! is Turing complete. Any Turing machine can be transformed into an equivalent ~-~! program. A Turing machine has several things: an alphabet - set of symbols to use, a tape - infinite in one or both directions, a set of states, and a transition function. Let's set this up:
' = ~~ !|number of symbols - adjust as necessary. It can always be 2 but can be higher if you want.|:
'' = %:
''' = % !|tape - '' has the head and everything to the left of it, while ''' has everything to the right of the head.|:
As numbers are unbounded, the tape is infinite in both directions. Now, the transition function defines what each state does. In each state the machine can do several things: it can
- read a symbol
- write a symbol
- move the tape left or right (or stay where it is)
- go to a new state
- halt
Let's write some functions to set this up:
'''' = |'';'| !|read|:
''''' = |'' = <''/','+*>| !|write|:
'''''' = |''' = <''','+'';'>: '' = ''/'| !|move tape right|:
''''''' = |'' = <'','+''';'>: '''='''/'| !|move tape left|:
Now, for each state in the turing machine, we can write a function with the template:
'''''''(state:...) = |
''''&% == %['''''&(write0:...):''''''(move0:('))&%:'''''''(new0:...)&%]
''''&% == ~['''''&(write1:...):''''''(move1:('))&%:'''''''(new1:...)&%]
...
%| !|continue for however many symbols there are|:
For the (state:...) template, fill in the number of apostrophes that corresponds to the state number. For each (writen:...) template, fill in the number of tildes that corresponds to the symbol to write. For each (moven:(')) template, put one more apostrophe to move left, leave it as is to move right, or omit that section entirely to stay stationary. Finally, for each (newn:...) template, fill in the number of apostrophes corresponding to the new state number, or omit that section to halt.
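The two-number tape encoding used above can be exercised in Python. This is a sketch of the encoding only, not a translation of the ~-~! templates: `run`, the transition-table format and the example machine `INCREMENT` are all made up for illustration. Here "R" means the head moves right (the tape moves left) and "L" the opposite, and a new state of `None` halts.

```python
def run(transitions, left, right, base=2, state=0, max_steps=10_000):
    """Run a Turing machine on a tape stored as two unbounded integers.

    left  holds the head cell (lowest digit) and everything left of it;
    right holds everything right of the head, nearest cell in the lowest digit.
    transitions maps (state, symbol) to (write, move, new_state).
    """
    for _ in range(max_steps):
        symbol = left % base                         # read
        write, move, new_state = transitions[(state, symbol)]
        left = (left // base) * base + write         # write
        if move == "R":    # head right: pull the nearest right cell under the head
            left, right = left * base + right % base, right // base
        elif move == "L":  # head left: push the head cell onto the right half
            left, right = left // base, right * base + left % base
        if new_state is None:
            return left, right                       # halt
        state = new_state
    raise RuntimeError("step limit reached")


# Hypothetical example machine: walk right over 1s, turn the first 0 into a 1.
INCREMENT = {
    (0, 1): (1, "R", 0),
    (0, 0): (1, None, None),
}

# Head on the first of three 1s: left = 1, right = binary 11.
print(run(INCREMENT, left=1, right=3))
```

On that input the machine halts with four 1s to the left of and under the head, that is, left = 15 and right = 0.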
It's Turing complete!
Implementation
There isn't one yet, but we plan to write one once we've worked out all the details.