Parse this sic
- This is still a work in progress. It may be changed in the future.
Parse this sic is an esoteric programming language invented by User:Digital Hunter. The name has a cute double meaning, and also serves to continue the pattern of languages I've created that end with "-ic".
If there's anything that needs clarification, please bring it up on the talk page. If there are any glaring errors, use your best judgement on how to clear them up.
Parse this sic, or PTS, is a stack-based language with self-modifying code, built on the premise of the "word" datatype. Each word is essentially a string (but calling them strings is cursed and illegal) with an attached numeric value, and from this point forth words will be described with double quotes (as any string would). Anything that evaluates to a word can be passed to a parenthetical expression, and is known as a label. How exactly words' values are determined will be described in detail later. The null word is a special word that looks like an empty string, and is the shortest word with a value of zero.
Whitespace is never ignored, and words can and probably will contain interesting whitespace characters that an implementation must handle neatly. Additionally, whitespace contributes to the character count of a program, which will matter to you in a few minutes if you're reading this for the first time.
Certain expressions will evaluate to a number, which is the natural result of performing arithmetic. The program itself will keep track of the correct value with a word that suffices, and the programmer can assume that any given number will always be mapped to the same representative word. A good implementation would store all numbers as the shortest word with the correct value. A list of the shortest words with specific values will eventually be provided. To either eliminate or espouse your confusion, this page will be written with normal base-10 numerals unless otherwise specified, and with the hope that you don't think too hard about the fact that every base is base-10.
Every character in your source code is given an index, with the first given index 1 and the last given index 0. Indices increase from left to right by one per character, and wrap around the edges. For example, in a nine-character program, the = at the very end has indices 0, 9, -9, 18, -18, and I think you can see the pattern. Yes, it does get annoying to keep track of indices in long programs where the code that needs indices changes the length of all the code and you have a lot of things all depending on each other. No, nobody's doing anything about it.
Character indices wrap because code execution wraps around the edges. A program without a proper terminating case will thus repeat indefinitely.
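The wrapping rule can be sketched in Python as follows. This is only an illustration for left-to-right reading; the helper name `index_to_offset` is made up for this page and is not part of PTS.

```python
def index_to_offset(index: int, length: int) -> int:
    """Map a PTS character index to a 0-based offset into the source.

    Index 1 is the first character and index 0 is the last; adding or
    subtracting the program length lands on the same character.
    """
    if length <= 0:
        raise ValueError("an empty program has no characters to index")
    return (index - 1) % length


# In a nine-character program, all of these name the final character.
for i in (0, 9, -9, 18, -18):
    assert index_to_offset(i, 9) == 8
```

Python's `%` always returns a non-negative result for a positive modulus, which is why negative indices need no special case here.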
Commands and keywords
Commands are commands, and keywords are special words that change the effects of a three-parameter parenthetical. This is a basic overview of their properties.
|Command or keyword||Description|
||Pops a word off the stack and jumps to the character in the program matching the index of the word's value. Technically evaluates to that word but this is nearly unusable.|
||Flips the code reading direction between LTR (default) and RTL. Also flips the indices of every character. Technically evaluates to the null word but this is nearly unusable.|
||Pops a word off the stack and evaluates to it as a label if inside a parenthetical, outputs the word otherwise.|
||If the code direction is LTR and the value of the word on top of the stack is less than 1, skip to past the next |
If the code direction is RTL and the value of the word on top of the stack is greater than 0, skip to past the next
Technically evaluates to the word peeked at but this is nearly unusable.
||Tells the program to stop reading a word. Creates a label to that word to which the whole bit (as in bunch) evaluates.|
||Skips to past the next |
||Kills program execution.|
||Parenthetical expressions come in four flavours: zero-parameter, one-parameter, two-parameter, and three-parameter. They will be described more clearly later.|
||When the first parameter in a three-param evaluates to "ameliorate" exactly, the parenthetical evaluates to a word with the value of the sum of the other two parameters.|
||Same deal as above; evaluates to the third subtracted from the second.|
||Same deal as above; evaluates to the product of the other two.|
||Same deal as above; evaluates to the integer quotient where the second is the dividend and the third is the divisor. If the third has a value of zero, evaluates to the null word.|
||Same deal as above; the second parameter is interpreted as a PTS program operating on the same stack as the main program, but with independent indices. All zero-params immediately evaluate to the third parameter. The whole expression evaluates to the concatenation of everything output from the mini-program through |
||Same deal as above; the first (determined by code direction) instance after the closing parenthesis of this expression of a substring of the code that matches the second parameter exactly will be replaced with the third parameter, with the search wrapping around the edges. If no match is found, evaluate to the null word. If a replacement is made, evaluate to the word that was replaced (the second parameter).|
||Same deal as above; evaluates to the second parameter if the second and third parameter match. Evaluates to the null word otherwise.|
A neat 7-command 7-keyword situation.
More on words
Words are neat because they serve the function of both a string and a number. Note: I put number in italics when referring to numbers, those interesting technical words that result from an arithmetic three-param.
When the time comes to determine the value of a word, its equivalent string is searched through character-by-character from the beginning until a valid digit is encountered. A digit is any 0..9 or A..Z character (case-sensitive, so lowercase letters are not digits), and the first digit found determines the base in which the rest of the word is read as a number: the base is one more than that digit's value, so 0 indicates unary, 1 indicates binary, and Z indicates uhh base-36. From then on, every character that is not a valid digit in that base is ignored when calculating the value of the word.
For example, the word "140f9ai392(324" has a value equal to 2 (with or without the quotes!), because the leading 1 selects binary, the only other binary digit in the word is the 0, and so the word is read as the binary number 10. A nice consequence of PTS's number system is that both "0" and "1" have the same value.
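The value rule described above can be sketched in Python. Treat this as a reading of the rule, not a reference implementation: the name `word_value` is invented here, unary is assumed to be a simple tally (which is what makes "0" worth 1), and a word with no digit at all is assumed to be worth zero like the null word.

```python
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def word_value(word: str) -> int:
    """Return the numeric value of a PTS word (sketch)."""
    # The first digit found fixes the base: digit value + 1.
    for pos, ch in enumerate(word):
        if ch in DIGITS:
            base = DIGITS.index(ch) + 1
            break
    else:
        return 0  # assumption: no digit at all means zero

    valid = DIGITS[:base]
    # Keep only characters that are digits in this base, first digit included.
    kept = [ch for ch in word[pos:] if ch in valid]
    if base == 1:
        # Assumption: unary is a tally, so "0" is 1 and "000" is 3.
        return len(kept)
    value = 0
    for ch in kept:
        value = value * base + DIGITS.index(ch)
    return value


assert word_value("140f9ai392(324") == 2
assert word_value("0") == word_value("1") == 1
assert word_value("") == 0
```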
In code, words are generally first created with a word literal. When the program is run, if a non-command character is encountered it is treated as the first character in a word and word-reading mode is activated. Every character read from then on is added to the word, until & is found. This ends the word reading and creates a label to the word that was just read, which can then be passed to a parenthetical or just discarded (but why would you ever want to do that? So cruel). For example, 140f9ai392(324& will act as a label to the word "140f9ai392(324" that is destined to do many great things. Do ignore the fact that this label is immediately discarded. Other than wasting memory, however, floating labels are helpful for word values that may be used later, or for placeholders with the "succeed" functionality. Additionally, marking comments and whitespace for readability with & will likely be used in larger programs, so an implementation should be able to handle it effectively.
At long last.
First comes the zero-param.
() is the sole example. When encountered in a program ordinarily, the user will be prompted for an input, which is automatically cast as a word. The zero-param will then evaluate to that word if passed to an outside parenthetical. However, in a "walking" mini-program, zero-params will not prompt for input but instead evaluate to the third parameter from the outside program as described earlier.
Next comes the one-param.
(one-param&) is a valid one-param, with the word "one-param" as its one parameter. Anything that evaluates to a word can be passed to a one-param. A one-param's sole purpose is to push its parameter onto the stack, but it also evaluates to that word and can be used to do many things. It can even be passed to itself, like ((darkness&)), and push the same word twice!
Third comes the two-param.
(1f1sh&2f0sh&) is a valid two-param, where "1f1sh" and "2f0sh" are its parameters. A two-param sees the entire program as one long (and technically infinite) word, and evaluates to the word that is the substring marked off with the indices determined by the values of its parameters, inclusive. This was incredibly wordy, so here's an example.
uncopyrightable&(3&6&)(D&A&)(1&&)(0&&)=
This code first creates a label to the word "uncopyrightable" which is immediately discarded. Sorry. Anyway, the first two-param looks from character 3 to character 6 to evaluate to "copy" and then also be immediately discarded. The next two-param reads backwards from character 13 to character 10 and evaluates to "bath". The final two both look from the very first (value 1, since "0" and "1" share that value) to the very last (value 0 (the second & labels the null word)) characters, and both evaluate to "uncopyrightable&(3&6&)(D&A&)(1&&)(0&&)=" which is the entire program. This makes it very easy to create pretty lame quines, because the source code can always be directly accessed by PTS programs, but interesting/traditional quines can still be written with other means.
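One way to reproduce the three results from this example in Python is sketched below. The page does not spell out how a two-param chooses its reading direction, so the rule used here (fold both indices into 1..length, then read forwards if the first is not larger, backwards otherwise) is an assumption that merely happens to fit "copy", "bath" and the whole-program case. The names `normalise` and `two_param` are made up, and the function takes the parameters' values, not the words themselves.

```python
def normalise(index: int, length: int) -> int:
    """Fold any index into the range 1..length (index 0 becomes length)."""
    return (index - 1) % length + 1


def two_param(program: str, first: int, second: int) -> str:
    """Substring of `program` between two indices, inclusive (sketch)."""
    start = normalise(first, len(program))
    end = normalise(second, len(program))
    if start <= end:
        return program[start - 1:end]
    # Assumption: a larger first index means the span is read backwards.
    return program[end - 1:start][::-1]


program = "uncopyrightable&(3&6&)(D&A&)(1&&)(0&&)="
assert two_param(program, 3, 6) == "copy"
assert two_param(program, 13, 10) == "bath"
assert two_param(program, 1, 0) == program
```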
At even longer last, the three-param.
(perowanfe&word1&word2&) is a valid three-param. But wait, "perowanfe" isn't one of the keywords listed with the handy acronym Astute Dinosaur Thinks Stars Want Some Dinner! This is fine, because the three-param combines the functionality of functions with the equivalent of variable assignment and also word concatenation. Three uses for the three-param.
Above were described the effects of three-params if their first parameter evaluates to one of the keywords. In the example (perowanfe&word1&word2&), however, the second and third uses of the three-param are demonstrated concurrently. From this point on in the program, anything that evaluates to "perowanfe" such as perowanfe& will instead evaluate to "word1word2", hence the concatenative ability and pseudo-variable assignment. Some further examples, assuming the example above was already executed somewhere prior:
|Three-param||Effect|
|(perowanfe&conch&shell&)||Anything that evaluates to either "perowanfe" or "word1word2" now evaluates to "conchshell"|
|(word1word2&conch&shell&)||Same effect as above|
|(variable&New_Value&&)||Anything that evaluates to "variable" now points to "New_Value"|
Parenthetical expressions can (and will) be nested, and are evaluated in the order of code direction, deepest first; if the code is LTR then the program will "enter" any parentheticals it sees from the left, and go as deep as possible. This matters for, say, the effects of stack manipulation. Parentheses pair the same way regardless of code direction.
These examples will eventually be followed up with explanations and prefaced with the knowledge required to understand them and hopefully debug them.
The simplest one pushes the word "Hello, world!" to the stack to be outputted.
This one will prompt the user for word (basically string) input indefinitely.
Input is taken, pushed to the stack, and outputted, and program execution wraps around to do it all again.
Prints all ten verses of 9 Bottles of Beer with the typo "1 bottles of beer" because I'm lazy. A full proper program is under construction. Note the newlines and the double newline in the middle. Necessary for proper spacing of the verses and changing this aesthetically will also require label changes.
-(No&)*((3231&544&))*(No&)*((3231&EC&))*(! Go to the store, buy some more, &)*(9&)*((3231&550&))*(.&)*=+(&5)*(& .)*((&36&I))*/*(&oN)/(((&0*&etanimod)))*(& ,dnuora ti ssap ,nwod eno ekaT !)*((&X&I))*((*))*(& ,llaw eht no reeb fo selttob )*((*))/+(&)/(&9)
This particular program could be edited quite easily to almost be a proper 99, but runs into the particular problem that numbers' representations are unspecified. Even if they were specified, by the rules of PTS a decimal-biased song wouldn't be possible anyway.
Info to come
This wiki page is awaiting:
- That list of numbers I promised
- Summary of stack manipulation
- Summary of flow control
- In-depth discussion of the commands
- Better organisation
- Probably some more category links