Sclipting

Sclipting is a stack-based golf language, inspired by GolfScript, that uses Chinese characters for instructions and Hangul syllables for data (strings and integers). The basic idea is that to minimise the number of characters in a program, the language should provide as many single-character instructions as possible. It was invented by Timwi in 2011.

Sclipting is not considered finished, as it can trivially be extended with further instructions assigned to new Chinese characters.

Execution
Sclipting is stack-based. Most instructions execute by popping a certain number of elements from the stack and pushing a result. Some instructions, however, operate on the top item without popping it. Yet other instructions may shuffle the stack around almost arbitrarily.

At the beginning of the program, the input (e.g. STDIN) is placed on the stack as a single string.

At the end of execution, the 併 instruction is executed once, which generates a string from all or part of the stack contents. This string then forms the program’s output.

Byte-array literals
The most common literal in Sclipting is the byte array (also called a buffer). It is encoded in the source as a sequence of Hangul syllables in the following way:


 * Encode three bytes at a time. Three bytes are 24 bits. Take the first 12 bits and add U+AC00 (that’s your first character), then take the remaining 12 bits and add U+AC00 (that’s the second character). If the byte array length is divisible by 3, you are done.
 * If there is one byte left, encode it in a single character by shifting it left by 4 bits and adding U+AC00.
 * If there are two bytes left, generate the first character from the first 12 bits the same way as above, while the second character is U+BC00 plus the remaining 4 bits of the second byte.
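
The three rules above can be sketched in Python as follows. This is a minimal illustration rather than the interpreter’s own code; the names `encode_bytes` and `decode_bytes` are hypothetical, and the decoder assumes it is handed exactly one literal.

```python
BASE = 0xAC00       # first Hangul syllable; each such character carries 12 bits
TAIL_BASE = 0xBC00  # base for the 4-bit tail of a two-byte remainder


def encode_bytes(data: bytes) -> str:
    """Encode a byte array as Hangul syllables using the three rules above."""
    out = []
    full = len(data) - len(data) % 3
    for i in range(0, full, 3):
        bits = int.from_bytes(data[i:i + 3], "big")  # 24 bits
        out.append(chr(BASE + (bits >> 12)))         # first 12 bits
        out.append(chr(BASE + (bits & 0xFFF)))       # remaining 12 bits
    rest = data[full:]
    if len(rest) == 1:
        out.append(chr(BASE + (rest[0] << 4)))       # one byte, shifted left by 4
    elif len(rest) == 2:
        bits = int.from_bytes(rest, "big")           # 16 bits
        out.append(chr(BASE + (bits >> 4)))          # first 12 bits
        out.append(chr(TAIL_BASE + (bits & 0xF)))    # remaining 4 bits
    return "".join(out)


def decode_bytes(text: str) -> bytes:
    """Inverse of encode_bytes for a single byte-array literal."""
    codes = [ord(ch) for ch in text]
    out = bytearray()
    i = 0
    while i < len(codes):
        hi = codes[i] - BASE
        if not 0 <= hi < 0x1000:
            raise ValueError("expected a character in U+AC00..U+BBFF")
        if i + 1 == len(codes):
            out.append(hi >> 4)                      # lone character: one byte
            i += 1
        elif TAIL_BASE <= codes[i + 1] <= TAIL_BASE + 0xF:
            tail = codes[i + 1] - TAIL_BASE
            out += ((hi << 4) | tail).to_bytes(2, "big")
            i += 2
        else:
            lo = codes[i + 1] - BASE
            if not 0 <= lo < 0x1000:
                raise ValueError("unexpected character in literal")
            out += ((hi << 12) | lo).to_bytes(3, "big")
            i += 2
    return bytes(out)


print(encode_bytes(b"Hello, World!"))
```

Encoding the 13 bytes of “Hello, World!” this way gives four full groups plus a single trailing byte, so nine syllables in total.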

Examples:

Number literals
Positive numbers are expressed as a (big-endian) byte array. This allows arbitrary-size integers.

7076 Hangul syllables can be used to express a negative integer (U+BC00 = −1 to U+D7A3 = −7076). The range overlaps with that used by byte array literals, but there is no ambiguity because byte array literals always start with a character in the range U+AC00–U+BBFF. Thus, if the program contains a character in the range U+BC00–U+BC0F with no character from U+AC00–U+BBFF preceding it, it encodes a negative number.

If you need a negative number less than −7076, write it as a positive number and then negate it.
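
As a rough illustration of the two number forms, the sketch below maps a small negative integer to its single syllable and a positive integer to the big-endian byte array that would then be written as a byte-array literal. The function names are hypothetical, and how zero is written is not covered here.

```python
NEG_FIRST = 0xBC00  # encodes -1
NEG_LAST = 0xD7A3   # encodes -7076


def negative_literal(n: int) -> str:
    """Single Hangul syllable for an integer in the range -7076..-1."""
    code = NEG_FIRST + (-n - 1)
    if n >= 0 or code > NEG_LAST:
        raise ValueError("only -7076..-1 have a one-character literal")
    return chr(code)


def positive_bytes(n: int) -> bytes:
    """Big-endian byte array for a positive integer of any size."""
    if n <= 0:
        raise ValueError("expected a positive integer")
    return n.to_bytes((n.bit_length() + 7) // 8, "big")


print(hex(ord(negative_literal(-7076))))
print(positive_bytes(256))
```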

Data types
Sclipting knows the following data types:


 * Byte array: A sequence of bytes of a certain length.
 * String: A sequence of Unicode characters of a certain length.
 * Integer: An arbitrary-size signed integer.
 * Float: A double-precision (IEEE) floating-point number. Possible values include ±∞ and NaN.
 * List: A list of items of a certain length. Each item can be of any of these datatypes, including another list.
 * Mark: A special item (generated by 標) that marks a position in the stack (and occupies a stack slot of its own). Currently only 并 and 併 make use of this.
 * Function: An anonymous function (lambda) that can be pushed to and popped from the stack and executed.

Conversions
The following describes implicit conversions that are performed by instructions that expect a certain input data type. However, some instructions may override some of the behaviours described here.

Conversion to byte array
Items never convert implicitly to a byte array. Instructions that generate byte arrays specify in their documentation how they generate it.

Conversion to string

 * Byte arrays are decoded as UTF-8.
 * Integers/floats are rendered into strings by using the invariant culture.
 * Lists are converted to strings by converting each element to a string and concatenating them all. If the list contains further lists, this applies recursively.
 * Marks and functions convert to the empty string.
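
The string conversion rules can be modelled roughly as below. The mapping of Sclipting types onto Python types (`bytes`, `str`, `int`, `float`, `list`, plus a hypothetical `Mark` class) is an assumption made for illustration. Python’s `str()` only stands in for invariant-culture number formatting, so float output in particular may differ from the real interpreter, and the handling of invalid UTF-8 is a guess.

```python
class Mark:
    """Hypothetical stand-in for the mark item."""


def to_string(item) -> str:
    if isinstance(item, (bytes, bytearray)):
        # Byte arrays are decoded as UTF-8 (replacement on errors is an assumption).
        return bytes(item).decode("utf-8", errors="replace")
    if isinstance(item, str):
        return item
    if isinstance(item, (int, float)):
        return str(item)  # stand-in for invariant-culture formatting
    if isinstance(item, list):
        # Lists concatenate the string form of every element, recursively.
        return "".join(to_string(x) for x in item)
    return ""  # marks and functions convert to the empty string


print(to_string([b"ab", ["c", 1], Mark()]))
```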

Conversion to integer

 * Byte arrays are assumed to be in big-endian order and are converted to a non-negative integer as if they were base-256 numbers.
 * Floats are rounded toward 0. The floats ±∞ and NaN are all converted to 0.
 * Strings are parsed as integers (in decimal) using the invariant culture. If the string is not a valid integer, it is converted to 0.
 * Lists are converted to integers by converting each element to an integer and then computing the sum. If the list contains further lists, this applies recursively. If any item in the structure is a float, the sum operation is floating-point and the resulting float is then converted to an integer.
 * Marks and functions are converted to 0.
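
A minimal sketch of the integer conversion, under the same assumed mapping onto Python types. The helper names are hypothetical, and the exact set of strings the real parser accepts (whitespace, a leading “+”) is not specified here; this sketch accepts only an optional sign followed by ASCII digits.

```python
import math
import re

_INTEGER = re.compile(r"[+-]?[0-9]+")  # assumed syntax of a valid integer string


def to_number(item):
    """Return an int, or a float if a float takes part in the conversion."""
    if isinstance(item, (bytes, bytearray)):
        return int.from_bytes(item, "big")  # big-endian, base 256
    if isinstance(item, (int, float)):
        return item
    if isinstance(item, str):
        return int(item) if _INTEGER.fullmatch(item) else 0
    if isinstance(item, list):
        # Recursive sum; it becomes floating-point as soon as any float appears.
        return sum(to_number(x) for x in item)
    return 0  # marks and functions


def to_integer(item) -> int:
    n = to_number(item)
    if isinstance(n, float):
        # Round toward zero; infinities and NaN become 0.
        return math.trunc(n) if math.isfinite(n) else 0
    return n


print(to_integer("47"), to_integer(["4", "7"]))
```

The last line shows the difference between a string and a list of its characters: the string parses as 47, while the list sums to 11.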

Conversion to float
All items that aren’t already floats convert to a float in the same way that they convert to integers, except of course that the result of a list sum is not rounded.

Conversion to list
Items never convert implicitly to a list. Instructions that operate on lists or iterate over them specify in their documentation what they do when the input is not a list.

Many list operations convert any non-list to a string and then operate on that as if it were a list of characters; however, they generally return the result as a string, not a list of characters. This difference is significant: when converting to an integer, the string “47” converts to 47, but the list { “4”, “7” } converts to 4 + 7 = 11.

Booleans
Some instructions treat an item as a boolean. These generally convert the item to an integer and then compare it against 0. 0 is considered false, everything else true. Note this has the following consequences:


 * All floats greater than −1 and less than 1 are considered false, as are NaN and ±∞.
 * Almost all strings are false. The only strings that are true are those that encode a valid non-zero integer.
 * A list containing items that add up to 0 is also considered false, even if those items are not themselves all false (for example, the list { −1, 1 }).
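
The consequences listed above can be checked with a compact model. It is restricted to integers, floats, strings and lists, uses the hypothetical name `is_true`, and assumes that a valid integer string is an optional sign followed by ASCII digits.

```python
import math
import re


def is_true(item) -> bool:
    """Boolean value of an item: convert to an integer, then compare with 0."""
    def number(x):
        if isinstance(x, list):
            return sum(number(e) for e in x)
        if isinstance(x, str):
            return int(x) if re.fullmatch(r"[+-]?[0-9]+", x) else 0
        return x  # int or float

    n = number(item)
    if isinstance(n, float):
        n = math.trunc(n) if math.isfinite(n) else 0
    return n != 0


for sample in (0.5, float("inf"), "hello", "-3", [-1, 1]):
    print(repr(sample), is_true(sample))
```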

Instructions
Syntactically, there are five types of instructions:


 * Singular instruction: A singular instruction is a lexical unit. The instruction syntactically stands by itself and does not form a structure with other instructions. Every instruction that is not explicitly marked otherwise is a singular instruction.
 * Block head instruction: These instructions start a block, which is a nested sequence of instructions that may be executed conditionally or repeatedly. The block must ultimately be terminated by a block-end instruction (終), although there may be multiple blocks separated by a condition-block instruction (況) or an else instruction (不 or 逆).
 * Condition-block instruction: The 況 instruction can be used to specify code to execute at the beginning of every iteration of a while loop (generally used to compute a terminating condition). This instruction terminates the condition block and starts the primary block. For example, in the code 套 c 況 p 不 e 終, the c block is executed at the beginning of every iteration; p is the primary block, only executed if c returned true; and e is the else block, only executed if c returned false already in the first iteration.
 * Else-block instruction: Most block instructions may have an else block which executes only if the first block was not executed. A block instruction is allowed to have such an else block unless explicitly stated otherwise. The else instruction terminates the primary block and starts the else block, which must be terminated by the block-end instruction (終).
 * Block-end instruction: The 終 instruction, which lexically terminates a block.
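
The control flow of the 套 c 況 p 不 e 終 example can be pictured with the following sketch. The three blocks are modelled as plain Python callables, which is an assumption: in Sclipting the condition block leaves its result on the stack, and that detail is abstracted away here. The name `run_while` is hypothetical.

```python
from typing import Callable, Optional


def run_while(cond: Callable[[], bool],
              primary: Callable[[], None],
              else_block: Optional[Callable[[], None]] = None) -> int:
    """Model of a while loop with condition, primary and else blocks."""
    iterations = 0
    while cond():        # c runs at the beginning of every iteration
        primary()        # p runs only if c returned true
        iterations += 1
    if iterations == 0 and else_block is not None:
        else_block()     # e runs only if c was false in the first iteration
    return iterations


# Count down from 3; the else block is skipped because the primary block ran.
remaining = [3]
log = []


def body() -> None:
    log.append(remaining[0])
    remaining[0] -= 1


run_while(lambda: remaining[0] > 0, body, lambda: log.append("else"))
print(log)
```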

It is a compile-time error to have a block head instruction or an else instruction without a matching end instruction.

Loops and conditionals
The instructions listed in this table are all block instructions which cause their associated block to be executed conditionally or repeatedly. Block instructions in this list may optionally have either type of else block unless explicitly stated otherwise.

Where these descriptions describe an item as true or false, they refer to the boolean value of an item as described in the section “Conversions”. An item is empty if it is either the empty list, or it is not a list and converts to the empty string. This means the mark (標) is empty, but integers and floats are not.

Arithmetic
These instructions generally take integers or floats as arguments. When they encounter any other datatype, they will convert items to floats or integers as described in the section “Conversions”. In the following descriptions, “number” (and N in the stack transition) means “integer or float”. If the description and stack transition mention only “integer”, floats are accepted but are rounded toward zero.

Divisions by zero (including modulo zero) return NaN.

Logic
These instructions generally deal with boolean logic. Descriptions that refer to true or false refer to the boolean value of an item as described in “Conversions”.

Lambda function instructions
Instructions marked with a “█” are block head instructions; all others are singular instructions.

Lists and strings
These instructions all operate on a list or a string. Anything that is not a list or string is converted to a string as described in the section “Conversions”.

Instructions to operate on a specific index
The following instructions all operate on a specific item in a list or character in a string, identified by an index counting either forward from the start or backward from the end of the list/string. The semantics of each instruction are the same as those listed under “see also” at the end of the table except that those take an index from the stack.

Other list/string manipulation instructions
Instructions in this table marked with a “█” are block head instructions; all others are singular instructions.

String manipulation only (no lists)
Sclipting has built-in support for regular expressions. All regular expression instructions in Sclipting default to the behaviour normally referred to as “single-line mode” (Perl modifier s), which means that the . operator matches any character, including newlines. The behaviour can be altered within a regular expression by using the (?-s:...) construct to disable “single-line mode” (so . matches anything but newlines). The same construct can be used to enable “multi-line mode” (m; alters the meanings of ^ and $ to match the beginning/end of a line within the string instead of the beginning/end of the whole string), “ignore case” (i) and “ignore whitespace” (x).
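
Python’s `re` module accepts the same scoped-flag syntax, so the effect of these constructs can be tried out there. This only illustrates the flag semantics: passing `re.DOTALL` emulates Sclipting’s default, and the regular expression flavour used by the Sclipting interpreter may differ from Python’s in other respects.

```python
import re

text = "ab\ncd"

# With single-line mode on (Sclipting's default), "." also matches the newline.
dot_default = re.fullmatch(r"ab.cd", text, re.DOTALL) is not None

# (?-s:...) switches single-line mode off for the enclosed part only.
dot_scoped = re.fullmatch(r"ab(?-s:.)cd", text, re.DOTALL) is not None

# Without multi-line mode, ^ and $ anchor to the whole string.
whole = re.findall(r"^\w+$", text, re.DOTALL)

# (?m:...) makes ^ and $ match at line boundaries inside the string.
lines = re.findall(r"(?m:^\w+$)", text, re.DOTALL)

# (?i:...) enables case-insensitive matching for the enclosed part.
ignore_case = re.fullmatch(r"(?i:AB).cd", text, re.DOTALL) is not None

print(dot_default, dot_scoped, whole, lines, ignore_case)
```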

Instructions in this list marked with a “█” are block head instructions. Instructions marked with a “ʘ” can only be used within the main block of one of those block instructions. If several blocks are nested, the “ʘ” instructions apply only to the innermost block.

Hello, World!
 丟낆녬닆묬긅덯댦롤긐

99 bottles of beer on the wall
 丟눰標下標❷❶냦및嗎긆깯덇끬뉐❷貶是댰終긆뭦긆깥뉗긠닶먠덆둥긇덡닆렬겠 併標❸❶냦및嗎긆깯덇끬뉐❷貶是댰終긆뭦긆깥뉗긮겠併❸녆굫뉒걯닦넠뉆뭷닢렠댆굳댲걩덂걡댦뭵닦뀬겠 ⓵끶묠덆묠덆둥긇꽴닷깥껂걢덗딠댶뭭뉒걭닷깥껀밊嗎⓸倘貶不꾓밉終倘긆깯덇끬뉐 ❷貶是댰終不냦묠눦뭴덆롥댰終긆뭦긆깥뉗긠닶먠덆둥긇덡닆렮겠밊終

Interpreter

 * Sclipting is implemented in Esoteric IDE.