WhatLang

WhatLang is a stack-based programming language created by User:YufangTSTSU (Yufang). Written in TypeScript, its first interpreter can be installed as a private (not published on npm) plugin for Koishi, a bot framework for QQ and other instant messaging platforms, and used by invoking the bot command whatlang or simply sending the code prefixed with a '¿'.

I (User:DGCK81LNN) have been working on a new interpreter for the language (still written in TypeScript) that fixes bugs, internal errors and security problems in the original one.

Mechanics

WhatLang has four types of values: String, Number (64-bit float, including NaN), Array, and Undefined. If two occurrences of Strings have the same content, they are the same String. Only Arrays are mutable, and two Arrays may have the same contents while being different objects.

The virtual machine has a stack of stacks, known as the Frame Stack. The topmost stack on the Frame Stack, also called the Stack, is the one that your code usually interacts with. Values are pushed to and read or popped from the top of the Stack. There is also an internal dict which stores named variables.

In the Koishi runtime, there also exists an Output Stack, which can hold fragments of text, images, audio messages, videos, attachments, and quote references. Each time something is printed, it is pushed onto the Output Stack. send@, sends@ or sendsto@ can then be used to pop items from the Output Stack and send them immediately. When the program ends, all remaining items on the Output Stack are sent in the current chat, in the order they were pushed.

For the purpose of this documentation, to return something means to push it onto the Stack. Popping from the Stack when it is empty yields Undefined, without altering the Stack. When popping multiple values, they are listed from bottom to top. If something "must" satisfy a certain constraint, the interpreter's behavior is undefined when it does not.
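
The stack model described above can be sketched in TypeScript as follows. This is only an illustration, not code from either interpreter: the names `Value`, `Machine` and `popMany` are hypothetical, and the way a multi-value pop behaves on a nearly empty Stack is an assumption derived from the single-value rule.

```typescript
// The four WhatLang value types, modeled on JavaScript values.
type Value = string | number | Value[] | undefined;

class Machine {
  // The Frame Stack; it always holds at least one Stack.
  private frames: Value[][] = [[]];

  // The internal dict of named variables.
  readonly variables = new Map<string, Value>();

  // The topmost stack on the Frame Stack is "the Stack".
  get stack(): Value[] {
    return this.frames[this.frames.length - 1];
  }

  // "Returning" a value means pushing it onto the Stack.
  push(value: Value): void {
    this.stack.push(value);
  }

  // Popping an empty Stack yields Undefined and leaves the Stack unchanged.
  pop(): Value {
    return this.stack.pop();
  }

  // Pop several values; the result is ordered from bottom to top.
  // (Assumption: missing values come out as Undefined, one pop at a time.)
  popMany(count: number): Value[] {
    const values: Value[] = [];
    for (let i = 0; i < count; i++) values.unshift(this.pop());
    return values;
  }
}
```

With the Stack holding 1, "a", 3 (bottom to top), `popMany(2)` gives `["a", 3]`, matching the convention that popped values are listed from bottom to top.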

Instructions

Instruction Name Description
0 Zero[1] Return 0.
A 1-9 digit followed by zero or more 0-9 digits[1] Integer literal Return the literal Number.
An ASCII letter, followed by zero or more ASCII alphanumerics and/or underscores Identifier string literal Return the identifier as a String, converted to lower case.
' followed by one UTF-16 unit Char literal Return the char as a 1-length String.
" delimited text Quoted string literal Return the text as a String. Line feeds and character tabulations can be escaped as \n and \t. A backslash otherwise forces the next character to be treated literally.
` delimited text Literal print Similar to a " delimited string, but prints the string without doing anything with the Stack.
+ - * / % Arithmetic operation and string concatenation Pop a and b from the Stack and return a <operator> b. The operators function exactly the same way as in JavaScript.
? Compare Pop 2 values from the Stack. If they are loosely equal (== in JavaScript; two occurrences of Arrays need to be the same Array to be considered equal), return 0; if bottom is greater than top, return 1; if bottom is less than top, return -1; otherwise, return NaN.
~ Logical not Pop 1 value from the Stack. If it is the empty String, 0 or Undefined[2], return 1; otherwise return 0.
[ New Stack Push a new empty Stack onto the Frame Stack.
| Open Stack Pop 1 value from the Stack; throw an error if it is not an Array. Otherwise push it onto the Frame Stack, making it the new Stack.
] Close Stack Pop the Stack from the Frame Stack. If the Frame Stack becomes empty, push a new empty Stack onto it. Then push the popped Stack onto the current Stack as an Array.
() delimited text (may include nested parens) Parenthesized string literal Return the literal contents of the parens as a String.
. Print Print the element at the top of the Stack without popping it, or Undefined (as "undef") if the Stack is empty.
\ Swap Swap the topmost two elements in the Stack. Does nothing if the Stack contains fewer than 2 values.
: Duplicate Push the Stack's topmost element onto the Stack again (without copying it if it is an Array; a reference to the same Array object is pushed). Does nothing if the Stack is empty.
& Bury[3] Pop 1 value from the Stack, then insert it at the bottom of the Stack. Does nothing if the Stack is empty.
_ Pop Pop the topmost element from the Stack.
= Set named variable Pop 1 value from the Stack, which must be a String. Set the value of the variable named this string to the topmost remaining element of the Stack, or Undefined if there is none left.
^ Get named variable Pop 1 value from the Stack, which must be a String. If a variable named this string exists, return its value. Otherwise, if the string is the name of a builtin function, return the string plus "@". Otherwise, return Undefined.
@ Call or eval Pop 1 value from the Stack, which must be a String. If the string is the name of a builtin function, call the function. Otherwise, if the string is the name of an existing variable[4], the variable's value must be a String; execute the value as WhatLang code. Otherwise, execute the string as WhatLang code.
> Gather Pop n from the Stack, coerced into an integer. If n is positive, remove the topmost n elements from the Stack. Otherwise, remove all but the bottommost |n| elements from the Stack. Then, return a new Array containing the removed elements.
< Spread Pop 1 value from the Stack, coerced into an Array. Push each of its elements onto the Stack.
{ While loop start Pop 1 value from the Stack. If it is the empty String, 0 or Undefined[2], jump to the corresponding }.
} While loop end Pop 1 value from the Stack. If it is not the empty String, 0 or Undefined[2], jump to the corresponding {.
One or more consecutive !s Break, return or halt Break out of [number of !s] levels of {}. If not inside of {} or there are more !s than current levels of nested {}s, return from the currently executed code if run by the @ (or #) instruction, or halt the program otherwise.
# Map Array Pop func from the Stack. The remaining topmost value in the Stack must be an Array; call it items. Let results be a new empty Array. For each element item in items: create a shallow copy of the Stack and call it tempstack; push item and func onto tempstack; with a new stack, containing only tempstack, as the Frame Stack, run the @ instruction; then, push the topmost element in tempstack, or Undefined if it is now empty, into results. Finally, return results.
, Get Array item Pop n from the Stack, coerced into an integer. If the remaining topmost value in the Stack is a String, it is treated like an Array with each UTF-16 unit in it as a 1-length String element; otherwise it must be an Array. If n >= 0 and n < length of the Array, return the item at index n of the Array; otherwise, if n < 0 and n > -length, return the item at index length + n of the Array; otherwise, return Undefined.
; Set Array item Pop n and value from the stack. Coerce n into an integer, but leave it as NaN if it was NaN or cannot be interpreted as a proper number. The remaining topmost value in the Stack must be an Array; let length be its length. If n is equal to length, or is equal to -1[5], or is NaN, push value to the end of the Array. Otherwise, if n >= 0 and n < length, set the item at index n of the Array to value; otherwise, if n < 0 and n > -length, set the item at index length + n of the Array to value; otherwise do nothing.
$ Delete Array item Pop n from the Stack, coerced into an integer. The remaining topmost value in the Stack must be an Array. If n >= 0 and n < length of the Array, remove the item at index n of the Array; otherwise, if n < 0 and n > -length, remove the item at index length + n of the Array; otherwise do nothing. All items following a removed item are moved forward by one cell.
  1. The integer literal used to allow leading zeroes. The Zero instruction was added by DGCK81LNN and is currently absent in the GitHub source code. Each leading zero now pushes a 0 onto the stack, which allows you to write 01- to get -1; this used to require a space after the zero.
  2. NaN is considered truthy, unlike in JavaScript.
  3. This instruction is currently missing in the GitHub source code.
  4. In the new interpreter implementation, variable names will have to begin with an ASCII lowercase letter and contain only ASCII lowercase letters, digits and/or underscores, or they will be interpreted as literal code.
  5. Treating index -1 like NaN is a quirk of Yufang's original interpreter. In the new interpreter, using index -1 properly sets the last item of the array.
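
A few of the rules in the table above are easier to check when written out. The following TypeScript sketch restates the truthiness test used by ~, { and }, the index rule shared by , and $, and the Gather (>) instruction. The function names are hypothetical, and the code is a reading of the descriptions above rather than an excerpt from either interpreter. What happens when Gather is asked for more elements than the Stack holds is not specified here, so the clamping below is an assumption.

```typescript
type Value = string | number | Value[] | undefined;

// Truthiness for ~, { and }: only "", 0 and Undefined are falsy.
// NaN and empty Arrays are truthy.
function isTruthy(v: Value): boolean {
  return !(v === "" || v === 0 || v === undefined);
}

// Index rule for , and $: returns the effective index, or undefined
// when n is out of range. Note that n = -length is out of range.
function resolveIndex(n: number, length: number): number | undefined {
  if (n >= 0 && n < length) return n;
  if (n < 0 && n > -length) return length + n;
  return undefined;
}

// Gather (>): removes elements from `stack` in place and returns them.
// Positive n takes the topmost n elements; otherwise everything except
// the bottommost |n| elements is taken. The caller pushes the result.
function gather(stack: Value[], n: number): Value[] {
  const start = n > 0
    ? Math.max(stack.length - n, 0) // assumption: clamp when n > length
    : Math.min(-n, stack.length);
  return stack.splice(start);
}
```

For example, with the Stack holding 1 2 3 4, gathering with n = 2 leaves 1 2 and yields [3, 4], while n = -1 leaves 1 and yields [2, 3, 4].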

Conversion and coercing

To format a value into a String:

Value Result
String Surround the string content with double quotes and escape backslashes, double quotes, line feeds (\n) and character tabulations (\t) in it.
Undefined "undef"
NaN "NaN"
Inf "Inf"
-Inf "-Inf"
finite Number as done natively by JavaScript
Array Format each element to a String; join the results with ", " and surround that with "[" and "]". When a circular Array reference is detected, replace it with "[...]".

To convert (coerce) a value into a String, if it isn't already one, format it into a String.
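
The formatting table can be read as the following TypeScript sketch. The names `format` and `toStr` are hypothetical. The table does not say how circular references are detected, so tracking the Arrays currently being formatted is an assumption made for illustration.

```typescript
type Value = string | number | Value[] | undefined;

// Format a value as described in the table above.
// `ancestors` holds the Arrays currently being formatted, so that an
// Array containing itself is rendered as "[...]" instead of recursing.
function format(v: Value, ancestors: Value[][] = []): string {
  if (v === undefined) return "undef";
  if (typeof v === "string") {
    const escaped = v
      .replace(/\\/g, "\\\\") // backslashes first, so later escapes survive
      .replace(/"/g, '\\"')
      .replace(/\n/g, "\\n")
      .replace(/\t/g, "\\t");
    return '"' + escaped + '"';
  }
  if (typeof v === "number") {
    if (Number.isNaN(v)) return "NaN";
    if (v === Infinity) return "Inf";
    if (v === -Infinity) return "-Inf";
    return String(v); // finite Numbers: native JavaScript formatting
  }
  if (ancestors.includes(v)) return "[...]";
  const parts = v.map((item) => format(item, [...ancestors, v]));
  return "[" + parts.join(", ") + "]";
}

// Coercion to String: Strings pass through, everything else is formatted.
function toStr(v: Value): string {
  return typeof v === "string" ? v : format(v);
}
```

Note the difference between the two functions: formatting the String abc gives "abc" with the quotes included, while coercing it leaves it as abc.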

To convert (coerce) a value into a Number, if it isn't already one:

Value Result
String as done natively by JavaScript; possibly resulting in NaN if the String cannot be interpreted as a number
Undefined NaN
Array 0, if the Array is empty; the result of converting its element at index 0 into a Number, if its length equals 1; NaN otherwise

To coerce a value into an integer, convert it into a Number; then, if it is NaN, result in 0; otherwise, truncate the Number's fractional part if it is finite. Leave Inf or -Inf as is.
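
As an illustration, the Number and integer coercions might look like this in TypeScript. The names `toNumber` and `toInteger` are hypothetical, and reading "as done natively by JavaScript" as the `Number()` conversion is an assumption.

```typescript
type Value = string | number | Value[] | undefined;

// Coerce a value into a Number, following the table above.
function toNumber(v: Value): number {
  if (typeof v === "number") return v;
  if (typeof v === "string") return Number(v); // assumption: native Number()
  if (v === undefined) return NaN;
  if (v.length === 0) return 0;
  if (v.length === 1) return toNumber(v[0]);
  return NaN;
}

// Coerce a value into an integer: NaN becomes 0, finite Numbers lose
// their fractional part, and Inf / -Inf are left as they are.
function toInteger(v: Value): number {
  const n = toNumber(v);
  if (Number.isNaN(n)) return 0;
  return Math.trunc(n); // Math.trunc returns +/-Infinity unchanged
}
```

Under these rules, the String -2.7 coerces to the integer -2 (truncation, not flooring), and a String that is not a number coerces to 0.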

To convert (coerce) a value into an Array, it must be either a String or an Array. If it is a String, the result is a new Array containing each character (or unpaired UTF-16 surrogate) in it as a separate String; if it is an Array, the result is a shallow copy of it.
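
A minimal TypeScript sketch of the Array coercion, assuming that "each character" means each Unicode code point. The name `toArray` is hypothetical.

```typescript
type Value = string | number | Value[] | undefined;

// Coerce a String or an Array into an Array.
// Array.from iterates a string by code point, so a surrogate pair stays
// together as one element and an unpaired surrogate becomes its own element.
function toArray(v: string | Value[]): Value[] {
  return typeof v === "string" ? Array.from(v) : v.slice();
}
```

Because the Array case returns a shallow copy, the result is a different object from the input, but any nested Arrays inside it are still shared with the original.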

Builtin functions

WhatLang has core builtin functions and Koishi runtime specific builtin functions.

Core

Function Description
num@ Pop 1 value, convert it to a Number and return the result.
str@ Pop 1 value, convert it to a String and return the result.
repr@ Pop 1 value, and return a string that tries to recreate the value when executed as WhatLang code.
arr@ Pop 1 value, convert it to an Array and return the result.
pow@ Pop a and b, coerced into Numbers, and return the result of a ** b.
band@ bor@ bxor@ Pop 2 values, coerced into signed 32-bit integers, and return the result of performing bitwise AND, OR, or XOR between them, respectively.
bnot@ Pop 1 value, coerce it into a signed 32-bit integer, and return the result of performing bitwise NOT on it.
rand@ Return a random number between 0 and 1.
randint@ Pop 2 values, which must be Numbers. Return a random number between them, rounded down to an integer.
flr@ Pop 1 value, coerced into a Number. Return the result of rounding it down to an integer.
range@ Pop n, coerced into an integer. Throw an error if n is negative or greater than 4294967295. Otherwise, return a new Array containing every integer from 0 to n.
len@ Throw an error if the topmost value in the Stack is Undefined. Otherwise, return its length, if it is either a String or an Array, or Undefined otherwise.
split@ Pop string and separator, both coerced into Strings. Return the result of splitting string into an Array of Strings at each occurrence of separator, or an Array of every UTF-16 unit in string if separator is empty.
join@ Pop 1 value, coerced into a String. Return the result of joining the remaining topmost element in the Stack, which is coerced into an Array, with the String; each element in the Array is coerced into a String.
reverse@ Convert the topmost element in the Stack into an Array. Reverse the Array and return it.
in@ Pop value. The remaining topmost element in the Stack must be either an Array or a String; convert it into an Array if it isn't already one. If value is in the Array, return the index of its first occurrence. Return -1 otherwise.
filter@ Pop func. Convert the remaining topmost value in the Stack into an Array; call it items. Let results be a new empty Array. For each element item in items: create a shallow copy of the Stack and call it tempstack; push item and func onto tempstack; with a new stack, containing all but the topmost stack in the current Frame Stack followed by tempstack, as the new Frame Stack, run the @ instruction; then, push item into results, unless tempstack is now empty, or its topmost element is the empty String, 0, NaN[1] or undefined, in which case do nothing. Finally, return results.
chr@ Pop 1 value. If it is an Array, each of its elements is coerced into an integer; throw an error if any element is negative or greater than 1114111; otherwise return a string composed of these integers as codepoints. If the value is not an Array, treat it like an Array with the value as the only element and follow the same instructions.
ord@ Pop 1 value, coerced into a String. Return each codepoint (or unpaired UTF-16 surrogate) in the String as a Number.
and@ Pop a and b. If a is the empty String, 0, NaN[1] or undefined, return a; otherwise return b.
or@ Pop a and b. If a is the empty String, 0, NaN[1] or undefined, return b; otherwise return a.
nan@ Return NaN.
undef@ Return Undefined.
inf@ Return Inf.
ninf@ Return -Inf.
eq@ Pop 2 values. If they are strictly equal (=== in JavaScript), return 1; otherwise return 0.
stak@ Return the Stack as an Array. Note that this makes the Stack contain itself.
stack@ Return a shallow copy of the Stack as an Array.
try@ Like the @ instruction, but return a new Array containing the error name and message if a runtime error occurs while executing. If no error occurs, return an Array containing 2 elements which are both Undefined.
throw@ Pop 1 value, which must be a String. Throw an error with the String as the message and "Error" as the name.
match@ Pop string and pattern: throw an error if string is not a String; pattern must be either a String or an Array. If pattern is an Array, it must have at least one element; element 0 must be a String, and element 1, if present, must be either a String or Undefined. If pattern is a String, it is treated like an Array with the String as its only element. If the Array's elements are not valid arguments to JavaScript's RegExp constructor, throw an error. Return the result of executing a new RegExp, made from pattern, on string (an Array containing the whole matched string and the substrings matched by each capture group, or an Array containing each match if the pattern contains no capture groups and has a g flag), or an empty Array if nothing can be matched.
repl@ Pop string, pattern and replacement: throw an error if string is not a String; pattern must be either a String or an Array; replacement must be a String. Make a RegExp from pattern in the same way as in match@, then return the result of replacing patterns matching the RegExp on string with replacement. Backreferences are replaced in the replacement string.
time@ Return the current system time in epoch milliseconds.
type@ Pop 1 value; return its type (one of "String", "Number", "Array"), but throw an error if it is Undefined[2].
  1. NaN being treated as falsey in these specific cases is a quirk of Yufang's original interpreter.
  2. Throwing an error when doing type@ on Undefined is a quirk of Yufang's original interpreter; the new interpreter returns "Undefined".
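
To make a few of the entries above concrete, here is a TypeScript sketch of and@, or@ and chr@ as they are described for the original interpreter. The function names are hypothetical. For brevity, `chr` takes Numbers directly instead of coercing arbitrary values, which is a simplification.

```typescript
type Value = string | number | Value[] | undefined;

// Falsiness as used by and@ and or@ in the original interpreter.
// Unlike the ~ instruction, NaN counts as falsy here.
function isFalsy(v: Value): boolean {
  return v === "" || v === 0 || v === undefined ||
    (typeof v === "number" && Number.isNaN(v));
}

// a is the lower of the two popped values, b the upper one.
function and(a: Value, b: Value): Value {
  return isFalsy(a) ? a : b;
}

function or(a: Value, b: Value): Value {
  return isFalsy(a) ? b : a;
}

// chr@: a single Number is treated as a one-element Array.
function chr(codes: number | number[]): string {
  const list = Array.isArray(codes) ? codes : [codes];
  // Integer coercion: NaN becomes 0, fractional parts are truncated.
  const ints = list.map((n) => (Number.isNaN(n) ? 0 : Math.trunc(n)));
  for (const cp of ints) {
    if (cp < 0 || cp > 1114111) {
      throw new RangeError(`invalid codepoint: ${cp}`);
    }
  }
  return String.fromCodePoint(...ints);
}
```

The upper bound 1114111 is 0x10FFFF, the highest Unicode code point, so `chr([72, 105])` gives the String Hi and `chr(1114112)` throws.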

Koishi runtime specific

Function Description
help@ Pop 1 value, which must be either a String or Undefined. If it is the empty String or Undefined, return a brief introduction to the language and some instructions on using the help@ function, as a String. Otherwise, if a help topic exists with the String as the title, return its contents as a String. Otherwise, return a message indicating that the specified help topic was not found, as a String.
helpall@ Print a list of all builtin functions as an image.
pr@ Wait for the user who invoked the interpreter to send another message. Return the message contents in XML as a String, or Undefined if it exceeds Koishi's default prompt timeout.
propt@

(more documentation upcoming...)

Example programs

Programs here are prefixed with ¿ since that's how you usually invoke the interpreter bot on a messaging platform.

Hello, world!
¿`Hello, world!`
Quine
¿(`¿(`.`) `.) `¿(`.`) `.
Cat program
Get random cat image from TheCatAPI
¿(https://api.thecatapi.com/v1/images/search) cat@ ("url":"(.+?)") match@1, outimg@

This utilizes a regular expression and Koishi runtime specific builtin functions.

Repeat the user's next message — arguably the equivalent of a Cat program in the Koishi runtime
¿pr@ [(&lt;) g](<)repl@ [(&gt;) g](>)repl@ [(&amp;) g](&)repl@ .

This unescapes <>& in the input before outputting it. Note that it produces unexpected results when the user's message is or contains something like an image or a platform-specific emoji, because those are represented as XML tags.

Roll a dice (made by Yufang)
¿[
  (000,010,000)
  (001,000,100)
  (100,010,001)
  (101,000,101)
  (101,010,101)
  (101,101,101)
](
  ',split@(
    [(0)g]' repl@
    [(1)g]61496chr@repl@
  )#"│\n│"join@
  "╭───╮\n│"\+
  "│\n╰───╯\n"+
)#
0 6 randint@,
outksq@

This results in one of the following:

Get current datetime
¿
(2>|:&&:&\/flr@:&*-]<)divmod=_
(2>|
  3600000*+ 946684800000- 86400000divmod@&
  146097divmod@ :5+7%1+&
  :59- 36524/ flr@ :0?1?~{+0!}_ 36525divmod@ 1461divmod@
  :59- 365/ flr@ :0?1?~{+0!}_ 366divmod@
  1\ 1{ :31-:0?(-1)?~{!!} \_\1+\ :29-:0?(-1)?~{!!} \_\1+\ :31-:0?(-1)?~{!!} \_\1+\
        :30-:0?(-1)?~{!!} \_\1+\ :31-:0?(-1)?~{!!} \_\1+\ :30-:0?(-1)?~{!!} \_\1+\
        :31-:0?(-1)?~{!!} \_\1+\ :31-:0?(-1)?~{!!} \_\1+\ :30-:0?(-1)?~{!!} \_\1+\
        :31-:0?(-1)?~{!!} \_\1+\ :30-:0?(-1)?~{!!} \_\1+\ :31-:0?(-1)?~{!!} \_\1+\
    0!}_ 1+ &&
  \4*+ \100*+ \400*+ 2000+&
  3600000divmod@ 60000divmod@ 1000divmod@
])datetime=_
time@ 8 datetime@.

This defines two custom functions, divmod@ and datetime@, and then calls datetime@ with the current timestamp and a time zone offset (in this case GMT+8). It returns something like [2024, 9, 13, 5, 6, 55, 38, 92] (the fourth number represents the day of week, where 1 - 7 stand for Monday to Sunday respectively; the last number is the milliseconds), which is then printed.
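
For readers who find the stack shuffling in divmod@ hard to follow, the following TypeScript function is a rough equivalent, based on a hand trace of the code above. The function name and tuple return type are illustrative only. In WhatLang the quotient and remainder are left on the Stack, quotient below remainder.

```typescript
// Rough equivalent of the divmod@ function defined above:
// floored quotient first, then the matching remainder.
function divmod(a: number, b: number): [number, number] {
  const quotient = Math.floor(a / b); // the "/ flr@" step
  const remainder = a - b * quotient; // the "* -" step
  return [quotient, remainder];
}
```

Because the quotient is floored rather than truncated, the remainder takes the sign of the divisor: dividing -7 by 2 gives a quotient of -4 and a remainder of 1.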

External resources