PDAsephtwo
PDAseptwo is an extension of PDAsephone by User:BoundedBeans, made in January 2025.
Storage
PDAsephone normally has two stacks, a stack of characters and a stack of instances of push-down automata. Each push-down automaton has a stack of characters, a current state, which is a character, and a mapping of state/stack/input character triplets to transitions. A transition is composed of a boolean value (whether to pop from the stack), a character to push on the stack, and a state to change to. Notably, the newline character cannot occur on the stack, and is only used to signify an empty stack or not to push anything.
PDAsephtwo has all this, but also a bit more.
- There is now also a queue of push-down automata. There is no queue of characters, but you can store characters inside of push-down automata, or use two stacks.
- Characters are now standardized to be 16-bit Unicode codepoints. This means that there is no handling for characters outside the Basic Multilingual Plane, and surrogate characters are handled separately. Also, every character actually used in PDAsephtwo storage objects has both the main codepoint and an arbitrary-length string of codepoints called the "diacritics". These can be any 16-bit codepoint; they are not restricted to only actual diacritical marks or combining characters, though the ones in the main Combining Diacritical Marks block from U+0300 to U+036F can be typed directly while others need to be constructed from commands.
- There's a map of characters that have main codepoints from
ABCDEFGHIJKLMNOPQRST
and a non-empty diacritic string, to strings of characters. These are used for custom commands, and the strings will be executed when an equal character is encountered in a command string.
All of this is now stored in an "execution environment". Execution environments can now be attached to a push-down automaton instance, replacing its normal storage and allowing custom behavior. When an execution environment is stored in a PDA, the execution environment containing that PDA is referred to as the "parent".
Syntax
Syntax now follows a more rigid structure. Since diacritical marks can be part of a command, there is now a new way that characters are parsed, following this procedure in order or something equivalent:
- The code is read line by line using some input mechanism. Each line is considered separately. At this stage it is considered as a string, not as a sequence of commands.
- This means that
"
commands cannot use a newline as an argument (use_
for that). Technically,"
is a character by itself and its argument is considered a separate command when parsed, but is skipped by the"
command.
- This means that
- Whitespace is trimmed from the start and end of each line.
- If a line is equal to
>>>>[
, then everything up to a matching line equal to>>>>]
is removed). These match like brackets, and can be nested. This is a multiline comment mechanism. - For each line, any instances of the regex
>>>>@[^@]*@
are removed. - For each line, if the line contains the text
>>>>
, then that and everything following it is removed. - Each line is grouped into matches of the regex
[^\u0300-\u036F][\u0300-\u036F]*
. The matches are then converted into character instances with the first character as the main codepoint and the rest as its diacritics. Other character diacritics can be constructed with commands.
Common operations
When this specification says to pop a "terminated decimal", it means to pop a series of Latin decimal digits 0-9, terminated by a semicolon (ignoring diacritics for all of these), and interpret it as a number.
When this specification says to pop a "string", it means to pop a series of .
characters with diacritic strings composed of a first character representing a main character, and the remaining characters (if any) representing a diacritic string, terminated by a ;
character. This is interpreted as a string of characters.
If the stack is popped when empty because of missing arguments, the PDA-stack is repeatedly popped until it either becomes empty, or a PDA with an execution environment is at the top of the stack. If the PDA-stack is empty after this, the program exits. Otherwise, "@!
gets run as an error-catching mechanism, then the program continues from there.
Commands
Lowercase letters from a
to t
with non-empty diacritic strings are reserved for extension commands.
@ |
Pushes a new pda onto the pda stack. It will have an empty stack, no transitions, will be in state '0', and will not have an execution environment. |
% |
Pops u, v, w, x, y, and z off the character stack. Installs the transition on the top pda:
If input u, state v, and stack returns w, pop if x isn't Errors:
|
! |
Run a transition on the top pda, with input of the top character on the stack, popping that character
Errors:
|
^ |
Pop a character off of the stack of the top pda, push it onto the character stack. If the pda's stack is empty, push a newline.
Errors:
|
. |
Print the top character on the character stack to the console |
, |
Input a character, push to the character stack |
" |
Move the instruction pointer 1 forward, and push that character onto the character stack. In other words, push the following character. |
_ |
Push a newline onto the character stack |
v |
Pop a character off of the character stack, push it onto the top pda's stack. If the character is a newline, push a space instead |
: |
Duplicate the top element on the character stack |
; |
Duplicate the top element on the pda stack. Note: If pda's are references in your implementation, this operation should copy the contents. They should not reference the same object; changing one should not change the other
Errors:
|
/ |
Swap the top two characters |
\ |
Swap the top two pdas
Errors:
|
¥
|
Pop a character A that is either + or - , pop a terminated decimal B. If A is + , move the top element on the character stack B places down. If A is - , move the Bth element from the top of the character stack to the top.
Errors:
|
¶
|
Pop a character A that is either + or - , pop a terminated decimal B. If A is + , move the top element on the PDA stack B places down. If A is - , move the Bth element from the top of the PDA stack to the top.
Errors:
|
$ |
Discard the top character |
# |
Discard the top pda
Errors:
|
| |
Weird go-to. This pops the top character off of the stack. If it is a capital letter, move the instruction pointer forward until that character is reached not following a " command
(if it runs into the letter following quotes, it will continue looking).
If it is a lowercase letter, move backwards. You will need to use multiple of these to get around sometimes.
Only the letters |
+ |
Make a reference to the top PDA on the stack, and push it (don't pop the original). References can have the same exact commands applied as normal PDAs, and for all intents and purposes are considered regular PDAs. Any changes made to this should apply the same changes to all things that reference the same PDA.
Errors:
|
u |
Pop a terminated decimal. If it is in the range 0-65535, push the character with that as the Unicode code-point.
Errors:
|
U |
Pop a character A and a character B. Add B to the end of A's diacritic list, deleting any pre-existing diacritics in B. |
? |
Pop a character, run the command corresponding to it. |
Ù (U with diacritic string [U+0300])
|
Pop a character A, push a character with main codepoint . and a diacritical string composed of A's main codepoint followed by A's diacritic string.
|
Ú (U with diacritic string [U+0301])
|
Pop a character A and a character B. Push a new character with A's main codepoint and B's diacritic string. |
Û (U with diacritic string [U+0302])
|
Pop a character A, which should be a . character as described in the Ù (U with diacritic string [U+0300]) command. Converts it into its full form, like undoing the aforementioned command.
Errors:
|
Ũ (U with diacritic string [U+0303])
|
Pop a character A, push A with the first codepoint in the diacritic string removed, push the codepoint that was removed.
Errors:
|
W
|
Pop a PDA, push its state. |
w
|
Pop a PDA and a string. For each transition in the PDA, push u, v, w, x, y, z in the same format as the % command uses (with 1 as the opposite of 0 used by x), and run the string.
Errors:
|
V
|
Pops a PDA, and pops 6 strings, A1, A2, A3, A4, A5, and A6. A1 becomes the function run on PDAs as % , A2 becomes ! , A3 becomes ^ , and A4 becomes v , A5 becomes W , and A6 becomes w .
Errors:
|
¢
|
Pop a PDA, push the 6 strings in the same order (it should not be reversed) as V .
Errors:
|
X
|
Pop and enqueue the top PDA to the queue.
Errors:
|
x
|
Dequeue a PDA from the queue and push it onto the PDA stack.
Errors:
|
Ū (U with diacritic string [U+0304])
|
Pops a character A with main codepoint from ABCDEFGHIJKLMNOPQRST and a non-empty diacritic string, pops a string B. Maps this as a custom command. The characters %!^vWw with empty diacritics can also be used to modify the strings of the current PDA's execution environment.
Errors:
|
U̅ (U with diacritic string [U+0305])
|
Pops a label character. Tells the parent execution environment to weird-goto that label when it gets returned to. |
Ŭ (U with diacritic string [U+0306])
|
Pops a custom command character, pushes the string it's mapped to.
Errors:
|
Θ
|
Pop a PDA, push it to the PDA stack of the parent execution environment.
Errors:
|
θ
|
Pop a PDA from the PDA stack of the parent execution environment, push it to this execution environment.
Errors:
|
ψ
|
Pop a string, execute it in the execution environment of the top PDA.
Errors:
|
Ψ
|
Pop a string, execute it in the parent execution environment.
Errors:
|
Computational class
This is a strict superset of PDAsephone, and PDAsephone is Turing-complete by simulation of Brainfuck minus - (though both PDAsephone and PDAsephtwo can be used much more efficiently outside the proof construction), so PDAsephtwo is Turing-complete.