User:Yayimhere/XeReg
XeReg is an esolang based on regex that has the string operators of Cirt e mys, among other features. It was created specifically to be able to implement Cirt e mys. It is special in that matched strings can be "modified" for other matches later in the string.
Semantics
A program is a XeReg match pattern. The program takes a single input and returns the part of that input string which was matched, unless that match is empty, in which case it returns the program itself.
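The return rule above can be sketched in Python. This is only an illustration: Python's re module stands in for a XeReg matcher (the two pattern languages are not the same), the function name run is made up for this example, and it assumes that "empty" refers to the matched substring.

```python
import re

def run(program, text):
    """Return the matched part of text, or the program itself if the match is empty.

    Assumption: Python regex syntax is used as a stand-in for XeReg syntax.
    """
    match = re.search(program, text)
    matched = match.group(0) if match else ""
    # An empty (or missing) match makes the program return its own source.
    return matched if matched else program
```

Under these assumptions, run("b+", "aabbbcc") gives the matched substring, while run("z+", "abc") gives back the pattern text because nothing was matched.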
Tokens
The following is the list of tokens in XeReg:
.: matches any character, other than the empty string.
(a ~ ...): capture group, with the name a (a single letter). Not matched within the string. It matches independently on the input string and is equal to the substring matched. Cannot be redefined; new names are generated by appending a ' to the name, so for example, (a ~ o)(a ~ oo)(a ~ ooo) -> (a ~ o)(a' ~ oo)(a ~ ooo) -> (a ~ o)(a' ~ oo)(a' ~ ooo) -> (a ~ o)(a' ~ oo)(a'' ~ ooo). The characters within are remembered.
_a: matches capture group a.
a+: 1 or more instances of a that directly follow each other. This, and all other multi-character matches, match until the item after them matches. Matches greedily.
a(x)±: 1 or more instances of a, which may be separated by any characters in x.
a*: like +, but 0 or more.
a(x)@: like a(x)±, but 0 or more.
·: equal to .+
÷: equal to .*
[f ~ x @y]: creates a "function". These are simply shorthands for another regex expression. x is an input, y is the body, and the function is referenced as {f.x}. ø may be used instead of any inputs, which simply makes the call always be replaced with the body y; such functions are called as {f.}. Example: [· ~ ø @.+]. May be recursive.
<: start of string.
$: end of string.
a¯: matches everything not matched by a.
a|b: matches either a or b.
a': matches capture group a, but with its order of tokens reversed.
a^: matches capture group a, but ignoring its first token. Matches the empty string if a has fewer than 2 tokens.
a-: matches the first token of capture group a.
x+y: x and y concatenated.
?x(y|z): if the previous actually matched character was x, it returns y and further evaluation continues; otherwise it returns z.
(space): separates tokens.
\x: escapes x.
/a: matches capture group a, but ignoring the order in which it was defined, and always chooses the match that matches the most of the string (using ASCII ordering).
#: equal to the name of the most recently defined capture group.
≠: immediately stops further matching.
{a - x}: begins a captured match, where it runs the subprogram x with a as its input string. Returns the matched substring.
Evaluation goes from left to right. Every other character matches itself.
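The renaming rule for redefined capture groups can be illustrated with a small Python sketch. It only models the names, not the matching, and the function name rename_groups is hypothetical; it assumes names are resolved from left to right, as in the (a ~ o)(a ~ oo)(a ~ ooo) example.

```python
def rename_groups(names):
    """Make redefined capture group names unique by appending ' to them."""
    seen = set()
    result = []
    for name in names:
        # Keep appending ' until the name has not been used before.
        while name in seen:
            name += "'"
        seen.add(name)
        result.append(name)
    return result

# Three definitions of a become a, a' and a''.
print(" ".join(rename_groups(["a", "a", "a"])))  # prints: a a' a''
```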
Examples
The following programs do not match any strings:
[! ~ ø @.!] !
or:
.*¯
Computational class
XeReg is Turing complete, by compilation from BCT:
For some program p, first these replacements must take place:
0 -> (# ~ #^)
1x -> {# ~ 1 ?1((# ~ #+x)|(#))}
with spaces between each command. The translated program is then appended to the prefix
(a ~ .+) [p ~ ø @{ # -
followed by a space and then the suffix
# ?1|0((# ~ #)|≠) p } ]
The input should be the data string.
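The translation can be sketched as a Python string transformation. The names PREFIX, SUFFIX, translate_commands and compile_bct are made up for this example, and it assumes that the prefix ends with a single space after the - and that x in 1x is a single bit. It only builds the XeReg source text and does not run it.

```python
# Assumption: the prefix ends with one space after the "-".
PREFIX = "(a ~ .+) [p ~ ø @{ # - "
SUFFIX = "# ?1|0((# ~ #)|≠) p } ]"

def translate_commands(bct):
    """Apply the 0 and 1x replacements to a BCT program given as a string of bits."""
    commands = []
    i = 0
    while i < len(bct):
        if bct[i] == "0":
            commands.append("(# ~ #^)")
            i += 1
        elif bct[i] == "1":
            if i + 1 >= len(bct) or bct[i + 1] not in "01":
                raise ValueError("1 must be followed by a bit")
            x = bct[i + 1]
            commands.append("{# ~ 1 ?1((# ~ #+" + x + ")|(#))}")
            i += 2
        else:
            raise ValueError("BCT programs contain only 0 and 1")
    # Commands are separated by spaces.
    return " ".join(commands)

def compile_bct(bct):
    """Wrap the translated commands in the prefix and suffix."""
    return PREFIX + translate_commands(bct) + " " + SUFFIX
```

For example, translate_commands("011") yields the 0 replacement followed by the 1x replacement with x = 1, separated by one space.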