grammar
Paradigm(s) | String-rewriting Paradigm |
---|---|
Designed by | User:Citrons |
Appeared in | 2022 |
Dimensions | one-dimensional |
Computational class | Turing complete |
Reference implementation | Unimplemented |
Influenced by | BNF, Thue |
File extension(s) | .gram , .grm |
grammar is a language created by User:Citrons. a grammar program consists of a list of symbol replacement rules and an initial sequence of symbols. the execution of the program consists of repeatedly applying the rules in order until no replacements can be made, at which point the program ends.
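the execution model can be sketched in Python for the simplest case, where every pattern is a plain list of symbols. this is only an illustration under assumptions: the names apply_rule and run are hypothetical, symbols are modelled as Python strings, and the article does not say whether matching restarts from the first rule after a replacement. here each pass applies every rule in order, replacing non-overlapping matches from left to right, and passes repeat until nothing changes.

```python
def apply_rule(seq, replacement, pattern):
    """Replace every non-overlapping occurrence of `pattern` in `seq`, left to right."""
    out, i, changed = [], 0, False
    n = len(pattern)
    while i < len(seq):
        # n > 0 guards against an empty pattern, which the syntax does not allow
        if n > 0 and seq[i:i + n] == pattern:
            out.extend(replacement)
            i += n
            changed = True
        else:
            out.append(seq[i])
            i += 1
    return out, changed


def run(rules, seq, max_passes=1000):
    """Apply the rules in order, pass after pass, until no rule matches.

    `max_passes` is a safety limit added for this sketch; the language itself
    has no such limit and a program may run forever.
    """
    for _ in range(max_passes):
        any_change = False
        for replacement, pattern in rules:
            seq, changed = apply_rule(seq, replacement, pattern)
            any_change = any_change or changed
        if not any_change:
            return seq
    raise RuntimeError("no fixed point reached within max_passes")


# each rule is (replacement, pattern), mirroring `replacement = pattern`
rules = [(["foo"], ["bar", "baz", "etc"])]
print(run(rules, ["x", "bar", "baz", "etc", "y"]))
```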
syntax
the grammar of grammar can be represented as BNF (with certain obvious definitions omitted) like so:
<opt-whitespace> = " " | "\t" | ""
<opt-newline> = "\n" | <opt-whitespace>
<name-char> = <alpha> | <digit> | "_" | "-"
<name> = <name-char> | <name> <name-char>
<escape-sequence> = "\\" <any-char>
<string-char> = <any-char-except-double-quotes>
<string-chars> = <string-char> <string-chars> | <string-char>
<char-literal> = "'" <string-char>
<string-literal> = "\"" <string-chars> "\""
<symbol> = <name> | <char-literal> | <string-literal>
<replacement-symbol> = <symbol> | "?"
<nonempty-replacement> = <opt-whitespace> <replacement-symbol> <opt-whitespace>
    | <sequence> <replacement-symbol> <opt-whitespace>
<replacement> = <nonempty-replacement> | ""
<repetition-oper> = "*" | "+"
<repetition> = <expr> <repetition-oper> <opt-whitespace>
<expr> = <symbol> <opt-whitespace>
    | "[" <opt-whitespace> <pattern> "]" <opt-whitespace>
    | "{" <opt-whitespace> <pattern> "}" <opt-whitespace>
    | "(" <opt-whitespace> <pattern> ")" <opt-whitespace>
    | <repetition>
<pattern> = <expr> <pattern>
    | <expr> "|" <opt-whitespace> <pattern>
    | <expr>
<rule> = <replacement> "=" <pattern>
<end-rule> = "\n" | ";" | ""
<rules> = <rule> <end-rule> | <rule> <end-rule> <rules>
<start-sequence> = <opt-newline> <symbol> <opt-newline>
    | <start-sequence> <symbol> <opt-newline>
<program> = <rules> "." <start-sequence>
comments begin with the # character; it and all characters following it are ignored until the end of the line.
patterns
the right side of a rule specifies a pattern of symbols for the rule to match. the simplest form of pattern is a list of symbols, e.g.:
foo = bar baz etc
this rule would replace any part of the sequence of symbols wherein the symbols bar, baz, and etc appear consecutively with the symbol foo.
the symbol _ in a pattern will match any symbol.
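a minimal Python sketch of matching a flat pattern with the _ wildcard, assuming symbols are modelled as strings. the function names match_at and find_first are hypothetical and only illustrate the rule described above.

```python
def match_at(seq, i, pattern):
    """Return True if `pattern` matches `seq` starting at index i.

    The symbol "_" in the pattern matches any single symbol.
    """
    if i + len(pattern) > len(seq):
        return False
    return all(p == "_" or p == s for p, s in zip(pattern, seq[i:]))


def find_first(seq, pattern):
    """Return the index of the leftmost match of `pattern` in `seq`, or None."""
    for i in range(len(seq) - len(pattern) + 1):
        if match_at(seq, i, pattern):
            return i
    return None


# `bar _ etc` matches bar, then any one symbol, then etc
print(find_first(["a", "bar", "anything", "etc"], ["bar", "_", "etc"]))
```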
operators
the | operator is used for alternation. it can be read as "or". for example, this rule would replace all instances of the symbol bar OR any sequence of bee followed by apioform with the symbol foo:
foo = bar | bee apioform
the postfix * and + operators allow patterns to match arbitrary repetitions of symbols. + matches one or more instances, whereas * matches even if there are no instances of the expression. this example would match any sequence of foo followed by any number of bar, including 0 bars:
foo = foo bar*
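one way to model these operators is a small matcher over a pattern tree. the following Python sketch is an assumption-laden illustration, not a reference implementation: the tuple encoding ("sym", "seq", "alt", "star", "plus") and the function name ends are hypothetical, and the article does not say whether repetition is greedy, so the function returns every position at which a match could stop.

```python
def ends(pat, seq, i):
    """Return the set of indices where `pat` can stop when matched from index i."""
    kind, arg = pat
    if kind == "sym":
        return {i + 1} if i < len(seq) and seq[i] == arg else set()
    if kind == "seq":
        positions = {i}
        for part in arg:
            positions = {e for p in positions for e in ends(part, seq, p)}
        return positions
    if kind == "alt":
        return set().union(*(ends(option, seq, i) for option in arg))
    if kind in ("star", "plus"):
        reached = set()
        frontier = ends(arg, seq, i)
        while frontier:
            reached |= frontier
            # only keep positions not seen before, so the loop terminates
            frontier = {e for p in frontier for e in ends(arg, seq, p)} - reached
        # "*" also accepts zero repetitions, i.e. stopping where it started
        return reached | {i} if kind == "star" else reached
    raise ValueError(f"unknown pattern kind: {kind}")


# foo = bar | bee apioform
alternation = ("alt", [("sym", "bar"),
                       ("seq", [("sym", "bee"), ("sym", "apioform")])])
# foo = foo bar*
repetition = ("seq", [("sym", "foo"), ("star", ("sym", "bar"))])

print(ends(alternation, ["bee", "apioform"], 0))
print(sorted(ends(repetition, ["foo", "bar", "bar", "x"], 0)))
```

taking the largest returned index would correspond to greedy matching, which is one possible choice an implementation could make.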
grouping
parentheses can be used to enclose an expression to change precedence. for instance:
foo = (bar | bee) apioform
foo = bar | bee apioform
these are two different rules. the former only matches a sequence if it ends with apioform, whereas the latter matches bar on its own and only requires apioform when the first symbol matched is bee.
square brackets enclose an optional expression. if the expression within square brackets does not match, it does not prevent the pattern as a whole from matching.
captures
the pattern of a rule may contain expressions enclosed in curly brackets. these expressions "capture" the sequence of symbols they match. a question mark included in the replacement on the left side of the rule will expand to the captured symbols. the nth question mark in the replacement expands to the (n % number of captures)th capture of the pattern. for example:
bee ? utterly ? = utterly apioformic {bees | apioforms | you | everyone} {char+}
this will translate the sequence of symbols utterly apioformic bees "abcdef" to bee bees utterly "abcdef".
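the expansion of question marks can be sketched in Python as follows. this assumes that question marks and captures are both counted from 0, which the article does not state, and that a question mark in a rule with no captures is an error, which is also an assumption. the function name expand is hypothetical, and literal symbols are written here as strings such as "'a".

```python
def expand(replacement, captures):
    """Expand each "?" in `replacement` to a captured run of symbols.

    The nth "?" (counted from 0) uses capture number n % len(captures),
    so extra question marks wrap around to the first capture again.
    """
    out, n = [], 0
    for sym in replacement:
        if sym == "?":
            if not captures:
                # assumed behaviour: the article does not cover this case
                raise ValueError('"?" used but the pattern has no captures')
            out.extend(captures[n % len(captures)])
            n += 1
        else:
            out.append(sym)
    return out


# bee ? utterly ? = utterly apioformic {bees | ...} {char+}
captures = [["bees"], ["'a", "'b", "'c", "'d", "'e", "'f"]]
print(expand(["bee", "?", "utterly", "?"], captures))
```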
literal symbols
literal symbols represent single ASCII characters. literal symbols can be specified in the program using character literals ('c), or an entire sequence of literal symbols can be specified as a string literal ("Hello, world!"). an empty string ("") is an error.
there are three special symbols for manipulating literal symbols. these symbols will not behave normally in a program and will instead do special magic.
- char, when used in a pattern, will match any literal symbol. using it in a replacement is an error.
- stdin, when used in a replacement, will read a character from standard input and place it as a literal symbol in its position. standard input is read once per rule containing stdin, so that if the pattern has multiple matches, or if a replacement contains multiple instances of stdin, each resolves to the same character. if the standard input stream has reached EOF, then the matches are instead replaced with the symbol eof. using stdin in a pattern is an error.
- stdout, when used in a replacement, will traverse the symbols of each match in order. for each symbol, if it is a literal symbol, it will write it to standard output; otherwise, it does nothing. stdout does not appear in the sequence. using stdout in a pattern is an error.
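the behaviour of stdin and stdout described above could be modelled roughly like this in Python. everything here is a sketch under assumptions: literal symbols are represented as ("lit", character) tuples and ordinary symbols as strings, and the helper names lit, read_stdin_symbol, resolve_stdin and write_match are hypothetical. the streams are passed in explicitly so the helpers can be tried with in-memory text.

```python
import io


def lit(ch):
    """Build a literal symbol for a single character."""
    return ("lit", ch)


def read_stdin_symbol(stream):
    """Read one character; return a literal symbol, or the symbol eof at end of input."""
    ch = stream.read(1)
    return lit(ch) if ch else "eof"


def resolve_stdin(replacement, stream):
    """Read input at most once and substitute it for every stdin in `replacement`.

    The caller would reuse the returned list for every match of the rule, so
    all matches see the same character.
    """
    if "stdin" not in replacement:
        return list(replacement)
    sym = read_stdin_symbol(stream)
    return [sym if s == "stdin" else s for s in replacement]


def write_match(match, stream):
    """Write the literal symbols of a matched run to `stream`, skipping other symbols."""
    for sym in match:
        if isinstance(sym, tuple) and sym[0] == "lit":
            stream.write(sym[1])


out = io.StringIO()
write_match([lit("H"), "foo", lit("i")], out)
print(out.getvalue())
```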
example programs
hello, world
stdout = char+
."Hello, world!"
cat program
stdin = char+
stdout = char+
.'x
truth machine
stdin = start
stdout = '0
stdout '1 = '1
.start
Bitwise Cyclic Tag
program ? 0 data ? = program 0 {(0|1)*} data (0|1) {(0|1)*}
program ? 1 0 data 1 ? 0 = program 1 0 {(0|1)*} data 1 {(0|1)*}
program ? 1 1 data 1 ? 1 = program 1 1 {(0|1)*} data 1 {(0|1)*}
program ? 1 0 data ? = program 1 0 {(0|1)*} data 0 {(0|1)*}
program ? 1 1 data ? = program 1 1 {(0|1)*} data 0 {(0|1)*}
.
program 0 0 1 1 1 data 1 0 1
todo
- make more examples
- implement the language
- potentially improve the language