User:Icecream17/Grammar Explanation

This is based on https://tc39.es/ecma262/#sec-notational-conventions, which is similar to https://orteil.dashnet.org/randomgen/?do=create

{{User:Icecream17/Grammar| grammar text }} when specifying things
{{User:Icecream17/Sans serif| text }} for sans-serif
}} text {{User:Icecream17/Grammar| in the following:

  • tables

Templates

See User:Icecream17/Variable, User:Icecream17/Algorithm, User:Icecream17/Sans serif, User:Icecream17/Grammar.

The following content amazed me because it took me so long to understand it that it felt like genius

From the amazing ECMAScript specification:

5.1.1 Context-Free Grammars

A context-free grammar consists of a number of productions. Each production has an abstract symbol called a nonterminal as its left-hand side, and a sequence of zero or more nonterminal and terminal symbols as its right-hand side. For each grammar, the terminal symbols are drawn from a specified alphabet.

A chain production is a production that has exactly one nonterminal symbol on its right-hand side along with zero or more terminal symbols.

Starting from a sentence consisting of a single distinguished nonterminal, called the goal symbol, a given context-free grammar specifies a language, namely, the (perhaps infinite) set of possible sequences of terminal symbols that can result from repeatedly replacing any nonterminal in the sequence with a right-hand side of a production for which the nonterminal is the left-hand side.

Back to the article

Here a `terminal` is just `literal text`; that's what "drawn from a specified alphabet" means.

A `production` has a `left-hand-side`, which is a `nonterminal`. Let's actually call it the `name` of a production.

A `production` also has a `right-hand-side`, which is zero or more `name`s and `literal text`s.

A `grammar` is a bunch of `productions`. (You don't need to worry about `context-free` because I was too lazy to describe grammars that have context.)

So the last sentence above is now:

Starting from the `name` of a single distinguished `production`, called the goal symbol, a given grammar specifies a language, namely, the (perhaps infinite) set of possible sequences of `literal text` that can result from repeatedly replacing any `name` in the sequence with the right-hand-side of a `production` for which the `name` is the `left-hand-side`.


In this situation, however, a `production` can have multiple `right-hand-side`s, called `alternatives`, and its `name` can be replaced by any one of them.
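As a rough sketch of that replacement process, a grammar can be written as a dictionary from each `name` to its list of `alternatives`. The toy grammar, the names `GRAMMAR`, `derive` and `language`, and the rule "a symbol is a name if it is a key of the dictionary" are all assumptions made up for this illustration; none of them come from the ECMAScript specification.

```python
import random

# Hypothetical toy grammar: each name maps to its alternatives, and each
# alternative is a list of symbols. A symbol that is a key of the
# dictionary is a name; anything else is literal text.
GRAMMAR = {
    "Greeting": [["hello", " ", "Target"]],
    "Target": [["world"], ["there"]],
}


def derive(goal, grammar, rng):
    """Replace the leftmost name until only literal text remains."""
    sequence = [goal]
    while any(symbol in grammar for symbol in sequence):
        index = next(i for i, s in enumerate(sequence) if s in grammar)
        # Pick one alternative of the production with this name.
        alternative = rng.choice(grammar[sequence[index]])
        sequence[index:index + 1] = alternative
    return "".join(sequence)


def language(symbols, grammar):
    """Return every string derivable from a list of symbols.

    Only terminates for grammars without recursion, because a recursive
    grammar can describe infinitely many strings.
    """
    if not symbols:
        return {""}
    first, rest = symbols[0], symbols[1:]
    if first in grammar:
        heads = set()
        for alternative in grammar[first]:
            heads |= language(alternative, grammar)
    else:
        heads = {first}
    tails = language(rest, grammar)
    return {head + tail for head in heads for tail in tails}
```

Here `derive("Greeting", GRAMMAR, random.Random())` produces one sentence, while `language(["Greeting"], GRAMMAR)` collects the whole (finite) language of the toy grammar.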

The explanation

Literal text is bolded, for example: if
You can also replace a character with its name, as long as you surround it in angle brackets: <Space>

Here are some additional literals:

<Empty> an empty string of text
<Any> any character
<Newline> https://tc39.es/ecma262/#prod-LineTerminatorSequence
<EOF> end of file (e.g. no more source code)

A literal can also just be a sans-serif description. For example, you could say: Any capital letter.

Each production has an italic name (in square brackets) and one or more distinct alternatives, plus optional context (in parentheses)

Each alternative is a list of zero or more names and literals

Productions can also have context passed to them, but I don't need it yet so I'm not defining it


Let's define a comment. Python's comments are pretty simple:

[Python comment]

  • # [Remaining text of line]

[Remaining text of line]

  • (Any but not [Endline]) [Remaining text of line]
  • (Any but not [Endline]) followed by [Endline]

Here, [Python comment] is a hash sign (#) followed by [Remaining text of line].

As you can see, [Remaining text of line] has a recursive definition in alternative 1.
This makes [Remaining text of line] become a bunch of (Any but not [Endline]),
where the last one is (Any but not [Endline]) followed by [Endline]
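To see the recursion at work, here is a minimal hand-written matcher for the two productions above. It is only a sketch under some assumptions: the function names are invented, <Newline> is simplified to the single character "\n", and [Endline] is taken to be either that newline or the end of the text. Each function returns the position just after the match, or None if there is no match.

```python
def is_endline(text, pos):
    """[Endline]: a newline character or the end of the text (assumed)."""
    return pos == len(text) or text[pos] == "\n"


def match_remaining_text_of_line(text, pos):
    """Match [Remaining text of line] starting at pos."""
    # Both alternatives begin with (Any but not [Endline]).
    if is_endline(text, pos):
        return None
    after_char = pos + 1
    # Alternative 2: the character is followed by [Endline].
    # The [Endline] itself is only looked at, not consumed.
    if is_endline(text, after_char):
        return after_char
    # Alternative 1: the character, then more [Remaining text of line].
    return match_remaining_text_of_line(text, after_char)


def match_python_comment(text, pos=0):
    """Match [Python comment]: a literal # and then the rest of the line."""
    if pos < len(text) and text[pos] == "#":
        return match_remaining_text_of_line(text, pos + 1)
    return None
```

Note that, read literally, the grammar needs at least one character after the #, so a line containing only "#" does not match. The recursion mirrors the grammar one call per character, so a very long line would hit Python's recursion limit; a loop would avoid that at the cost of looking less like the grammar.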

I'm going to introduce even more stuff

A but not B: matches any A that is not B
A followed by B: matches A as long as it's followed by B
one of: used when you don't feel like making a list
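One possible reading of these three operators, sketched as small matcher-building functions. Everything here is an assumption for illustration: a matcher is a function taking `(text, pos)` and returning the position after the match or None, "A but not B" is interpreted as "A matches here and B does not match at the same position", and the names `any_char`, `endline`, `but_not`, `followed_by` and `one_of` are invented.

```python
def any_char(text, pos):
    """<Any>: exactly one character."""
    return pos + 1 if pos < len(text) else None


def endline(text, pos):
    """[Endline]: a newline (simplified to "\n") or the end of the text."""
    if pos == len(text):
        return pos
    return pos + 1 if text[pos] == "\n" else None


def but_not(a, b):
    """A but not B: A matches at pos and B does not match at pos."""
    def match(text, pos):
        end = a(text, pos)
        if end is None or b(text, pos) is not None:
            return None
        return end
    return match


def followed_by(a, b):
    """A followed by B: A matches and B matches right after it.

    Only the text matched by A is consumed.
    """
    def match(text, pos):
        end = a(text, pos)
        if end is None or b(text, end) is None:
            return None
        return end
    return match


def one_of(*options):
    """one of: any single literal from the given options."""
    def match(text, pos):
        for option in options:
            if text.startswith(option, pos):
                return pos + len(option)
        return None
    return match
```

For example, `but_not(any_char, endline)` corresponds to (Any but not [Endline]), and wrapping that in `followed_by(..., endline)` gives the last alternative of [Remaining text of line].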

[Endline] can be used in all grammars, even ones that don't explain what it is.
Its alternatives are:

  • <Newline>
  • <EOF>

Eventually, with this unnecessary syntax, you'll define every token. Bam, a language.