Questions about the specs
- Why don't you separate the initial string from the parts by a part separator ;, and perhaps distinguish type I parts by a trailing rule break? You only have one initial string per program, not one per part, so that seems more logical. You'd then need some way to disambiguate between parts and the initial string, e.g. make the initial string required, or make an empty last part invalid, or require writing an empty initial string as eg.
- Can you clarify how rules are preprocessed if metacharacters appear in them? Here's how I imagine this works, but please tell me if that's not what you're thinking.
- First, add meta-parentheses around balanced brackets in each of the match and the replacement.
- Second, turn the match and replacement into parse trees using the precedence rules and meta-parentheses.
- Third, preprocess the rule by copying the whole rule for each possible value of the six wildcards .'"_-=, with each kind of wildcard standing for the same string in the entire rule, and adding meta-parentheses around the substituted string.
- Fourth, copy the whole rule for each possible choice made for the !?+*|. This time the choice is made separately for each instance of the operators, and from outside in, so that in (a*b)*, each repeat of a*b can choose different numbers of repetition counts of
- Fifth, resolve each &# operator by dropping the rule if its sides don't match. At this point the two sides of &# only contain concatenations and characters, and they will be compared as strings, not as trees.
- How does the ~ operator work? Do you disambiguate its meanings between the first and second step, then apply it in the fourth step?
- Is the example “a~b~d~e” supposed to say “
- Why don't you add implicit concatenation to the precedence table, between &, to make it easier to understand?
- What does “part” mean in “any parts of the data string which were produced by a replacement in this round cannot themselves be replaced until the next round”? Is it only characters? Or would an empty-string replacement also block a match on a sequence of characters that spans its position, with some characters before it and some after?
– b_jonas 15:09, 24 October 2018 (UTC)
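For concreteness, the third and fourth steps above could be sketched roughly as follows. This is only an illustration under strong assumptions, not the specified algorithm: rules are treated as flat strings (no meta-parentheses, parse trees or precedence), only the * operator applied to a single preceding character is handled, and the names expand_wildcards, expand_stars, alphabet, max_len and max_reps are hypothetical. Because a wildcard can stand for any string, the full expansion is unbounded, so the sketch caps string lengths and repetition counts.

```python
from itertools import product

WILDCARDS = ".'\"_-="  # the six wildcard characters named in the third step


def strings_up_to(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)


def expand_wildcards(match, replacement, alphabet, max_len):
    """Yield one concrete (match, replacement) pair per wildcard assignment.

    Each kind of wildcard stands for the same string throughout the rule.
    `alphabet` and `max_len` are assumptions that keep the output finite.
    """
    used = [w for w in WILDCARDS if w in match or w in replacement]
    candidates = list(strings_up_to(alphabet, max_len))
    for values in product(candidates, repeat=len(used)):
        table = dict(zip(used, values))
        yield ("".join(table.get(c, c) for c in match),
               "".join(table.get(c, c) for c in replacement))


def expand_stars(pattern, max_reps):
    """Yield expansions of `c*` items, choosing a count separately per star.

    Only a star directly after a single character is handled (assumption).
    """
    items = []  # list of (char, starred)
    for c in pattern:
        if c == "*" and items and not items[-1][1]:
            items[-1] = (items[-1][0], True)
        else:
            items.append((c, False))
    choices = [range(max_reps + 1) if starred else (1,) for _, starred in items]
    for counts in product(*choices):
        yield "".join(ch * n for (ch, _), n in zip(items, counts))
```

With alphabet "xy" and max_len 1, the rule with match a. and replacement .b expands to three concrete rules, since the wildcard takes the same value on both sides. In expand_stars, the two stars in a*b* pick their counts independently, which mirrors the "separately for each instance" wording.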
- Huh? The initial string is separated from the parts by a part separator. It's just that it's possible that there won't be one at all. (The method of signifying Type I / Type II distinction is taken from one of the predecessor languages I was working on, although in that language they were the other way round; I swapped it because Type II is likely to be much more common.)
- The algorithm you give would take an infinite length of time, but it's correct in that it should produce the right results if it does run (presumably you'd run it in parallel with other things).
- ~ disambiguates by looking to see if it's next to literal lexemes.
- a~b~d~e doesn't make any sense (d~e cannot resolve to a number unless you're using a number system like hexadecimal), and a!b!d!e is almost certainly what I meant.
- Good idea.
- I didn't think of that (it didn't come up in any of the situations that inspired that rule). Logically speaking, though, it seems consistent for an empty string replacement to block matches until the next round.
- --ais523 17:43, 24 October 2018 (UTC)
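To make the last point concrete, here is a minimal sketch of one reading of the round rule, in which an empty replacement leaves a barrier that matches cannot span until the next round. Everything beyond that reading is an assumption made for illustration: the function name apply_round, the restriction to a single literal rule, and the choice to rescan from the left after every replacement are not taken from the spec.

```python
BARRIER = None  # marks the position of an empty replacement made this round


def apply_round(data, match, replacement):
    """Apply one literal rule repeatedly within a single round (sketch).

    Characters produced this round are flagged as fresh and cannot be
    matched again; an empty replacement leaves a BARRIER that blocks any
    match spanning its position. Returns the string after the round.
    """
    if not match:
        raise ValueError("this sketch requires a non-empty match")
    cells = [(c, False) for c in data]  # (char, fresh) pairs
    m = len(match)
    i = 0
    while i + m <= len(cells):
        window = cells[i:i + m]
        if all(cell is not BARRIER and not cell[1] and cell[0] == ch
               for cell, ch in zip(window, match)):
            cells[i:i + m] = [(c, True) for c in replacement] or [BARRIER]
            i = 0  # assumption: rescan from the start after each replacement
        else:
            i += 1
    # Barriers and fresh flags are dropped at the end of the round.
    return "".join(cell[0] for cell in cells if cell is not BARRIER)
```

Under these assumptions, applying the rule that replaces ab with the empty string to aabb leaves ab after one round, because the remaining a and b sit on opposite sides of the barrier; a second round is needed to reach the empty string.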