L-system

From Esolang
Jump to navigation Jump to search

Lindenmayer systems, or L-systems, are parallel string rewriting systems. The concept was created by Aristid Lindenmayer to model biological growth.

Definition

An L system consists of a set of symbols, a set of rules, and is run against a starting string or axiom. Execution is performed by creating an empty string, stepping through the current string looking for matching symbols to write a symbol's production to the new string, and moving past the symbol. Productions map a symbol to any finite string of the alphabet, including the empty string. Once the current string is exhausted the step is finished and a new one can begin. This style of execution is similar to cellular automata, particularly 1D/elementary.

In tuple notation, an L-system can be denoted :

  • A set of symbols
  • A starting word/axiom
  • A transformation function
    • Made up of productions
      • Each production replacing a symbol to some finite length string, and may use some finite left and/or right context for the mapping
    • Transformation is done in parallel for all symbols at once

This L-system (D0L) models the Fibonacci sequence (observe the string lengths):

Symbols: {a, b}
Rules:
  a -> b
  b -> ab
Axiom: a
Step 0: a
Step 1: b
Step 2: ab
Step 3: bab
Step 4: abbab
Step 5: bababbab
...

Depending on the halting criterion and contextuality, L-systems can have different computational power.

A common halting criterion is that an L-system halts iff the production rules map the current word to itself. This is natural since there is no change that the word can undergo anymore. Words which do not change in a system are called adults, or adult words. The set of all adult words of a given L-system is its adult language.[1]

L-systems can be context-free or contextual. The example above is context-free, but this one is contextual (D1L):[2]

Symbols: {a, b}
Rules:
  b < a -> b
  b -> a
Axiom: baaaa
Step 0: baaaa
Step 1: abaaa
Step 2: aabaa
Step 3: aaaba
Step 4: aaaab
Step 5: aaaaa adult

The second rule is non-contextual, simply replacing b with a unconditionally. The first is contextual, it checks the left symbol of each a. If the left symbol is b, then it operates and replaces the a with b, otherwise it does nothing. We also note here the implicit identity rules. What about the case when a is next to a? Implicitly, if no rule matches for a given symbol in its context, the identity is assumed, mapping the symbol to itself.[2]

L-systems can be characterized by the contextuality of their production rules. Purely noncontextual L-systems like the Fibonacci example are called 0L. L-systems which can operate on 1 neighbor of context like the signal propagation example above are 1L. If both neighbors can be considered in a rule, the system is 2L. Deterministic L-systems can have the prefix D; D0L, D1l, D2L, etc.[2] Both examples are deterministic. An L-system which cannot delete a symbol/replace it with the empty string can be called propagating, with the propagating versions being P0L, P1L, P2L, etc.[1]

L-systems may be naturally nondeterministic, like many other generative formal grammars. If a single word is able to be mapped to multiple different words under a given L-system, the L-system is nondeterministic. Nondeterministic word mapping under an L-system will ultimately be due to nondeterministic symbol mapping rules. A clash between rules producing nondeterminism in an otherwise deterministic system may be called frustration.[3] Execution of a nondeterministic L-system is often performed stochastically, meaning a random possibility is chosen. The probability may be configurable, with every nondeterministic rule in a given clashing set being given a probability such that the sum is 1. Note that by convention if the nondeterminism is due to a clash between contextual and context free rules, the contextual rules win.[2]

Contextual and context free rules can be mixed in a given L-system. Context-free rules can be replaced with contextual ones by duplicating the rule for every possible context. Contextual rules are given higher precedence if a single replacement must be chosen.[2]

Since noncontextual rules may be replaced with contextual ones, we can assign a definite fixed form to the production rules:

  • 0L:
  • 1L: (left context), (right context)
  • 2L:
  • ...

Contextual rules introduce a problem of boundary conditions. These can be handled in a number of ways. Consider a 2L system where every rule is fully contextual, in the form a < b > c -> d. In the middle of the word, getting the context is simple. But at the very start and very end of the word, there is no context to retrieve. This can be solved in multiple ways. Execution can start at the second character, and stop at the second to last, with the very first and very last characters copied over by the implicit identity rules. It could also be that contextual rules at the very end will grab a boundary symbol instead of nothing. These rules are equivalent and very similar semantically. Another possibility is that the string "wraps around" to the other side, so a contextual rule at the very end will grab the very first symbol for its right context.

Computational class

The adult languages of 0L and 2L systems relate closely with the grammar hierarchy. If we consider all the adult languages of a given type (that is, every possible halting word of every possible L-system of that type) then it's the case that 0L's adult languages are the context-free languages, propagating 2L models context-sensitive languages, and 2L models recursively enumerable languages. In terms of automata, 0L models nondeterministic pushdown automata, propagating 2L models nondeterministic linear bounded automata, and 2L models nondeterministic Turing machines.[1]

Actually implementing/translating grammars into L-systems can be done like so. For the 0L case it is as simple as having the two symbol sets of our grammar, nonterminals and terminals. Rules taking in nonterminals map to nonterminals, nonterminals with terminals, or terminals. Rules taking in terminals map to themselves. This works in general for any CFG, since the sequential processing order does not have an effect on the rewriting.[1]

For CSG and unrestricted grammars the execution/replacement order matters, as it can cause a chain of information transfer. In order to ensure the word is rewritten left to right, a moving head can be used. The head is implemented by duplicating the symbol sets multiple times and attaching an arrow to all of the symbols in the duplicate sets. There is a set for the normal symbols, one with a right arrow, one with a left arrow, and one with a long left arrow. This head starts at the left, and moves along. Symbols under the head are allowed to rewrite. Once the head reaches the end, it starts to move back, also allowing rewrites. This process repeats until the word consists of only terminals.[1]

It is feasible to construct a similar device in a 1L system, but only if data can somehow propagate in both directions. If the system is able to have rules with left context, and rules with right context, but not both, then the head symbol set can be duplicated to produce even and odd symbols. Even symbols always map to odd symbols and odd symbols always map to even, with terminals still mapping to themselves. Left context operations only operate on even, and right context only operate on odd, or vice versa. This should allow for the left-to-right operation of the rewriting, while only being 1L. Another option is to use a different boundary condition, like wrapping, which should allow for just left or just right context.

Note that this is for the proper adult languages of the systems. If we define a different halting criterion or decoding scheme, the power of the models can be different. For instance, a Turing machine can be implemented in propagating 2L, but it is unable to shrink the tape for its answer. If we allow a special symbol which is removed from the start, end, or both, of the final adult word, then propagating 2L is Turing complete.

L-systems may be further restricted resulting in differing class distinctions. Restrictions, like disallowing nonterminals (symbols which don't appear in adult words), can make L-systems less powerful.[4]

Examples

Python implementation (D0L)

This Python implementation can be used for any deterministic context free L-system.

def step_0L(word, productions):
    new = ""
    for s in word:
        new += productions.get(s, s)
    return new

# example: Fibonacci sequence
rules = {
    "a": "b",
    "b": "ab",
}

word = "a"
for i in range(6):
    print(i, word)
    word = step_0L(word, rules)

Python implementation (stochastic 0L)

This step function supposes that the productions map to sets of possible outputs. It chooses which to produce randomly with equal weighting.

import random

def step_0L_stochastic(word, productions):
    new = ""
    for s in word:
        r = productions.get(s, s)
        new += random.choice(r)
    return new

# example: (syntactically) valid brainfuck programs
rules = {
    "P": [
        "[P]P", # nesting
        ">P", "<P", "+P", "-P", ".P", ",P", # normal commands
        "", # production head deletion
    ],
}

word = "P" # begin with a single production head
for i in range(100):
    print(i, word)
    n = step_0L_stochastic(word, rules)
    # stop if adult
    if word == n:
        break
    word = n

Python implementation (D2L)

This Python function implements the general deterministic 2L step function. Rules can be in the form 'xay': 'b' for the context x and y around a, '.ay': 'b' for just right context, 'xa.': 'b' for just left context, and '.a.': 'b' or 'a': 'b' for context free. A dual context rule for a situation is preferred, then right context, then left, then noncontextual. The boundary symbol is space.

def step_2L(word, productions, boundary=' ', any='.'):
    new = ""
    for i, s in enumerate(word):
        # get the context
        valid = lambda i: 0 < i and i < len(word) 
        pre = word[i - 1] if valid(i) else boundary
        post = word[min(i + 1, len(word) - 1)] if valid(i + 1) else boundary
        # try to get the rule for the context
        full = productions.get(pre + s + post)
        right = productions.get(any + s + post)
        left = productions.get(pre + s + any)
        free = productions.get(any + s + any) or productions.get(s)
        # prefer the most context, with default identity
        new += (full or right or left or free or s).replace("E", "")
    return new

# example: signal propagation
rules = {
    "ba.": "b",
    "b": "a",
}

word = "baaaa"
for i in range(6):
    print(i, word)
    word = step_2L(word, rules)

See also

References

  1. 1.0 1.1 1.2 1.3 1.4 Walker, A. (1974). Adult languages of L systems and the Chomsky hierarchy. In: Rozenberg, G., Salomaa, A. (eds) L Systems. Lecture Notes in Computer Science, vol 15. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-06867-8_16
  2. 2.0 2.1 2.2 2.3 2.4 Santell, J. 2019, December 9. L-systems. https://jsantell.com/l-systems/
  3. Krivochen, D., Saddy, D. 2018, Sep 27. Towards a classification of Lindenmayer systems. arXiv. arXiv:1809.10542 https://doi.org/10.48550/arXiv.1809.10542
  4. Xie, J. 2024, April. Lindenmayer Systems and Their Complexity: a Bridge Between Computer Science and Developmental Biology. Proceedings of the 39th Annual Conference of the Pennsylvania Association of Computer and Information Science Educators. https://granite.sru.edu/~pacise/pacise2024/PACISE_L_Systems.pdf