*brainfuck

From Esolang
Jump to navigation Jump to search

*brainfuck is a tape-based esoteric programming language designed in september 2012 by User:Koen. It is identical to brainfuck, except that the instructions > and < have been replaced by an extensive use of pointers. The name is read "pointed brainfuck" and is inspired by the operator * from the language C.

Instructions

*brainfuck has six instructions, which are identical to their brainfuck equivalent. However, there is no data pointer, and thus every instruction must have one argument, a cell reference.

Command Description
+ Increment the referenced cell
- Decrement the referenced cell
. Output the character signified by the referenced cell
, Input a character and store it in the referenced cell
[ Jump past the matching ] if the referenced cell is 0
] Jump back to the matching [ if the referenced cell is nonzero

Note that the behaviour of . when the referenced cell is not a valid ASCII code is undefined. The behaviour of , when end of file is reached is also unspecified; see brainfuck#EOF for more details.

Cell references

*brainfuck operates on an array of memory cells, the tape. The tape is unbounded to the right, and every cell can contain an arbitrarily large nonnegative number. Cells are numbered, the leftmost cell being number 0. A program in *brainfuck is a sequence of numbers and instructions. The argument of an instruction is the last number before it. Those nonnegative numbers represents cells to which apply the instructions:

  • The number 0 represents the cell number 0
  • For every n > 0, the number n represents the cell which number is contained in the cell the number n - 1 would have represented.

That is, a number indicates the number of times the cell 0 must be dereferenced. For instance, if the tape is [3, 0, 8, 42, ...], then the number 0 represents cell 0, the number 1 represents cell 3 (because there is a 3 in cell 0), the number 2 represents cell 42 (because there is a 42 in cell 3), and so on.

The argument of an instruction is the last number before it; for example, the code "n++m-.," is strictly equivalent to "n+n+m-m.m," (with n and m being numbers). The instruction ] is an exception: its argument is the same as its matching [. This can also be seen as the instruction ] having no argument: it causes an unconditional jump to the matching [, which in turn decides whether to continue in the loop or jump past the ]. Note that in the case of loops, what "the last number before" [ or ] means is implementation-dependant; for instance the program 0+[1] might be equivalent to 0+[] on certain implementations (where "the last number before" means "the nearest number to the left of the instruction"), in which case the argument for [ 0, and cell 0 holds a 1, thus it is an infinite loop; or it can halt after one iteration on others (where "the last number before" means "the last number read by the instruction pointer"), because the argument for [ becomes 1, which here refers to cell 1, which contains a 0.

To aggravate the confusion, numbers must be written in binary, using the symbols > for 0 and < for 1.

Computational class

Any program in brainfuck can trivially be "translated" into an equivalent *brainfuck program, using cell 0 as the only data pointer; to do so, start the program with >+ so that cell 0 initially points to cell 1, then translate instructions as follow:

brainfuck *brainfuck
> >+
< >-
+ <+
- <-
. <.
, <,
[ <[
] ]

Note that this can be optimized by omitting redundant > and < in the *brainfuck code: only one occurrence of > is needed before each sequence of translated ><, and only one occurrence of < is needed before each sequence of translated +-.,[.

However, such a vulgar translation is barely a mean of proving that *brainfuck is equivalent to brainfuck, and should never be seen in practice, as using only the first cell as a data pointer is a sign of misunderstanding of the unlimited potential provided by the *brainfuck programming language, in which every cell is a pointer!

Implementation

The following is an interpreter for *brainfuck, written in Ocaml. The tape is bounded (but the number of cells can be modified at the top of the source code), and cells contain Ocaml integers - numbers that are negative or bigger than the tape's size will produce an error when used as pointers, however. This interpreter considers that end of file does not affect the cell's content, though this can be changed easily, and that "the last number before" an instruction is always "the last number read by the instruction pointer" - if the program begins with an instruction instead of a number, the argument is considered to be 0 by default. It has been tested on the cat program >,[.[-]>,].

let tape_size = 10000

let tape = Array.create tape_size 0

let rec deref n =
  if n = 0 then 0 else tape.(deref (pred n))

let p = Sys.argv.(1)
let length = String.length p

exception Unmatched_beginloop
exception Unmatched_endloop

let jump =
  let rec jump_ acc i =
    if i = length then raise (Unmatched_beginloop)
    else match (p.[i], acc) with
      | ('[', _) -> jump_ (succ acc) (succ i)
      | (']', 0) -> succ i
      | (']', _) -> jump_ (pred acc) (succ i)
      | _ -> jump_ acc (succ i)
  in
  function i -> jump_ 0 (succ i)

let dejump =
  let rec jump_ acc i =
    if i < 0 then raise (Unmatched_endloop)
    else match (p.[i], acc) with
      | (']', _) -> jump_ (succ acc) (pred i)
      | ('[', 0) -> i
      | ('[', _) -> jump_ (pred acc) (pred i)
      | _ -> jump_ acc (pred i)
  in
  function i -> jump_ 0 (pred i)

let rec process i n t =
  if i < length then
    match (p.[i], t) with
    | ('>', true) -> process (succ i) (2 * n) true
    | ('>', false) -> process (succ i) 0 true
    | ('<', true) -> process (succ i) (2 * n + 1) true
    | ('<', false) -> process (succ i) 1 true
    | ('+', _) ->
      let k = deref n in
      tape.(k) <- tape.(k) + 1;
      process (succ i) n false
    | ('-', _) ->
      let k = deref n in
      tape.(k) <- tape.(k) - 1;
      process (succ i) n false
    | ('.', _) ->
      print_char (char_of_int tape.(deref n)); flush stdout;
      process (succ i) n false
    | (',', _) ->
      (try tape.(deref n) <- int_of_char (input_char stdin) with
       | End_of_file -> ());
      process (succ i) n false
    | ('[', _) ->
      if tape.(deref n) = 0 then process (jump i) n false
      else process (succ i) n false
    | (']', _) -> process (dejump i) n false
    | _ -> process (succ i) n false;;

process 0 0 false;;

To go further

In *brainfuck, every cell is identified by a nonnegative number, and every cell contains a nonnegative number; thus any cell is a pointer to another cell. But to dereference those pointers, *brainfuck uses nonnegative numbers... By that logic, we could imagine a new language, where every cell is a dereference operator!