Talk:Emmental

From Esolang
Jump to navigation Jump to search

I've written an interpreter in Ocaml, which seems to be working properly, or at least be consistent with the understanding I have of Emmental. I've only tested it on:

  • echo 'Hello, World!' | ./emmental ',,,,,,,,,,,,,,..............' (pseudo reverse cat - haven't bothered to try custom loops yet),
  • echo 'Hello, World!' | ./emmental ';#44#94#44!;#118#46#46!,,,,,,,,,,,,,,..............' (pseudo cat - haven't bothered to discard trash from stack),
  • #59#35#55#56#46#!;##1!;##2!;##3!;##4!;##5!;##6!;##7!#59#35#56#57#46#8!,#77-~? (indeed outputs Y or N according to input),
  • ;#35#52#56#63#48!0 (unfortunately raises Stack_overflow very quickly).

My personal opinion: Emmental's main feature comes from !, yet using ! results in hard to read programs because of the whole ASCII thing. I really think a quote or quote-mode instruction would benefit the language, and I can see a third way to look at it (third after the two at [1] Design Decisions). Consider:

The symbols could be taken from any finite alphabet, but in Emmental, to keep things relatively simple, we'll just use the ASCII character set. (Actually, to be precise, this is the full 8-bit ASCII character set. Non-printable control characters are allowed, as are characters between 128 and 255, and each has a distinct meaning. But their representations are not defined.)

Well, my suggestion is quite simple: the two-char sequence 'a, where a is any (printable) ASCII character, is in fact an escape sequence representing the character (a + 128); and the initial definition of characters 128-255 is "push matching 0-127 symbol onto stack".

(Note however that I do like the idea of alternative MCIs...)

I also have a question: where does the idea of v and ^ being so asymmetrical come from? Since you were so kind as providing a duplicate instruction, wouldn't it be more convenient for ^ to pop the stack?

Last question: apparently the haskell interpreter on your site is copyrighted. Would it still be ok for me to make my interpreter available somewhere?

--Koen (talk) 14:25, 17 October 2012 (UTC)

OK, I've started looking into making loops. After several unsuccessful attempts that made my head hurt, I found a surprisingly easy solution. It is likely that the way I understand ? and ! is not the right one. Here's a cat program (actually an infinite loop):
echo 'Hello, World!' | ./emmental ';#44#46#35#52#50#63#42!*'
Hello, World!
Fatal error: exception End_of_file
In a more readable way: ; ', '. ''* '? '* ! * (where 'x means "push x" and ''x means "push the code that pushes x"). Basically it defines * to "take one char from input and output it, then make a recursive call". Note the difference between '* alone, which would refer to a previous (and here nonexistent) definition of *, and ''* '?, which is recursive (and that's the part where my understanding is probably not what was meant). --Koen (talk) 16:32, 17 October 2012 (UTC)
While I'm at it: about using ~ for conditionals, you say "The most severe constraint is that there be 9 available symbols to act as "destinations" for our "branch".". I disagree - by using ~ once, you get a value between 0 and 7 if the original number was not 0, and 8 if it was 0. By adding 248 to that value, you get 256 if the original number was 0 (and log 256 is 8), and a value in the range 248-255 if nonzero (and their log is always 7). Thus with the combination ~#248+~ (or ~#8-~, whatever) you only need two symbols per branching (and don't need to rewrite seven times the code for nonzero values). --Koen (talk) 18:27, 17 October 2012 (UTC)
That is indeed a cat program; I tested it on the reference interpreter. I'll add it to the examples section. It should not be difficult to concoct a truth-machine either, if anyone wants to try that. Chris Pressey (talk) 10:25, 19 October 2012 (UTC)
I suddenly remembered that the whole point of your parity program was to avoid using the base-2 log, and that made me feel stupid about my congruences with 3 (though the nested function definitions and resulting 27-char long sequence to delay a push was kind of fun and twisted). I have an idea for the same program *without* log2, using a loop to redefine *all* symbols (overwriting all initial instructions in the process). I'm busy at the moment and will probably implement that next week; I'm pretty sure it would be working, though it's possible that I'd need to reserve one symbol for the execute instruction (in which case, the program would only work for an input, say, in the range 0-254). I think this can be solved with a clever play on the early/mate binding thing but I'm not sure. Here is a quick sketch for the program:
  • define chars 0, 1 and 2 to mean "push the digit 0/1/2 on the stack then output it"
  • define char 3 to mean "subtract 3 from top stack element, duplicate it, execute the copy"
  • define char 4 to mean "pop a symbol and define it to be a synonym of char 3"
  • define char 255 to mean "define char 255 to be a synonym of char 3, then take user input, duplicate it, and execute the copy"
  • use a loop to push chars 4-254 on the stack (I think it's mandatory that char 4 be on bottom)
  • apply instruction 4 to all chars on the stack (still have to find a clever way to emulate the loops' "test condition")
  • 255
Note that the two loops involved are sort of "for" loops, so they probably can be replaced by "define char 6 to mean 'char 4 char 4 char 4 char 4 char 4 char 4 char 4 char 4 char 4 char 4', define char 5 to mean 'char 6 char 6 char 6 char 6 char 6', define char 4 to mean 'char 5 char 5 char 5 char 5 char5'" or something of the like to iterate char 4 250 times (and pushing chars on the stack can be done by repeating "duplicate top element and increment it").
Of course you could argue that erasing the whole instruction set just to emulate one branching may not be of much use in bigger programs... But remember that if you don't need to consider *all* symbols in your branching, then you can temporarily save your instruction set in the remaining symbols. It may even be possible (and fun!) to perform an instruction-set compression by keeping some instructions to push symbols on the stack, and one instruction to branch on that symbol and perform the corresponding instruction - hopefully that branching would not be too destructive. --Koen (talk) 00:59, 21 October 2012 (UTC)
This is so ridiculous. I had written all of the above in Emmental... except I can't use char 255 at the end of the program to trigger input and start the final 'loop'... because char 255 IS NOT PRINTABLE. --Koen (talk) 20:13, 21 October 2012 (UTC)
It's not quite next week yet, and it's still buggy, but here's the proof that "branching" can be done without the discrete log:
Readable version Emmental code
; '#'4'8 '. #!
; '#'4'9 '. #1!
; '#'5'0 '. #2!

; '#'3'-':'? #3!
; '^':'-'#'5'9'+'#'3'v'! #4!
;
  '#'5'9 '#'3 '#'2'5'5'!
  ',':'?
#255!

; #4#4#4#4#4#4#4#4#4#4 #5!
; #5#5#5#5#5 #5!
; #5#5#5#5#5#4 #4!

#4
; ':'#'1'+ 'd!
; 'd'd'd'd'd'd'd'd'd'd 'd!
; 'd'd'd'd'd 'd!
; 'd'd'd'd'd 'd!
; 'd #4 #255 'd!
d
; #35#52#56#46 #!
; #35#52#57#46 #1!
; #35#53#48#46 #2!

; #35#51#45#58#63 #3!
; #94#58#45  #35#53#57 #43#35#51 #118#33  #4!
;
  #35#53#57 #35#51 #35#50#53#53#33
  #44#58#63
#255!

; #4#4#4#4#4#4#4#4#4#4 #5!
; #5#5#5#5#5 #5!
; #5#5#5#5#5#4 #4!

#4
; #58#35#49#43 #100!
; #100#100#100#100#100#100#100#100#100#100 #100!
; #100#100#100#100#100 #100!
; #100#100#100#100#100 #4#255 #100!
d
Apparently the wikitable is buggy as well... --User:Koen 78.245.243.132 21:40, 21 October 2012 (UTC)
Oh, wait. The program is in fact working properly. Note however that the massive symbols redefinition affects whitespace as well; for that reason it is very important for the d at the end to be the very last character in the program. --Koen (talk) 22:05, 21 October 2012 (UTC)
I think our edits collided and I accidentally deleted your above addition. After some undo'ing, I'll repeat myself:
My Mascarpone interpreter is copyrighted, yes, but it's a BSD license, so you can copy it, modify it, sell it, etc.
Just because my interpreter is copyrighted doesn't mean someone else can't implement the language too and release their implementation under whatever terms they want. (Unless their interpreter was directly derived from mine, e.g., if you just blindly translated the Haskell to OCaml... in that case I think it would be a derivative work, and thus under the same BSD license.)
For ideas on the language itself: have you seen Mascarpone? It doesn't have an article on this wiki, possibly because I didn't consider it esoteric at the time (but how could I not! look at that syntax!) so that was probably an oversight...
Also I doubt I will be able to remember clearly why v and ^ are asymmetrical. But if it does come back to me, I'll let you know.
And, heck, I can probably just put Emmental into the public domain if it makes things easier. Chris Pressey (talk) 08:51, 18 October 2012 (UTC)
OK, the Emmental distribution is now in the public domain: https://github.com/catseye/Emmental Chris Pressey (talk) 12:14, 18 October 2012 (UTC)
No, I did not take a look at your implementation before having finished my interpreter (and even then, I don't really speak Haskell at all, so that "look" was not very effective). I hadn't heard of Mascarpone (how could that not be esoteric?), but I'm looking at it right now. I already have a few questions... --Koen (talk) 13:05, 18 October 2012 (UTC)

Computational class

However, if it turns out that we need "duplicate" or "discard" in order to write routines that can handle a variable-sized queue — and that strikes me as likely — then it looks like we have a severe problem.

Emmental's queue may be somewhat hard to use - so is Qdeql's, and other people have already bothered proving it TC. So I wonder if we could convert Qdeql programs into Emmental. In fact I think I almost have a solution, unfortunately it needs to affect distinct symbols to nested loops, and since the number of symbols is finite, that particular conversion would not allow for more than a certain level of nesting (well there are 256 symbols available in total, so in practice it should be easy, but theoretically I'm not sure a finite number is enough). --Koen (talk) 13:19, 18 October 2012 (UTC)

Well, ais523 just explained to me why arbitrary deep nested loops were usually not required for Turing-completeness - this still has to be proven in the particular case of Qdeql, though... But thinking about it again, I realize it's probably not necessary at all to use distinct symbols for nested loops. Please correct me if I'm wrong.
; ': '- '+ '$ !          (discard)
; 'v '^ '$ 'N !          (qdeql next)
; 'v '#'1 '- '^ '$ 'D !  (qdeql dec)
; ', '^ '$ 'I !          (qdecl input)
; 'v '. 'O !             (qdecl output)
; 'v ': '~ '#'8'- '~ '#'9'0'+ '? 'B !       (qdecl begin)

; '(BODY OF LOOP) 'X !
; '^ ':'-'^'^ '+ 'X ''B '? 'a !    (enqueue it again followed by two zero bytes, X, then begin again)
(reserve b as no-op)
For readability purposes, I adopted the convention that 'A means #65, and ''A means '#'6'5, and (comment) is a comment.
So here is how it works. The program begins with definitions for N, D, I, O, B, so that we can use them later in the program. NDIO are quite simple - they do what they're supposed to. Then we have loops, which are slightly more complicated. What B does: it dequeues one element, duplicates it, destructively tests whether the copy is zero or not with ~#8-~ #90+ - if the copy was zero, then the top element is now 98, otherwise it's 97. 98 is ascii code for b, so b must always be reserved as a no-op. 97 is a, so a must be defined to enqueue the previously dequeued element again, followed by two zeroes; then the body of the loop is executed (X), and then begin (B) is called once again for the next iteration.
This means that N, D, I, O, B can be defined at the beginning of the program (it actually doesn't matter, you could also use their definitions manually every time, but that wouldn't be very Emmentalic), but a cannot, it must be redefined after every definition of the loop body. The reason is that it uses directly the instruction X to execute the loop body, rather than pushing the symbol X onto the stack, then executing it. The reason is that executing the symbol X with ? would use the *current* definition for X, which after the first iteration might have been overwritten by a nested loop. Executing the instruction X directly, on the other hand, uses its definition from the time when a was defined, which was before the loop even started. Note that B does use ? to execute the symbol a, which means B can be defined once and for all at the beginning of the program.
A Qdeql program can then be translated into an Emmental program as follow: start by defining N, D, I, O, and B as above; then translate the Qdeql code as
Qdeql Emmental
=
N
-
D
&
I
*
O
\ body of loop /
; '(body of loop) 'X !
; '^ ':'-'^'^ '+ 'X ''B '? 'a !
B

where '(body of loop) means "the sequence of symbols that push the sequence of symbols that constitute the body of loop".

Also the reason why I put that here rather than directly on the Emmental page is because I haven't tested it yet, and also wanted a second opinion on it first.
--Koen (talk) 00:04, 19 October 2012 (UTC)
Ah, well. Forgot to account for "If the queue is empty, any instruction that would normally use a value from the queue instead uses 0.". I don't think there's any way of finding whether the queue is empty is Emmental. Just to be able to test the conversion, here is the hello world program [2] converted to Emmental using this method, with additional definitions for E (decrement ten times), F (decrement one hundred times), and _ (enqueue a zero). I have tested it and it works properly. That's just a hello world program, though, so the loop in it is probably not such a good example.
;#58#45#43#36!
;#118#94#36#78!
;#118#35#49#45#94#36#68!
;#44#94#36#73!
;#118#46#79!
;#118#58#126#35#56#45#126#35#57#48#43#63#66!
;#68#68#68#68#68#68#68#68#68#68#69!
;#69#69#69#69#69#69#69#69#69#69#70!
;#35#94#95!

_FEEEEEDD O
_FEEEEEDDDDD O
_D
;#78#88!
;#94#58#45#94#94#43#88#35#54#54#63#97!
B
D
FFEEEEEEEEEDDDD OO
_FEEEEDDDDD O
_FFEEDDDD O
_FEEEDDDDDDD O
_FEEEEDDDDD O
_FEEEEDD O
_FEEEEDDDDDDDD O
_FEEEEEDDDDDD O
_FFEEEEDDDDDD O

--Koen (talk) 01:48, 19 October 2012 (UTC)

Looks like you've made distinct progress (I'm not entirely well equipped to follow it in great detail.) Qdecl is not the easiest language itself to have been shown TC, but, it does look like the problem there with the queue and the problem here with the queue are pretty similar. Does Qdecl's TC proof actually rely on the "If the queue is empty, any instruction that would normally use a value from the queue instead uses 0." property? It looks like it might, but it's really hard to tell. If it doesn't, you wouldn't have to bother simulating that property in Emmental. Chris Pressey (talk) 10:15, 19 October 2012 (UTC)
You need that property to get something on the queue to start with, so in some sense it is necessary. However all you need to translate is some way to get the initial 0 + 7n*255 pattern built which the rest uses. (The Haskell program has a general function for making a Qdeql setup for any length of those, starting with an empty queue—of course the first command in that must use the property.) The brainfuck translation certainly doesn't empty the queue again after that. --Ørjan (talk) 12:22, 19 October 2012 (UTC)
As I recall, the Qdeql translation of a fixed cell brainfuck program won't use more than the nesting depth of the original brainfuck program + a constant. Which I guess transfers the question to fixed cell brainfuck - my 3-cell construction certainly nests deeply, although there will at the end still be some limit on what is needed to implement a given universal turing machine. For something like 5 or 6 cells, I expect you can use a different control flow which uses cell flags instead of deeply nested branching, and so doesn't need more than a small constant nesting depth. --Ørjan (talk) 12:22, 19 October 2012 (UTC)

Talk:Mascarpone

Moved to Talk:Mascarpone. Chris Pressey (talk) 15:40, 9 December 2012 (UTC)