From Esolang

Is this intended as a collaboration that anyone can contribute to, rather than something only you are working on? That's what the works in progress page is for. —ehird 17:46, 6 May 2011 (UTC)

Yes, anyone can contribute if you have ideas. --Zzo38 19:35, 6 May 2011 (UTC)

There are a few differences between dc implementations, enough that it's possible to write a dc program that detects which implementation is being used. This seems to be based on GNU dc, but there's also NetBSD/OpenBSD dc, classical dc (Heirloom Project, commercial Unix, OpenSolaris, Plan 9, and UCB BSD), and dc.sed (included with super-sed). They have different extensions (for example, Y for classical dc, a and r for GNU dc, and J and N for NetBSD/OpenBSD dc), and they also differ in the behavior of other commands like x (some execute a number by simply pushing it back, others try to treat the internal representation of the number as a string, and others print an error message) and P (some use base-100, some use base-256, and some print strings literally but use decimal for numbers).
GNU dc is strongly typed ("dc: non-numeric value", "dc: eval called with non-string argument", "dc: garbage in value being duplicated"), classical dc lets you use any operation on any data type and tries to interpret its internal encoding as that type (ASCII for strings, an array of pointers for arrays, and base-100 for numbers), and dc.sed treats non-numeric strings as 0 in arithmetic operations. In dc.sed and GNU dc, Z on a string replaces it with its length in characters, but in classical dc, the length of a string is obtained by interpreting it as a number and counting its digits.
GNU dc also limits input bases ("dc: input base must be a number between 2 and 16 (inclusive)") and uses bignum output bases (though not negative or unary), while classical dc does the opposite, limiting output bases ("output base is too large") and using bignum input bases (including negative and unary, but only _0-9A-F. can be used for input; for example, 11 in base-60 is 61 decimal). Some of them allow negative bases (either input, output, or both), and dc.sed also allows fractional bases.
In dc.sed, arrays and register stacks are the same thing, but in the others, an array is a single object that can be pushed onto the stack and stored into a different register (so 4 5:aLas.4;ap prints 4 in dc.sed, 0 in classical dc, and "dc: stack register 'a' (0141) is garbage" followed by an abort() in GNU dc, which I think is a bug). NetBSD/OpenBSD dc also has a -x command-line option to use 2-character register names instead of 1-character ones, and its bc has an actual symbol table that maps variable names to 16-bit register codes. I just thought this information would be useful because all four dc implementations are open-source and GPL/BSD/freely-licensed, so if different people use different ones as the basis for their TEdcAL, they should know that the implementations are incompatible in these ways.
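
To make the input-base example above concrete (11 in base-60 being 61 decimal), here is a minimal Python sketch of positional digit interpretation with an unrestricted integer base. It only models the arithmetic; it is not taken from any dc implementation, the name parse_digits is made up for illustration, the fractional point is left out, and the way a negative base is handled is an assumption.

```python
# Model of positional input with an arbitrary integer base, in the spirit of
# the classical dc behavior described above. Illustration only.
DIGITS = "0123456789ABCDEF"  # the only digit characters available for input


def parse_digits(text: str, base: int) -> int:
    """Interpret `text` as digits in `base`; a leading '_' means negative."""
    negative = text.startswith("_")
    if negative:
        text = text[1:]
    value = 0
    for ch in text:
        # DIGITS.index raises ValueError for any character outside 0-9A-F.
        value = value * base + DIGITS.index(ch)
    return -value if negative else value


print(parse_digits("11", 60))   # 1*60 + 1 = 61
print(parse_digits("F", 60))    # 15, the largest digit that can be typed
print(parse_digits("11", -2))   # assumed negative-base rule: 1*(-2) + 1 = -1
```

Since only sixteen digit characters exist, a base-60 number with a digit above 15 simply cannot be typed directly, which is the practical limit hinted at above.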

As far as ideas go, I think a substring command of the form (string start length -- substring) and a requirement that Z returns the length of a string in characters would be nice. I guess using H and arithmetic would let you write a substring function (but since different implementations have incompatible P data formats, you'd need to document the particular one you want for TEdcAL), and combined with string +, any string operation is possible. I actually wrote a combinatory logic implementation for a modified dc.sed with these two new commands (in dc.sed, P stores the partial line in a special buffer, since sed can't print a partial line):

g	push entire P buffer onto stack and clear P buffer (push empty string if empty)
G	push first character from P buffer onto stack and cut it from the P buffer (do not alter stack if empty)

These, combined with string =, are enough to perform any possible string operation. By using the P buffer as a queue for string operations, I put functions in various registers and used x as the apply operation, and made an almost-Unlambda 1.0 interpreter (everything but c and d). For example, the B-combinator S(KS)K is lklslkxlsxx, which when evaluated returns a lambda which performs the function of that combinator. I'll definitely port it over to TEdcAL when you make an interpreter for it. Also, have you checked out the OpenBSD dc extensions? Ian 21:51, 24 May 2011 (UTC)
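
A rough Python model of the P-buffer-as-queue idea may help anyone following along. It is a sketch under assumptions, not the modified dc.sed itself: the class PBuffer and the helpers take and substring are hypothetical names, and the model only mirrors the descriptions of g and G given above, plus string concatenation, to build the (string start length -- substring) operation.

```python
# Toy model of the two proposed commands. All names here are hypothetical.
class PBuffer:
    def __init__(self):
        self.buffer = ""   # partial line accumulated by P
        self.stack = []    # main stack; the top is the end of the list

    def P(self):
        """Pop a string and append it to the partial-line buffer."""
        self.buffer += self.stack.pop()

    def g(self):
        """Push the whole buffer (empty string if empty) and clear it."""
        self.stack.append(self.buffer)
        self.buffer = ""

    def G(self):
        """Push the first buffered character and cut it; no-op if empty."""
        if self.buffer:
            self.stack.append(self.buffer[0])
            self.buffer = self.buffer[1:]


def take(m: PBuffer) -> str:
    """Run G and return the character it pushed, or '' if it pushed nothing."""
    depth = len(m.stack)   # a dc program would compare stack depths with z
    m.G()
    return m.stack.pop() if len(m.stack) > depth else ""


def substring(text: str, start: int, length: int) -> str:
    """(string start length -- substring) using only P, g, G and string +."""
    m = PBuffer()
    m.stack.append(text)
    m.P()                          # queue the string in the buffer
    for _ in range(start):         # discard the first `start` characters
        take(m)
    result = ""
    for _ in range(length):        # collect up to `length` characters
        result = result + take(m)  # string +
    m.g()                          # empty the buffer and drop the leftover
    m.stack.pop()
    return result


print(substring("unlambda", 2, 4))  # lamb
```

Because G leaves the stack alone when the buffer is empty, the model detects end-of-string by checking whether the stack depth changed, and out-of-range requests just yield a shorter (possibly empty) result.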