ppencode


ppencode is

  1. a subset of Perl that restricts source code to Perl keywords only, defined by Yoshino TAKESAKO in 2005,
  2. a Perl program, written by Yoshino TAKESAKO in 2005, that generates a ppencode program printing a given string (or, equivalently, the encoding method that represents a text as a ppencode program), and
  3. a translator that converts an arbitrary Perl program into a ppencode program, designed and published by Shinjiro Hamaji in 2015 and also known as ppencode 2.

Origin

Can be seen here: https://www.perlmonks.org/index.pl?node_id=290607

Definition of Perl keywords

For ppencode, TAKESAKO defined a Perl keyword to be any syntax keyword, operator, or builtin function whose name consists only of lowercase letters. This gives the following 38+182=220 valid ppencode tokens:

lt gt le ge eq ne cmp not and or xor if else elsif while for foreach continue goto last local map
my next redo require return use tr y s m q qq qr qw qx x

abs accept alarm bind binmode bless caller chdir chmod chomp chop chown chr chroot
close closedir connect cos crypt dbmclose dbmopen defined delete die do dump each eof eval
exec exists exit exp fcntl fileno flock fork formline getc getlogin getpeername getpgrp getppid
getpriority getpwnam getgrnam gethostbyname getnetbyname getprotobyname getpwuid
getgrgid getservbyname gethostbyaddr getnetbyaddr getprotobynumber getservbyport
getpwent getgrent gethostent getnetent getprotoent getservent setpwent setgrent sethostent
setnetent setprotoent setservent endpwent endgrent endhostent endnetent endprotoent
endservent getsockname getsockopt glob gmtime grep hex import index int ioctl join keys kill
lc lcfirst length link listen localtime log lstat mkdir msgctl msgget msgrcv msgsnd no oct open
opendir ord pack pipe pop pos print printf push quotemeta rand read readdir readlink recv ref
rename reset reverse rewinddir rindex rmdir scalar seek seekdir select semctl semget semop
send setpgrp setpriority setsockopt shift shmctl shmget shmread shmwrite shutdown sin sleep
socket socketpair sort splice split sprintf sqrt srand stat study substr symlink syscall sysread
system syswrite tell telldir tie time times truncate uc ucfirst umask undef unlink unpack untie
unshift utime values vec wait waitpid wantarray warn write

Note that the builtin function atan2 is not in the list, because its name contains a digit, which is invalid in ppencode.

On the other hand, shinh, the author of ppencode 2, defines 248 keywords. Here are the 28 additional tokens:

atan break default elseif evalbytes fc flags format given lock order our package precision
prototype readline readpipe say size state sub sysopen sysseek tied unless until vector when

However, the following six tokens are not builtins in Perl 5.34.0:

atan flags order precision size vector

Additionally, these eight tokens are not available unless the corresponding features are enabled:

break default evalbytes fc given say state when

Syntax

The syntax is not explicitly defined, but TAKESAKO's ppencode outputs programs in the following syntax:

program = *EMPTY* | "#!/usr/bin/perl -w\n" keywords " "
keywords = keyword | keywords " " keyword 

A program generated by ppencode 2 is represented as follows:

program = keyword | program " " keyword

Both forms have in common that

  1. a ppencode program consists of a single line, and
  2. a ppencode program is represented as a sequence of keywords, delimited by exactly one space character (0x20 in ASCII).

This article follows that syntax by convention.
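
The two rules above can be sketched as a small shape check. This is only an illustration: the helper name looks_like_ppencode is hypothetical, it follows the ppencode 2 grammar (no shebang line, no trailing space), and it tests only that the source is lowercase words separated by single spaces, not that each word belongs to the keyword list.

```perl
use strict;
use warnings;

# Hypothetical helper: checks only the shape "keyword( keyword)*",
# i.e. lowercase ASCII words separated by exactly one space on one line.
# It does NOT check membership in the ppencode keyword list.
sub looks_like_ppencode {
    my ($src) = @_;
    return $src =~ /\A[a-z]+(?: [a-z]+)*\z/ ? 1 : 0;
}

print looks_like_ppencode('print chr ord q tie lt'), "\n";  # 1
print looks_like_ppencode('print  chr'), "\n";              # 0 (two spaces)
print looks_like_ppencode('print atan2'), "\n";             # 0 (contains a digit)
```

A full validator would additionally look each word up in the 220-token (or 248-token) list given earlier.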

Tricks

ppencode

Here is an example of the ppencode script's output for the input "Hi", with line breaks and comments inserted for convenience:

# a dummy statement
# statements are joined by the keyword "and" instead of a semicolon
length q pop and
# prints 'H'
# "q chr lc" equals "hr l" in Perl
# "chr ord" is used to get the first character of a string
print chr ord uc q chr lc and
# prints 'i'
print chr ord q tie lt 
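
To make the quoting trick in this example easier to follow, here is the same computation written as ordinary Perl, one step at a time. The variable names are introduced here for illustration only; the sketch relies on q accepting a letter as its delimiter when whitespace separates the two.

```perl
use strict;
use warnings;

# "q chr lc" is q with delimiter "c", so the quoted text is "hr l".
my $h_src = q chr lc;
# "q tie lt" is q with delimiter "t", so the quoted text is "ie l".
my $i_src = q tie lt;

# uc turns "hr l" into "HR L"; ord takes the first character (72);
# chr turns the code back into a one-character string.
my $upper_h = chr ord uc $h_src;
my $lower_i = chr ord $i_src;

print "$upper_h$lower_i\n";  # Hi
```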

Here is an example output for the input "=":

length q my m and
# since "=" is not a letter, it must be generated by combining several functions
print chr oct oct ord q eq ge 
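
The chain of functions in this example can be traced as plain arithmetic. The sketch below uses illustrative variable names and simply replays each step of "chr oct oct ord q eq ge".

```perl
use strict;
use warnings;

my $quoted = q eq ge;      # q with delimiter "e": the text is "q g"
my $code   = ord $quoted;  # code of "q" is 113
my $first  = oct $code;    # "113" read as octal is 75
my $second = oct $first;   # "75" read as octal is 61
my $char   = chr $second;  # character 61 is "="

print "$code $first $second $char\n";  # 113 75 61 =
```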

Here is an example that outputs "あ" in Shift_JIS encoding (i.e. the bytes 0x82 0xa0):

length q chdir exec and
# ord uc q qr q: 82
# hex 82: 130
# chr 130: "\x82"
print chr hex ord uc q qr q and
# length q q ... q: 160
# chr 160: "\xa0"
print chr length q q setservent symlink eof gethostent rename join getprotobyname time getpwuid waitpid dbmopen getgrgid printf getpwnam getgrgid getsockopt socket rename send ref q 
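
As a rough check of the two byte values, the snippet below recomputes them in ordinary Perl and prints them in hexadecimal instead of emitting the raw bytes. The variable names are illustrative; the long string is copied from the example above.

```perl
use strict;
use warnings;

# First byte: "q qr q" quotes "r " (delimiter "q"); uc gives "R ",
# ord gives 82, and hex reads "82" as hexadecimal, i.e. 130.
my $first_byte = hex ord uc q qr q;

# Second byte: the length of a q string padded with keywords to 160 characters.
my $padding = q q setservent symlink eof gethostent rename join getprotobyname time getpwuid waitpid dbmopen getgrgid printf getpwnam getgrgid getsockopt socket rename send ref q;
my $second_byte = length $padding;

printf "%02x %02x\n", $first_byte, $second_byte;  # 82 a0
```

Any keywords that avoid the letter "q" could serve as padding, since only the length of the quoted text matters.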

ppencode 2

The overall structure of a generated program is:

# generate a string to be evaluated:
# '$_="";eval "vxxx.xxx.xxx.xxx"', where vxxx.xxx.xxx.xxx is version number
# eval it

The key point is that xor is used instead of ; to join statements because, unlike and and or, it always evaluates both operands, from left to right.
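
The difference between the three operators can be observed with a small experiment. This is a sketch with a hypothetical helper, step, that records its own execution; it is not taken from ppencode 2 itself.

```perl
use strict;
use warnings;

my @log;

# Hypothetical helper: records that it ran, then returns the given value.
sub step {
    my ($name, $value) = @_;
    push @log, $name;
    return $value;
}

my $result;
$result = (step('and-left', 0) and step('and-right', 1));  # right side skipped
$result = (step('or-left', 1)  or  step('or-right', 1));   # right side skipped
$result = (step('xor-left', 0) xor step('xor-right', 1));  # both sides run

print join(' ', @log), "\n";  # and-left or-left xor-left xor-right
```

Because and and or short-circuit, a statement joined with them may never run, depending on the value of the statement before it; xor has no such dependency.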

Links

See also