ppencode
ppencode is
- a subset of Perl that restricts source code to Perl keywords only, defined by Yoshino TAKESAKO in 2005,
- a Perl program, written by Yoshino TAKESAKO in 2005, that generates a ppencode program printing a given string (or, equivalently, the encoding method that represents a text as a ppencode program), and
- a translator that converts an arbitrary Perl program into a ppencode program, designed and published by Shinjiro Hamaji in 2015 and also known as ppencode 2.
Origin
The origin can be seen here: https://www.perlmonks.org/index.pl?node_id=290607
Definition of Perl keywords
TAKESAKO defined the Perl keywords for ppencode as the syntax keywords, operators, and builtin functions whose names consist only of lowercase letters. This gives the following 38 + 182 = 220 valid ppencode tokens:
lt gt le ge eq ne cmp not and or xor if else elsif while for foreach continue goto last local map my next redo require return use tr y s m q qq qr qw qx x abs accept alarm bind binmode bless caller chdir chmod chomp chop chown chr chroot close closedir connect cos crypt dbmclose dbmopen defined delete die do dump each eof eval exec exists exit exp fcntl fileno flock fork formline getc getlogin getpeername getpgrp getppid getpriority getpwnam getgrnam gethostbyname getnetbyname getprotobyname getpwuid getgrgid getservbyname gethostbyaddr getnetbyaddr getprotobynumber getservbyport getpwent getgrent gethostent getnetent getprotoent getservent setpwent setgrent sethostent setnetent setprotoent setservent endpwent endgrent endhostent endnetent endprotoent endservent getsockname getsockopt glob gmtime grep hex import index int ioctl join keys kill lc lcfirst length link listen localtime log lstat mkdir msgctl msgget msgrcv msgsnd no oct open opendir ord pack pipe pop pos print printf push quotemeta rand read readdir readlink recv ref rename reset reverse rewinddir rindex rmdir scalar seek seekdir select semctl semget semop send setpgrp setpriority setsockopt shift shmctl shmget shmread shmwrite shutdown sin sleep socket socketpair sort splice split sprintf sqrt srand stat study substr symlink syscall sysread system syswrite tell telldir tie time times truncate uc ucfirst umask undef unlink unpack untie unshift utime values vec wait waitpid wantarray warn write
You may notice that atan2, a builtin function, is not in the list; this is because its name contains a digit, which is invalid in ppencode.
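The lowercase-letters-only rule can be illustrated with a small Python sketch. The helper name `is_ppencode_token_shape` and the candidate list are hypothetical, chosen only for illustration; the check covers the shape of a name, not whether Perl actually defines it.

```python
import re

# Assumption for this sketch: a name qualifies only if every character
# is a lowercase ASCII letter (no digits, underscores, or capitals).
LOWERCASE_ONLY = re.compile(r"[a-z]+")


def is_ppencode_token_shape(name):
    """Return True if `name` consists solely of lowercase ASCII letters."""
    return LOWERCASE_ONLY.fullmatch(name) is not None


# Hypothetical candidates: three names from the list above plus atan2.
candidates = ["print", "chr", "atan2", "getpwnam"]
accepted = [name for name in candidates if is_ppencode_token_shape(name)]
print(accepted)
```

Under this rule atan2 is filtered out because of its trailing digit, while print, chr, and getpwnam pass.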
On the other hand, shinh, the author of ppencode 2, defines 248 keywords. Here are the 28 additional tokens:
atan break default elseif evalbytes fc flags format given lock order our package precision prototype readline readpipe say size state sub sysopen sysseek tied unless until vector when
However, the following six tokens are not builtins in Perl 5.34.0:
atan flags order precision size vector
Additionally, these eight tokens are not available unless the corresponding features are enabled:
break default evalbytes fc given say state when
Syntax
The syntax is not explicitly defined, but TAKESAKO's ppencode outputs programs in the following form:
program = *EMPTY* | "#!/usr/bin/perl -w\n" keywords " "
keywords = keyword | keywords " " keyword
A program generated by ppencode 2 has the following form:
program = keyword | program " " keyword
Both forms have the following in common:
- a ppencode program consists of a single line, and
- a ppencode program is a sequence of keywords delimited by exactly one space character (0x20 in ASCII).
This article therefore follows that syntax.
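The shared shape described above (one line, keywords separated by exactly one space) can be checked mechanically. The following Python sketch is only an illustration: the function name and the small `SAMPLE_KEYWORDS` set are hypothetical, and it models the ppencode 2 grammar (`program = keyword | program " " keyword`), not the shebang line and trailing space of TAKESAKO's output.

```python
def is_single_line_keyword_sequence(program, keywords):
    """Check the `program = keyword | program " " keyword` shape."""
    if program == "" or "\n" in program:
        return False
    # split(" ") keeps empty strings, so a doubled, leading, or trailing
    # space yields an empty token, which is never a keyword.
    return all(token in keywords for token in program.split(" "))


# A deliberately tiny stand-in for the full keyword list.
SAMPLE_KEYWORDS = {"print", "chr", "ord", "q", "tie", "lt", "and", "length", "pop"}

print(is_single_line_keyword_sequence("print chr ord q tie lt", SAMPLE_KEYWORDS))
print(is_single_line_keyword_sequence("print  chr", SAMPLE_KEYWORDS))
```

Using `split(" ")` rather than `split()` is the design choice that enforces "exactly one space": the latter would silently accept runs of whitespace.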
Tricks
ppencode
Here is an example of the ppencode script's output for the input "Hi", with line breaks and comments inserted for readability:
# a dummy
# each function is separated by the keyword "and" instead of a semicolon
length q pop and
# prints 'H'
# "q chr lc" equals "hr l" in Perl
# "chr ord" is used to get the first character of a string
print chr ord uc q chr lc and
# prints 'i'
print chr ord q tie lt
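The quoting trick in this example relies on `q` accepting a letter as its delimiter when whitespace separates the two. The Python sketch below mimics that behaviour for the simple case used here; `q_literal` is a hypothetical helper that assumes a non-bracketing delimiter and handles no escapes, so it is a model of the idea rather than of Perl's full quoting rules.

```python
def q_literal(source):
    """Model Perl's `q <delim>...<delim>` for a single-character delimiter.

    Assumes `source` starts with "q", then spaces, then the delimiter;
    the string runs up to the next occurrence of that delimiter.
    """
    if not source.startswith("q"):
        raise ValueError("expected a q literal")
    rest = source[1:].lstrip(" ")
    delimiter, body = rest[0], rest[1:]
    return body[: body.index(delimiter)]


# "q chr lc": delimiter "c", contents "hr l"
first = chr(ord(q_literal("q chr lc").upper()[0]))   # models chr ord uc
# "q tie lt": delimiter "t", contents "ie l"
second = chr(ord(q_literal("q tie lt")[0]))          # models chr ord
print(first + second)
```

Indexing with `[0]` stands in for Perl's `ord`, which looks only at the first character of its argument.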
Here is an example output for the input "=":
length q my m and
# since "=" is not a letter, it must be generated from several functions
print chr oct oct ord q eq ge
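The arithmetic behind `chr oct oct ord q eq ge` can be traced step by step. This Python sketch assumes the simple case in which Perl's `oct` receives a plain string of digits and reads it as octal; the helper name `oct_of_digits` is hypothetical.

```python
def oct_of_digits(value):
    """Read the decimal digits of `value` as an octal number,
    mirroring Perl's oct() applied to a plain digit string."""
    return int(str(value), 8)


code = ord("q")               # "q eq ge" quotes "q g"; ord takes "q" -> 113
step1 = oct_of_digits(code)   # "113" read as octal -> 75
step2 = oct_of_digits(step1)  # "75" read as octal -> 61
result = chr(step2)           # code point 61 is "="
print(code, step1, step2, result)
```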
Here is an example that outputs "あ" in Shift_JIS encoding (i.e. 0x82 0xa0):
length q chdir exec and
# ord uc q qr q: 82
# hex 82: 130
# chr 130: "\x82"
print chr hex ord uc q qr q and
# length q q ... q: 160
# chr 160: "\xa0"
print chr length q q setservent symlink eof gethostent rename join getprotobyname time getpwuid waitpid dbmopen getgrgid printf getpwnam getgrgid getsockopt socket rename send ref q
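The two byte values in this example can be re-derived in Python as a sanity check. The sketch below is only an illustration: it rebuilds the quoted filler string from the keywords shown above (variable names are hypothetical) and uses Python's `shift_jis` codec to confirm which character the two bytes encode.

```python
# First byte: ord("R") is 82; reading the digits "82" as hexadecimal gives 130.
first_byte = int(str(ord("R")), 16)

# Second byte: the length of the string quoted by `q q ... q`,
# i.e. the keywords below with one space before, between, and after them.
filler_words = (
    "setservent symlink eof gethostent rename join getprotobyname time "
    "getpwuid waitpid dbmopen getgrgid printf getpwnam getgrgid getsockopt "
    "socket rename send ref"
).split(" ")
quoted = " " + " ".join(filler_words) + " "
second_byte = len(quoted)

encoded = bytes([first_byte, second_byte])
print(hex(first_byte), hex(second_byte), encoded.decode("shift_jis"))
```

None of the filler keywords contains the letter q, which is what lets q serve as the delimiter for the whole run.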
ppencode 2
The overall structure of the generated program is:
# generate a string to be evaluated:
# '$_="";eval "vxxx.xxx.xxx.xxx"', where vxxx.xxx.xxx.xxx is a version number
# eval it
The key point is that "xor" is used in place of ";", because unlike "and" and "or", it always evaluates both operands from left to right.
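The difference between short-circuiting operators and one that always evaluates both sides can be shown with a Python analogy. This is not Perl: Python's `^` on booleans stands in for Perl's `xor` here, and the `step` helper and `calls` log are hypothetical names used to record which operands actually ran.

```python
calls = []


def step(name, result):
    """Record that this operand was evaluated, then return its value."""
    calls.append(name)
    return result


calls.clear()
step("a", False) and step("b", True)   # "and" stops after a false left side
and_calls = list(calls)

calls.clear()
step("a", True) or step("b", True)     # "or" stops after a true left side
or_calls = list(calls)

calls.clear()
step("a", False) ^ step("b", True)     # exclusive or needs both operands
xor_calls = list(calls)

print(and_calls, or_calls, xor_calls)
```

Because exclusive or cannot decide its result from one operand alone, both sides always run, which is the property that lets it act as a statement separator.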
Links
- ppencode in JavaScript (from the Wayback Machine; retrieved on 14 November 2005)
- ppencode in Perl (from the Wayback Machine; retrieved on 15 December 2005)
- ppencode 2