# Program Number System

**Program Number System** or PNS is a system designed by Aspwil (talk) to take every useable program in a language and turn it into a unique number, that can then be converted back. This is done by using a dictionary to replace tokens in said language with numbers assigned in a bijective base N number system where N is the total number of tokens. This should be applicable to any language where the structure is defined by a limited number of text-based tokens. Notable exception include Piet (as its image-based), Folders (as it's based on folder directories), and Length (as it's based on line length, not token content) and pretty much any other Non-textual language.

## Example

since Brainfuck is really simple we will use it.
brainfuck has 8 tokens `>`

`<`

`+`

`-`

`.`

`,`

`[`

`]`

we assign each of these tokens a number in a base 8 number system starting from one

> 1 < 2 + 3 - 4 . 5 , 6 [ 7 ] 8

so take the hello world program for brainfuck

`+[-->-[>>+>-----<<]<--<---]>-.>>>+.>>..+++[.>]<<<<.+++.------.<<-.>>>>+.`

and translate each char to its equivalent number giving us

`374414711314444422824424448145111351155333751822225333544444452245111135`

we then take this number that's in bijective base 8 and convert it into another number system we want, for instance, (non-bijective) base 10

36896804202016496244466664004224860862682828222202820606840868406

this number uniquely identifies this and only this program.

## Syntax

I will lay out a simple syntax for defining a system like this:

For a language, you can define a dictionary of tokens as a simple list of space-separated tokens

For brainfuck it would be

`> < + - . , [ ]`

You don't really need to use this syntax, you can use any as long as its a list

the number system lang resulting from using PNS can be written as

`PNS( LANG , DICTIONARY )`

so in this case you would write

`PNS( brainfuck, > < + - . , [ ] )`

the resulting lang is one where the program syntax is an array of numbers but is identical functionally to the original

## Use

This can actually be surprisingly useful for compression.

For instance, if you take the brainfuck program above and change it to bytes (base 256) we can take the program and compress it down to 27 bytes of info:

`59 B0 F0 08 36 DB 28 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00`

That's 207 bytes to 27!

This can work even for large languages because you can compress multi-character tokens into a single number.

For a quick example for Java

while(True){ System.out.println("1"); }

you could use tokens such as `while(`

which would compress compress `while(`

from 6 characters to a single value.

This means that even languages with things like strings that would need a token for each possible char in them can often still be compressed, because of multi character tokens getting shortened.

## Notes

Generally, you should not include comments in the replacements and they should be removed before conversion because comments allow arbitrary bytes, and you would have to include all of those in the dictionary.

another interesting thing:

for any program in the family Trivial brainfuck substitution

given

TrivialBrainfuckSubstitution(a,b,c,d,e,f,g,h)

the lang

PNS( TrivialBrainfuckSubstitution(a,b,c,d,e,f,g,h), a b c d e f g h )

has the same number for programs for any set of a, b, c, d, e, f, g, h.

IE program #36896804202016496244466664004224860862682828222202820606840868406

will print hello world no matter what language from the family you use.

## More Mathematical Definition

given LANG A meets specific requirements. (to be worked out and fine-tuned later)

we can generate a DICTIONARY A.

when combined we can say

PSN( LANG A, DICTIONARY A ) defines LANG B

LANG B is directly correlated to LANG A such that it can run every program possible in LANG A, so it is in the same computational class as LANG A

LANG B's syntax is an array of numbers

since LANG B's syntax is an array of numbers ranging from 1 to N (where N is the length of DICTIONARY A) any program in LANG B can thus be treated as a number in a bijective base N number system.

as such it can be converted from a base N bijective number system to a number system with a different base (IE. 10, 256(bytes), etc)

thus any valid program in LANG A can be represented as a unique number in base 10

## An Actual Dictionary

Here is a fully working (at the time of writing) dictionary for (NDBall)

`( , ) { } | > < 0 1 2 3 4 5 6 7 8 9 + - p P $ % E # K a f q n H s L R Y[ S[ St[ ESt PSt[ ] \n`

Whose truth machine code looks like this

(0)>0 (1)% (2)Y[1,>0,>1] (2,1)P (2,2)<1 (3)P (4)E

Transforms into this in the lang `PNS( NDBall , ( , ) { } | > < 0 1 2 3 4 5 6 7 8 9 + - p P $ % E # K a f q n H s L R Y[ S[ St[ ESt PSt[ ] \n )`

1 9 3 7 9 42 1 10 3 24 42 1 11 3 36 10 2 7 9 2 7 10 41 42 1 11 2 10 3 22 42 1 11 2 11 3 8 1 42 1 12 3 22 42 1 13 3 25

Leading to the bijective base 42 number

` 1 9 3 7 9 42 1 10 3 24 42 1 11 3 36 10 2 7 9 2 7 10 41 42 1 11 2 10 3 22 42 1 11 2 11 3 8 1 42 1 12 3 22 42 1 13 3 25`

This when converted into base 10 (assuming my math is not off) is program number

`23860539308579467379463581539340138670700418635087640886070086229953756524467`

or in bytes

`34 C0 96 2F 9E FB 22 D2 8C 09 E0 B4 67 19 C7 67 D9 C2 8C 8A 4B 74 95 03 A9 D4 25 85 FD BF EF B3`

for a total of 32 bytes from the original 42 byte program

## Interpreter

At the time of writing, this language does not have an interpreter. However, one will be in the works once I have figure out how to do bijective system number conversion for arbitrarily large numbers