Oak

Paradigm(s): reflective, procedural, functional, generic, modular
Designed by: Jordan Dehmel
Appeared in: 2023
Computational class: Turing complete
Major implementations: Original
Influenced by: C, C++, Python, Rust, RegEx
File extension(s): .oak, .od

Overview

Oak is a low-level translated/compiled programming language with compile-time syntax modification: it gives you the tools to change the way the language itself operates. Oak relies on the symbol-transformation language Sapling. It has modern generic and package-management subsystems, and supports the creation of dialects, or syntactically differing branches of the main language. Non-dialectical Oak code is referred to as canonical and uses the '.oak' file extension.

A full guide is available here.

Canonical Oak Syntax

The visual style of Oak takes after Rust (or typed Python), although it internally functions more like C. Variables are declared via the 'let' keyword, and curly-bracket scopes are used as in Rust. Packages and files are included via the 'package!' and 'include!' macros respectively. There is no 'return' keyword, with returns instead being denoted by a lack of semicolon. Return types of functions are denoted via the '->' operator, and the types of variables are noted via the ':' operator.

   /*
   This is a multi-line
   comment
   */
   
   // This is a single-line comment
   
   // Import the std package, which contains print
   // Fn names followed by '!' are macros
   package!("std");
   
   // Declare the main function
   let main() -> i32
   {
       // Print hello world to the terminal
       print("Hello, world!\n");
   
       // Return an exit code of zero- no error
       0
   }

Oak's atomic types are as in Rust ('i' types are signed, 'u' types are unsigned), and are listed with their corresponding C++ types below.

  • i8 - char
  • u8 - unsigned char
  • i16 - short
  • u16 - unsigned short
  • i32 - int
  • u32 - unsigned int
  • i64 - long
  • u64 - unsigned long
  • i128 - (not compiler-assured) long long
  • u128 - (not compiler-assured) unsigned long long
  • f32 - float
  • f64 - double
  • bool - bool
  • str - char *

Pointer types are denoted via ^, which is also the de-reference operator. The address-of operator is @.

Object Oriented Programming

Oak does not have classes, only structs and Rust-style enums. Struct/enum members are comma-separated variables, with types denoted as usual. Struct/enum definitions do not need to be followed by a semicolon.

   let example: struct
   {
       member_a: i32,
       member_b: bool,
       member_c: ^i8,
   }

This is similar for enums. A given enum can hold exactly one of its members at a time, which can be accessed using the match statement.

   // Create an enumeration w/ 3 possible options
   let example: enum
   {
       option_a: i32,
       option_b: bool,
       option_c: ^i8,
   }
   
   // The main function
   let main() -> i32
   {
       // Create an instance of the enum
       let obj: example;
       
       // Wrap it to a specific option
       // (wrap_OPTION_NAME functions are automatically generated for all options)
       obj.wrap_option_a(123);
       
       // Switch control flow depending on the state of the enum
       match (obj)
       {
           case option_a(data)
           {
               // If obj is the option_a option, this area is called.
               // The i32 which option_a refers to is accessible as 'data'
               ;
           }      
           
           case option_b()
           {
               // If obj is the option_b option, this area is called.
               // The bool option_b refers to is NOT accessible, since the
               // parentheses are empty.
               ;
           }
           
           default
           {
               // Any cases which are not covered direct control flow here.
               // No internal data can be captured here, so no parentheses.
               ;
           }      
       }
       
       0
   }
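
For readers more familiar with C++, the behaviour described above (exactly one member held at a time, inspected through match) resembles a tagged union. The following is a rough C++17 sketch of that idea using std::variant. It is an analogy only, not Acorn's actual translation output, and the names Example, wrap_option_a, wrap_option_b and describe are hypothetical.

```cpp
#include <string>
#include <variant>

// Hypothetical C++ stand-in for the Oak enum 'example'.
// Alternatives are positional: 0 = option_a, 1 = option_b, 2 = option_c.
using Example = std::variant<int, bool, char*>;

// Play the role of the generated wrap_OPTION_NAME functions:
// they switch the object to one specific option.
inline void wrap_option_a(Example& obj, int value) { obj.emplace<0>(value); }
inline void wrap_option_b(Example& obj, bool value) { obj.emplace<1>(value); }

// Plays the role of the match statement: branch on the active option.
inline std::string describe(const Example& obj) {
    switch (obj.index()) {
    case 0:
        // Like 'case option_a(data)': the held int is captured and used.
        return "option_a(" + std::to_string(std::get<0>(obj)) + ")";
    case 1:
        // Like 'case option_b()': the held bool is deliberately not captured.
        return "option_b";
    default:
        // Like 'default': every remaining option lands here.
        return "default";
    }
}
```

After wrap_option_a(obj, 123), describe(obj) takes the first branch; wrapping the same object to option_b afterwards replaces the held value, so only one option is ever active.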

Oak does not have inheritance, private members, or internally-declared member functions. Instead, a pre-processor rule (see later) called 'std' maps any functions which take a pointer to an object as their first argument to be used as member functions (methods).

   // Import the std package, for the std rule
   package!("std");
   
   // Use the std rule, allowing methods
   use_rule!("std");
   
   // Struct declaration
   let ex: struct
   {
       data: i32,
   }
   
   // Method function declaration
   let do_thing(self: ^ex) -> i32
   {
       let out: i32;
       out = self.data;
    
       out
   }
   
   // Main function
   let main() -> i32
   {
       // Instantiate
       let obj: ex;
       
       // Legal, but ONLY because we are using the std rule
       // If we didn't use the std rule, this would not work.
       obj.do_thing();
   
       0
   }

In this way, the std rule represents a minor variant of the canonical Oak syntax. Rules can be used to create small quality-of-life improvements like this one, or to create entire dialects: syntactically differing branches which may not even be considered the same language.

Upon instantiation, a variable always has the New function called with the object as the first parameter. When the object falls out of scope, the Del function is called in the same way. If the user does not define these functions when declaring the struct, the compiler generates them.
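
The New/Del pairing is close in spirit to constructors and destructors in C++, the language Acorn translates to. Below is a minimal C++ sketch of the lifetime ordering, offered as an analogy rather than as the code Acorn emits. The Ex struct, the events log and scope_demo are hypothetical names introduced for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical event log so the order of calls can be inspected.
inline std::vector<std::string>& events() {
    static std::vector<std::string> log;
    return log;
}

struct Ex {
    int data;

    // Plays the role of Oak's New: runs when the variable is instantiated.
    Ex() : data(0) { events().push_back("New"); }

    // Plays the role of Oak's Del: runs when the variable leaves scope.
    ~Ex() { events().push_back("Del"); }
};

inline void scope_demo() {
    Ex obj;                       // "New" is recorded here
    events().push_back("body");   // work done while obj is alive
}                                 // "Del" is recorded here, at scope exit
```

Calling scope_demo() once records the construction before the body and the destruction after it.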

Generic Object Oriented Programming

Generics are denoted in the usual way, via '<t>' for a generic 't'. Note that, like all Oak variable and type names, the generic type must be lowercase. Functions, structs, and enums may all be generic, but variables may not be.

   // Generic definition
   let node<t>: struct
   {
       data: t,
   }
   
   let main() -> i32
   {
       // Instantiate a generic
       let instance: node<i32>;
       
       0
   }

The above code would work equivalently for an enum.

You may note that, since methods are not syntactically linked to their objects by default, the compiler will have no idea which generic methods to instantiate when instantiating a generic object. For example, examine the following code.

   let ex<t>: struct
   {
       member: i32,
   }
   
   let method<t>(self: ^ex<t>) -> void
   {
       ;
   }
   
   let main() -> i32
   {
       let instance: ex<i32>;
       method(instance);
       
       0
   }

This code will error, because no instance of 'method' exists, only the generic definition. Thus, to create generic methods for generic structs/enums, you must use the needs block.

   let ex<t>: struct
   {
       member: i32,
   }
   needs
   {
       // Anything in here will be instantiated with the struct
       method<t>(self: ^ex<t>);
   }
   
   let method<t>(self: ^ex<t>) -> void
   {
       ;
   }
   
   let main() -> i32
   {
       let instance: ex<i32>;
       method(instance);
       
       0
   }

Anything inside a needs block will be instantiated along with the generic. This also allows for explicit requirements on any generics: for instance, if you needed the generic to be hashable, you could put the signature of the hash function in the needs block.

Oak Rules and Sapling

The most esoteric part of Oak is how its syntax can be modified at compile-time. This is done via pre-processor rules. Rules take the form of two space-separated series of strings. The first series is the input rule, and the second is the output rule. The input rule defines what pattern the rule matches, with each space-separated string representing a token, referred to as a symbol. Each value can be a wildcard, variable, literal, glob, or some combination of these. The sub-language describing the input and output rules is called Sapling, short for "symbol alteration programming language". It is similar to a regular expression for tokens. A full explanation of how to use Sapling can be found here, in the section about pre-processor rules.

In Oak, a rule is defined via the new_rule! macro. The first argument is the name of the rule, the second is the input rule, and the third is the output rule. A rule, once defined, is not automatically used; it must be enabled via a call to the use_rule! macro, which takes the names of any rules you want to use as arguments. You can also bundle multiple rules together via the bundle_rule! macro, to which the first argument is the bundle name and the following are the rules therein. For instance, consider the following code, which provides method calls in the standard ruleset.

   // Method rules
   new_rule!("argless_method", "$a . $b ( )", "$b ( $a )");
   new_rule!("method", "$a . $b (", "$b ( $a ,");

The $a and $b symbols in the input rules each represent a single-symbol variable: any token matches and is stored under the name $a or $b. The dots and parentheses are literals. The output rule details how the captured symbols should be replaced; in this case, $b becomes the name of a function with $a as its first argument. The second rule works similarly, but leaves the parenthesis open so that further arguments can follow in the method call.
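
To make the matching step concrete, here is a small C++ sketch of token-level rewriting in the style of the two method rules. It is not Acorn's implementation and covers only literals and single-symbol variables (no wildcards or globs). The function names split, is_variable, match_at and apply_rule are hypothetical, and the single left-to-right pass is an assumption about how a rule might be applied.

```cpp
#include <cstddef>
#include <map>
#include <sstream>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;
using Bindings = std::map<std::string, std::string>;

// Split a space-separated string into symbols.
inline Tokens split(const std::string& text) {
    std::istringstream in(text);
    Tokens out;
    for (std::string tok; in >> tok;) out.push_back(tok);
    return out;
}

// A symbol such as "$a" is treated as a single-symbol variable.
inline bool is_variable(const std::string& sym) {
    return sym.size() > 1 && sym[0] == '$';
}

// Try to match 'pattern' against 'input' starting at 'pos'.
// On success, 'vars' holds the token captured by each variable.
inline bool match_at(const Tokens& input, std::size_t pos,
                     const Tokens& pattern, Bindings& vars) {
    if (pos + pattern.size() > input.size()) return false;
    vars.clear();
    for (std::size_t i = 0; i < pattern.size(); ++i) {
        const std::string& sym = pattern[i];
        const std::string& tok = input[pos + i];
        if (is_variable(sym)) {
            vars[sym] = tok;       // a variable captures any one token
        } else if (sym != tok) {
            return false;          // a literal must match exactly
        }
    }
    return true;
}

// Apply one rule across the code in a single left-to-right pass.
inline std::string apply_rule(const std::string& code,
                              const std::string& in_rule,
                              const std::string& out_rule) {
    const Tokens input = split(code);
    const Tokens pattern = split(in_rule);
    const Tokens output = split(out_rule);
    Tokens result;
    Bindings vars;
    for (std::size_t pos = 0; pos < input.size();) {
        if (!pattern.empty() && match_at(input, pos, pattern, vars)) {
            // Emit the output rule, substituting captured symbols.
            for (const std::string& sym : output)
                result.push_back(is_variable(sym) ? vars[sym] : sym);
            pos += pattern.size();
        } else {
            result.push_back(input[pos++]);
        }
    }
    std::string joined;
    for (std::size_t i = 0; i < result.size(); ++i) {
        if (i > 0) joined += ' ';
        joined += result[i];
    }
    return joined;
}
```

With this sketch, apply_rule("obj . do_thing ( ) ;", "$a . $b ( )", "$b ( $a )") yields "do_thing ( obj ) ;", and apply_rule("v . push ( 5 ) ;", "$a . $b (", "$b ( $a ,") yields "push ( v , 5 ) ;".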

Dialects

Dialects are branches of Oak which use pre-processor rules to transform its syntax in a significant way. These can be classified as entirely new languages, or even be implementations of other languages which then translate back to Oak for compilation.

Dialects can be manually loaded via the use_rule! macro, or they can be loaded at the command-line level via the use of .od, or Oak Dialect, files. These are passed to the Acorn compiler, at which point all rules therein become automatically active for all files processed. A more thorough description of the usage of .od files can be found here.

.od files consist of string-surrounded input rules, followed by whitespace, followed by string-surrounded output rules. Since these rules are always on, they do not need to have names. A .od file can also include 'clear' at the beginning to remove any other dialect rules, and 'final' at the end to prevent the addition of any further rules. Both of these keywords are optional. Any line in a .od file which starts with '//' will be ignored as a comment.
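
Going only by the description above, a dialect file that enables method-call syntax might look like the following sketch. It reuses the two method rules from the previous section. The file name methods.od is hypothetical, and placing each input/output pair on a single line is an assumption rather than something confirmed by the Acorn documentation.

```text
clear

// methods.od (hypothetical file name)
// Argument-less method calls: a.b() becomes b(a)
"$a . $b ( )" "$b ( $a )"

// Method calls with arguments: a.b(x) becomes b(a, x)
"$a . $b (" "$b ( $a ,"

final
```

Because these rules are always active once the file is loaded, neither carries a name, unlike the new_rule! versions.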

Macros

Macros are sub-programs which take the form of main functions. They are denoted by a '!' at the end of their function name. They take a number of string arguments, and any code they print is inserted into the calling program. A macro's translation and compilation process is completely separate from the surrounding code, meaning packages and files will have to be re-linked therein. For instance,

   // Include the standard Oak package
   package!("std");
   
   let print_hello_world!(argc: i32, argv: ^^i8) -> i32
   {
       // Re-include this because packages don't carry into macros
       package!("std");
       
       // The printed output of this program will be inserted into the caller as code
       print("print(\"Hello, world!\\n\");\n");
       
       0
   }
   
   let main() -> i32
   {
       print_hello_world!();
       
       0
   }

would become

   // ...
   let main() -> i32
   {
       print("Hello, world!\n");
       
       0
   }

Translation Process

Oak code is first run through any active pre-processor rules. This transforms any syntactical variance back into canonical form. Then, the Acorn translator translates the canonical Oak into C++. This, if indicated, is then compiled and/or linked by clang++.

The Acorn Compiler

The Acorn compiler is available for install here. It is currently only functional on Linux (or WSL). Documentation on its installation and use is available via the README.md file linked there.

Notes

Oak, Acorn and Sapling are all free and open-source software (FOSS), licensed under the GPLv3.
