
Version 38 (modified by Jiri Svoboda, 8 years ago)

NNPS battle plan

Sysel

An effort to design a high-level programming language for writing HelenOS servers and applications.

Note that Sysel syntax is not finalized. Some important language features are missing at the moment (especially visibility control and packaging) so the examples presented will need to change when these are implemented.

Roadmap

Sub-project name | Status | Description
Sysel Bootstrap Interpreter (SBI) | Mostly | Interpreter of Sysel written in C. Runs in HelenOS and POSIX.
Sysel Compiler Toolkit (NNPS) | In progress | Modular compiler of Sysel written in Sysel itself. To produce C and/or LLVM IR.

SBI

SBI is an interpreter of Sysel. It is available stand-alone for POSIX or bundled with HelenOS (only in Bazaar repository, not yet in a stable release). You can run it with the command "sbi source_file.sy". Demos that you can run are available in /src/sysel/demos. Source files comprising the library are in /src/sysel/lib.

You can also run sbi without parameters to enter interactive mode.

SBI still has some missing features, but covers enough of the language to start development of NNPS.

Synopsis of current SBI features

  • Primitive types: bool, char, int, string
  • Compound types: class, multi-dimensional array
  • Other types: delegates, enumerations
  • Object-oriented features: constructors, inheritance, grandfather class, static and non-static method invocation
  • Interfaces
  • Static functions, static member variables, static properties
  • Syntactic sugar: variadic functions, accessor methods (named and indexed properties), autoboxing
  • Arithmetic: big integers, addition, subtraction, multiplication, boolean operators
  • Static type checking (mostly), generic classes (unconstrained), exception handling
  • Bindings: Text file I/O, WriteLine, Exec

Missing SBI features

More important:

  • Access control
  • Method overloading (rejected)
  • Code organization (packages and modules)
  • Explicit overriding (virtual, override)
  • Property overriding

Less important:

  • Division
  • Structs
  • Working with binary data
  • Generic type constraints
  • Operator overloading

Janitorial tasks

  • Add cspan to all error and warning messages.
  • Most run-time errors should have been caught during static checking. These errors need to be reviewed, the effectiveness of static checking verified, and the run-time errors converted to asserts.
  • All errors should be handled gracefully. Calls to exit() must be eliminated.

NNPS

NNPS (Nativní Nástroje pro Překlad Syslu, en: Native Sysel Compilation Toolkit) is a prospective toolkit, written in Sysel itself, that should compile Sysel to binary form. It is currently in a preliminary experimentation/planning stage. The current plan is to implement only a front end that transforms Sysel into a low-level but machine-neutral IR. Most likely the first available output option will be C (used as if it were a machine-independent assembly) and the second LLVM IR. The "native" in NNPS means it is written in Sysel itself (i.e. it should also be self-hosting).

Ideally NNPS should compile natively in POSIX, cross-compile from POSIX to HelenOS and eventually compile natively in HelenOS. The "eventually" is there because an appropriate backend (i.e. a C compiler) needs to be ported to HelenOS before native compilation is feasible.

NNPS will be bootstrapped using SBI. That is, by running SBI(NNPS(NNPS)) we will obtain a binary version of NNPS. This process will presumably require significant computing resources, since SBI is rather slow and consumes a lot of memory. Once compiled to binary form, NNPS should be much more modest in its requirements.

Currently the NNPS lexer and a skeleton parser have been implemented (the parser verifies that the input is syntactically valid, but nothing else). I am now focusing on developing NNPS while simultaneously improving SBI where needed.

NNPS shall process the code in several separate stages. The first few are common with SBI:

Parsing: Lex and parse source files to produce a syntax tree
Ancestry resolution: Determine ancestry of classes and interfaces
Typing: Annotate syntax tree with static types and make all type conversions explicit
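
To illustrate what "make all type conversions explicit" can mean, here is a minimal sketch in Python (not Sysel, and not taken from SBI or NNPS). The node classes `IntLit`, `VarRef`, `Box` and `Assign`, the type names and the functions `type_expr` and `type_assign` are all hypothetical. The sketch assumes that assigning an int to an object-typed variable triggers autoboxing, and shows the typing stage recording that as an explicit tree node.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical syntax-tree nodes; the real SBI/NNPS structures may differ.
@dataclass
class IntLit:
    value: int
    stype: Optional[str] = None   # static type, filled in by the typing stage

@dataclass
class VarRef:
    name: str
    stype: Optional[str] = None

@dataclass
class Box:
    """Explicit conversion node inserted by the typing stage (assumed name)."""
    operand: object
    stype: str = "object"

@dataclass
class Assign:
    target: VarRef
    value: object

def type_expr(expr, env):
    """Annotate an expression with its static type and return it."""
    if isinstance(expr, IntLit):
        expr.stype = "int"
    elif isinstance(expr, VarRef):
        expr.stype = env[expr.name]
    else:
        raise TypeError(f"unsupported node: {expr!r}")
    return expr

def type_assign(stmt, env):
    """Annotate both sides and make any implicit conversion explicit."""
    type_expr(stmt.target, env)
    value = type_expr(stmt.value, env)
    if value.stype == stmt.target.stype:
        return stmt
    if stmt.target.stype == "object" and value.stype == "int":
        # Assumed autoboxing rule: wrap the value so that later stages
        # see the conversion as an ordinary node.
        stmt.value = Box(value)
        return stmt
    raise TypeError(f"cannot convert {value.stype} to {stmt.target.stype}")

env = {"o": "object", "i": "int"}   # declared variable types
boxed = type_assign(Assign(VarRef("o"), IntLit(5)), env)
plain = type_assign(Assign(VarRef("i"), IntLit(7)), env)
```

After typing, `boxed.value` is a `Box` node wrapping the literal, while `plain.value` is left untouched. Later stages never have to rediscover where a conversion happens.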

The remaining stages are specific to NNPS:

Code lowering: A.k.a. code generation. Produce CFG with linear blocks of instructions. Implements/eliminates structured code and OO features.
Data lowering: Implements/eliminates structured data, strings, big integers. We get a CFG again, but with a different instruction set.
Output translation: Conversion to the desired output format (LLVM IR, C). Straightforward.
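
As a rough illustration of code lowering, the following Python sketch turns a structured while loop into a CFG of linear blocks. The tuple-based statement encoding, the `Block` and `Lowerer` names and the textual `jump`/`branch` instructions are assumptions made for this example only. They do not describe the actual NNPS design.

```python
from dataclasses import dataclass, field

@dataclass
class Block:
    """A linear block: straight-line instructions ending in a jump or branch."""
    label: str
    instrs: list = field(default_factory=list)

class Lowerer:
    def __init__(self):
        self.blocks = []
        self.current = self.new_block()   # entry block

    def new_block(self):
        block = Block(f"b{len(self.blocks)}")
        self.blocks.append(block)
        return block

    def emit(self, instr):
        self.current.instrs.append(instr)

    def lower(self, stmt):
        kind = stmt[0]
        if kind == "simple":      # ("simple", text): no control flow
            self.emit(stmt[1])
        elif kind == "seq":       # ("seq", [statements])
            for sub in stmt[1]:
                self.lower(sub)
        elif kind == "while":     # ("while", condition, body)
            head = self.new_block()
            body = self.new_block()
            done = self.new_block()
            self.emit(f"jump {head.label}")
            self.current = head
            self.emit(f"branch {stmt[1]} ? {body.label} : {done.label}")
            self.current = body
            self.lower(stmt[2])
            self.emit(f"jump {head.label}")   # back edge to the loop test
            self.current = done
        else:
            raise ValueError(f"unknown statement kind: {kind}")

program = ("seq", [
    ("simple", "i = 0"),
    ("while", "i < 10", ("simple", "i = i + 1")),
    ("simple", "return"),
])

lowerer = Lowerer()
lowerer.lower(program)
for block in lowerer.blocks:
    print(block.label, block.instrs)
```

The loop disappears as a construct: what remains is four blocks (entry, loop test, loop body, exit) connected only by jumps and one conditional branch.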

From the code lowering phase we obtain a CFG in which the instructions operate on structured data (objects, arrays), but the code is strictly procedural (functions, but no methods and no inheritance). The data lowering phase translates these instructions into another instruction set that is closer to that of an abstract CPU (or LLVM IR). Thus, in this phase we need to implement the objects, arrays, strings and big integers. The output translation should be a simple one-to-one translation.
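
A small Python sketch of the data lowering idea follows. It rewrites instructions that operate on fields and array elements into plain loads, stores and address arithmetic. The instruction names (`getfield`, `setfield`, `getelem`, `load`, `store`), the 8-byte word size and the declaration-order field layout are all assumptions for illustration. The real NNPS instruction sets and object layout are not specified here.

```python
WORD = 8  # assumed word size in bytes

def layout(fields):
    """Assign each field a byte offset in declaration order (assumed layout)."""
    return {name: index * WORD for index, name in enumerate(fields)}

def lower_data(instrs, layouts):
    """Translate structured-data instructions into address-level ones."""
    out = []
    temp = 0   # counter for fresh temporaries
    for ins in instrs:
        op = ins[0]
        if op == "getfield":      # ("getfield", dst, obj, class, field)
            _, dst, obj, cls, fld = ins
            out.append(("load", dst, obj, layouts[cls][fld]))
        elif op == "setfield":    # ("setfield", obj, class, field, src)
            _, obj, cls, fld, src = ins
            out.append(("store", obj, layouts[cls][fld], src))
        elif op == "getelem":     # ("getelem", dst, array, index)
            _, dst, arr, idx = ins
            off, addr = f"%t{temp}", f"%t{temp + 1}"
            temp += 2
            out.append(("mul", off, idx, WORD))    # byte offset of the element
            out.append(("add", addr, arr, off))    # element address
            out.append(("load", dst, addr, 0))
        else:
            out.append(ins)       # already low-level, pass through
    return out

layouts = {"Point": layout(["x", "y"])}
high = [
    ("getfield", "%a", "%p", "Point", "y"),
    ("getelem", "%b", "%arr", "%i"),
]
low = lower_data(high, layouts)
```

The input and output are both instruction lists, which mirrors the description above: the shape of the CFG is unchanged and only the instruction set differs. Bounds checks, strings and big integers are left out of this sketch.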

Interesting reading material: