Advanced Vector Translator: documentation (page 2)


structures

A structure is a composite data type that groups a set of values of different data types. Each value is stored in a field and is accessed through that field’s identifier. A structure can inherit from another structure; in this case, the fields of the parent structure become available in the child. All structures in AVT inherit from the System.Pointer structure (the Pointer structure, declared in the System namespace, which contains no fields). Declarations of structure fields, like those of namespace elements, can begin with the public keyword, which makes them available for use in other namespaces. It is also possible to declare anonymous fields to reserve memory regions. A field can have the type of the structure itself, which allows you to create linked lists. By convention, structure identifiers begin with a capital letter and field identifiers with a lowercase letter. Here are examples of structures:

public struct ListItem /* inherited from System.Pointer */
{
    /* reference to next element of the list */
    ListItem next;
}

public struct Window(ListItem) /* inherited from ListItem */
{
    /* window accessibility and visibility */
    public boolean enabled;
    public boolean visible;
    /* window position on the screen */
    public int left;
    public int top;
    /* window dimensions */
    public int width;
    public int height;
    /* window title */
    public char[] caption;
    /* window events */
    public void(Window /*this*/) hideNotify;
    public void(Window /*this*/) showNotify;
    public void(Window /*this*/, boolean /*active*/) paint;
    public void(Window /*this*/, int /*keyCode*/, int /*charCode*/) keyPressed;
    public void(Window /*this*/, int /*keyCode*/, int /*charCode*/) keyRepeated;
    public void(Window /*this*/, int /*keyCode*/) keyReleased;
    public void(Window /*this*/, int /*x*/, int /*y*/) pointerPressed;
    public void(Window /*this*/, int /*x*/, int /*y*/) pointerDragged;
    public void(Window /*this*/, int /*x*/, int /*y*/) pointerReleased;
    /* reserved */
    public Pointer;
    public Pointer;
    public Pointer;
}

Structures have many applications; among other things, they can be used to imitate object-oriented features. Declaring functions within structures is not possible in AVT, but you can declare fields that refer to functions. Take, for example, the paint field from the example above. Its data type, with the comments omitted, is void(Window, boolean). This means that the field can hold a reference to a function that has no return value (the return type is void) and takes two arguments: the first is a reference to a Window structure and the second is a boolean value.
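For readers who know C, the same idea can be sketched there. This is only an analogy, not AVT code: the names ListItem, Window and paint mirror the example above, while countingPaint, paintCalls and repaint are hypothetical helpers introduced for illustration. Inheritance is imitated by placing the parent structure first, and the paint field holds a function pointer that plays the role of a method.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Parent structure: one link to the next list element. */
struct ListItem {
    struct ListItem *next;
};

/* "Derived" structure: the parent comes first, so a Window* can be
   viewed as a ListItem*, loosely imitating AVT's Window(ListItem). */
struct Window {
    struct ListItem base;
    bool visible;
    /* Counterpart of the AVT field type void(Window, boolean). */
    void (*paint)(struct Window *self, bool active);
};

/* Hypothetical handler; it only counts how often it really painted. */
static int paintCalls = 0;

static void countingPaint(struct Window *self, bool active)
{
    if (self->visible && active) {
        paintCalls++;
    }
}

/* Calls the handler stored in the field, if one has been assigned. */
static void repaint(struct Window *w, bool active)
{
    assert(w != NULL);
    if (w->paint != NULL) {
        w->paint(w, active);
    }
}
```

Assigning a different function to the paint field changes the behaviour of one particular window without touching the others, which is the effect a virtual method gives in an object-oriented language.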

variables

There are two types of variables: global and local. Global variables are declared in namespaces and can be marked with the public keyword. Local variables are declared within functions and are accessible only within those functions. Local variables can be initialized in their declaration.

By convention, variable identifiers begin with a lowercase letter. Unlike some other programming languages, AVT does not allow several variables of the same data type to be declared in one statement, like this:

int i, j, k = 0;

Such code does not compile. Write this instead:

int i;
int j;
int k = 0;

Because of this, the for loop syntax is slightly different. This is discussed in the section «Control flow operators».

exceptions

Important: exceptions and control flow operators for their handling are available for 64-bit programmes only.

Before proceeding to the syntax of declaring exceptions, it should be noted that exceptions are of two kinds: programming language exceptions and operating system exceptions.

Programming language exceptions can be raised by the throw operator and by assembler instructions. Such exceptions are handled with the try…catch operator.

Operating system exceptions can be raised by any programme instruction. In this case, an exception handler is called in the OS kernel, and the OS itself decides what to do with the programme: terminate it or pass control to the user’s exception handler. If the OS supports user exception handlers, any operating system exception can be converted to a programming language exception and handled accordingly. Operating system exceptions include, for example, arithmetic exceptions (division by zero, overflow, etc.), segmentation faults (such as an attempt to read data through a null pointer), an attempt to execute a privileged instruction, and so on.

Handling operating system exceptions is beyond the scope of this documentation, so only programming language exceptions are considered below.

Like structures, AVT exceptions support inheritance. The base exception is System.Throwable. If no parent exception is specified, the parent is System.Exception. By convention, exception identifiers begin with a capital letter. Here are examples of exceptions:

public exception MonitorStateException; /* inherited from System.Exception */
public exception IllegalArgumentException(RuntimeException);
public exception NumberFormatException(IllegalArgumentException);

functions

Functions play the central role in AVT: they contain all the functionality of your programme. Every function has a return type and a list of typed arguments. If a function doesn’t return a value, its return type is void.

Functions can be written either in the high-level language, using AVT expressions and control flow operators, or in assembler instructions. In the latter case, write the assembler or pureassembler keyword before the return type; the function body is then written entirely in flat assembler. The compiler inserts the text of such functions into the target file without any changes or syntax checks. If assembler is specified, the compiler inserts the enter instruction at the beginning of the function and the leave and ret instructions at its end. If pureassembler is specified, nothing is inserted.

If a function is written in the high-level language and has a return value, it must use a return operator to return that value. Leaving the function with a throw operator is also valid.

By convention, function identifiers begin with a lowercase letter. Here are some examples of functions:

public boolean isNaN(real x)
{
        return x != x;
}

public int scalarProductu(ultra a, ultra b)
{
        return (a ****= b)[0] + a[1] + a[2] + a[3];
}

public float scalarProductx(xvector a, xvector b)
{
        return (a ****= b)[0] + a[1] + a[2] + a[3];
}

public assembler float sqrtf(float x)
{
        movss   xmm0, [.x]
        sqrtss  xmm0, xmm0
}

public assembler double sqrtd(double x)
{
        movsd   xmm0, [.x]
        sqrtsd  xmm0, xmm0
}

public assembler real sqrtr(real x)
{
        fld     tbyte[.x]
        fsqrt
}
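The two scalarProduct functions are compact, so a C analogy may help. The sketch below assumes that ****= multiplies the four components of a by those of b and stores the result in a; this reading is inferred from the function names and is not confirmed on this page. The type Vec4 and the helper mulAssign are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for AVT's four-component xvector. */
struct Vec4 {
    float v[4];
};

/* Assumed meaning of "a ****= b": component-wise multiply, result in a. */
static struct Vec4 *mulAssign(struct Vec4 *a, const struct Vec4 *b)
{
    assert(a != NULL && b != NULL);
    for (int i = 0; i < 4; i++) {
        a->v[i] *= b->v[i];
    }
    return a;
}

/* The C arguments are copies, so the caller's vectors stay intact. */
static float scalarProduct(struct Vec4 a, struct Vec4 b)
{
    mulAssign(&a, &b);
    /* Sum the four products to obtain the scalar (dot) product. */
    return a.v[0] + a.v[1] + a.v[2] + a.v[3];
}
```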

Function overloading is not allowed in AVT, so functions within the same namespace must have different identifiers.

interrupts

Interrupts are of interest to developers of operating system kernels and DOS programmes. An interrupt in AVT is a function marked with the interrupt keyword that has the following properties:

  • the return type is always void;
  • the arguments are the general-purpose registers;
  • the return instruction is iret (iretd, iretq).

When control enters the interrupt handler, all general-purpose registers are saved on the stack and their values are accessed through the arguments; when control leaves the handler, the saved values are restored. Modifying an argument changes the value of the corresponding register on exit from the handler.

A note for operating system kernel developers: since the FPU and SSE registers are not saved on the stack, some types of interrupt handlers cannot use them.

Some interrupts supply an error code. In this case, it must be declared among the arguments. The ways of declaring an interrupt handler, depending on the bitness of the programme (16, 32 or 64 bits) and the presence or absence of an error code, are shown here:
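The save, modify and restore cycle can be modelled in plain C. This is only a simulation of the behaviour described above, not the code that AVT generates: the Registers structure, the cpu variable and the functions handler and dispatch are hypothetical, and a real handler ends with iret rather than an ordinary return.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a few general-purpose registers. */
struct Registers {
    uint32_t eax;
    uint32_t ebx;
};

/* Simulated CPU state at the moment the interrupt arrives. */
static struct Registers cpu = { 1u, 2u };

/* The handler sees the saved copies as its arguments. */
static void handler(struct Registers *saved)
{
    saved->eax = 42u;   /* visible to the interrupted code afterwards */
    cpu.ebx = 999u;     /* scratch use of the live register; lost on exit */
}

/* Models entry and exit: save everything, run the handler, restore. */
static void dispatch(void (*h)(struct Registers *))
{
    assert(h != NULL);
    struct Registers saved = cpu;  /* "push" all registers */
    h(&saved);
    cpu = saved;                   /* "pop": the saved copies win */
}
```

After dispatch(handler), the simulated eax holds 42 because the handler changed its argument, while ebx is back to 2 because the live register was overwritten by the restore.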

/* Methods of declaring interrupt handlers in 16-bit programmes */

/* Without error code */
interrupt void <identifier>(char flags, char cs, char ip, short ax, short cx, short dx, short bx, short si, short di, short bp) { <handler body> }

/* With error code */
interrupt void <identifier>(char flags, char cs, char ip, short errorCode, short ax, short cx, short dx, short bx, short si, short di, short bp) { <handler body> }

/* Methods of declaring interrupt handlers in 32-bit programmes */

/* Without error code */
interrupt void <identifier>(int eflags, char cs, int eip, int eax, int ecx, int edx, int ebx, int esi, int edi, int ebp) { <handler body> }

/* With error code */
interrupt void <identifier>(int eflags, char cs, int eip, int errorCode, int eax, int ecx, int edx, int ebx, int esi, int edi, int ebp) { <handler body> }

/* Methods of declaring interrupt handlers in 64-bit programmes */

/* Without error code */
interrupt void <identifier>(char ss, long rsp, long rflags, char cs, long rip, InterruptContext registers) { <handler body> }

/* With error code */
interrupt void <identifier>(char ss, long rsp, long rflags, char cs, long rip, long errorCode, InterruptContext registers) { <handler body> }

The public keyword, if required, is placed before interrupt. The InterruptContext structure (64-bit programmes only) should be declared in the System namespace as follows:

public struct InterruptContext
{
    public long rbp;
    public long r15;
    public long r14;
    public long r13;
    public long r12;
    public long r11;
    public long r10;
    public long r9;
    public long r8;
    public long r7; /* rdi */
    public long r6; /* rsi */
    public long r3; /* rbx */
    public long r2; /* rdx */
    public long r1; /* rcx */
    public long r0; /* rax */
}

You cannot call an interrupt handler directly, like a normal function. However, the handler identifier evaluates to its memory address as a short, int or long value, which can be used for one purpose only: placing it in the interrupt descriptor table (IDT).

initialization and finalization

Initialization and finalization are blocks of code that are executed automatically when the programme starts and ends, respectively. Each namespace can contain at most one initialization and at most one finalization. Syntax:

initialization
{
    <namespace initialization code>
}

finalization
{
    <namespace finalization code>
}

In essence, these are normal functions with a special header, but you cannot call them directly: they are called automatically.

entry point

The programme entry point is mandatory and is the only function that lies outside any namespace. This function is placed at the very beginning of the generated code. Its return value is undefined, so no return type may be specified (not even void). The function has only an identifier, arguments and a body. The body can be written in the high-level language only.

The entry point can be placed in any source file. It is always placed between the import section and the first namespace.

Execution of the programme always starts at the entry point. If you are creating a binary file, you only need to pass control to the first byte of the code (offset zero) to start execution; it will then begin at the entry point.

calling conventions

When writing functions, especially in assembler, you should know the calling convention:

  • all arguments, regardless of data type, are placed on the stack; they are pushed from left to right, so the first argument is at the bottom of the stack and the last argument is at the top; each argument takes a whole number of machine words;
  • the return value, depending on its type, is stored in one of the following registers: ax or eax (boolean, char, byte, short, int, reference types), rax (long, references in 64-bit programmes), st0 (real, st1–st7 should be marked as empty), xmm0 (float, double, ultra, xvector);
  • the called function removes the arguments from the stack using the ret <size of arguments> instruction.
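The rules in this list can be modelled with a small stack simulation in C. It is a sketch under simplifying assumptions, not a description of generated code: every value takes exactly one word, the frame set up by enter is ignored, and the names stack, sp, push, sub_callee and call_sub are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define STACK_WORDS 16

/* Hypothetical model: a downward-growing stack of machine words. */
static intptr_t stack[STACK_WORDS];
static int sp = STACK_WORDS;      /* index of the top word; STACK_WORDS = empty */

static void push(intptr_t word)
{
    assert(sp > 0);
    stack[--sp] = word;
}

/* Callee for a hypothetical sub(a, b): reads its arguments relative to
   the top of the stack, then removes them itself, like "ret <2 words>". */
static intptr_t sub_callee(void)
{
    intptr_t b = stack[sp + 1];   /* last argument: nearest the top */
    intptr_t a = stack[sp + 2];   /* first argument: deeper down */
    sp += 1 + 2;                  /* drop return address and 2 arguments */
    return a - b;                 /* stands in for the result register */
}

/* Caller: pushes the arguments left to right, then the return address. */
static intptr_t call_sub(intptr_t a, intptr_t b)
{
    push(a);
    push(b);
    push(0);                      /* placeholder for the return address */
    return sub_callee();
}
```

Because the callee restores the stack pointer, the caller has nothing to clean up after the call; sp is back at its starting value as soon as call_sub returns.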

In 16-bit programmes, return values of the boolean, char, byte, short and reference types are placed in the ax register. In 32-bit programmes, int is added to these types and the return values are placed in the eax register. If a value is narrower than the ax or eax register, it is extended with a movsx or movzx instruction.
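The difference between the two extension instructions can be illustrated in C. The sketch assumes that signed values are widened the way movsx does it (sign extension) and unsigned values the way movzx does it (zero extension); this page does not say which AVT type uses which instruction. The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Effect of movsx on a 16-bit value: copy the sign bit upwards. */
static uint32_t sign_extend16(uint16_t bits)
{
    uint32_t wide = bits;
    if (bits & 0x8000u) {
        wide |= 0xFFFF0000u;      /* fill the upper half with ones */
    }
    return wide;
}

/* Effect of movzx on a 16-bit value: fill the upper half with zeros. */
static uint32_t zero_extend16(uint16_t bits)
{
    return (uint32_t)bits;
}

/* Small self-check showing how the same bit pattern widens both ways. */
static void check_extension(void)
{
    assert(sign_extend16(0xFFFFu) == 0xFFFFFFFFu);  /* -1 stays -1 */
    assert(zero_extend16(0xFFFFu) == 0x0000FFFFu);  /* 65535 stays 65535 */
    assert(sign_extend16(0x1234u) == zero_extend16(0x1234u));
}
```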
