Skip to content

OAMichael/VM-Course

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

84 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Virtual Machine

Build run and test

Build

$ git clone --recurse-submodules https://github.com/OAMichael/VM-Course.git
$ cd VM-Course
$ cmake -B build -DCMAKE_BUILD_TYPE=Release && cd build
$ cmake --build .

Run main program

$ cmake --build . --target run

Run all tests

$ cmake --build . --target run_all_tests

About the architecture of Virtual Machine

General state

There are main components: registers and memory. This VM uses accumulator based ISA. Therefore, accumulator belongs to global state in addition to previous two components.

Virtual Machine has Interpreter which executes all instructions from bytecode operating registers, memory and accumulator. Interpreter works with current frame. Frame consists of PC (Program Counter), registers (256 per frame now) and pointer to parent frame.

Every time Interpreter goes into function a new frame is created. As long as function execution is going on the frame lives. After function execution has ended the frame is being deleted and execution of parent function continues. Every function is associated with its frame and every frame has its own set of registers. Functions can receive arguments from parent function. They are translated via first N registers. If function has N arguments, registers from x0 to x(N-1) are copied from parent frame to current one. When function returns return value is stored in the accumulator.

There is an ISA of Virtual Machine. One can find full description of every instruction below.

Registers

Registers are just cells of memory. Every register is 64 bits wide and can store objects or pointers to them. Objects are integers, floating point values or strings. Register can be considered as union of all mentioned above. There are 3 (by now) basic types of objects: 64 bit signed integer, 64 bit floating point number with double precision and string.

Memory

Memory state

Memory is large storage for bytecode and objects. Whole memory is divided into four regions:

  • Instructions. This is the place where bytecode of program lives
  • Constant pool. Here are all constants from source code
  • String pool. All strings live here
  • Memory arena. All dynamic objects are allocated here

Every instruction is 4 byte wide. Description of instructions can be found below the page. They are just stored in instructions region. Every program has entrypoint. This is the address of first instruction to be executed. Default entrypoint is 0 but if program contains main function then its address is the entrypoint.

Constants in constant pool have additional information. Every constant has its own basic type. If type is integer or floating, then value is taken straight from nearby. If type is string, then nearby value is pointer to string in string pool.

String pool contains strings. Every entry in the pool has information about length of the string and the string itself.

Arena allocator allocates memory in the arena. It is just simple carriage advancement which allows allocation dynamic objects.

Accumulator

Accumulator is special register which is shared between all frames of a program. It can be considered as a mean of communication between functions as well as frames.

Bytecode generation

One can find examples and tests for pseudo assembly-language in directory asm/tests. To convert *.asm files into binary bytecode one should use Asm2Bin tool. Here is an example:

./Asm2Bin ../../asm/tests/arithmetic.asm

Output:

Successfully translated the program: ../../asm/tests/arithmetic.asm

Instruction Set Architecture (ISA)

Here is full description of all supported instructions and their implementation.

All instructions can be divided into several types: A, I, C, R, B, N, CALL, ALLOCA.

Type A

Type A layout

It is the type of instructions which need only register. r1 is an index of register to be used.

  • LOAD_ACC

    Load value of r1 into accumulator

    accr1

  • STORE_ACC

    Store value of accumulator into r1

    r1acc

  • TO_FLOAT_REG

    Cast integer value of r1 to floating point and store result into accumulator

    acc.f_val(double)r1.i_val

  • TO_INT_REG

    Cast floating value of r1 to integer and store result into accumulator

    acc.i_val(int)r1.f_val

  • ADD

    Add integer value of r1 to integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val + r1.i_val

  • SUB

    Subtract integer value of r1 from integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val - r1.i_val

  • MUL

    Multiply integer value of r1 with integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val * r1.i_val

  • DIV

    Divide integer value of accumulator by integer value of r1 and store result into accumulator

    acc.i_valacc.i_val / r1.i_val

  • AND

    Bitwise AND of integer value of r1 and integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val & r1.i_val

  • OR

    Bitwise OR of integer value of r1 and integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val | r1.i_val

  • XOR

    Bitwise XOR of integer value of r1 and integer value of accumulator and store result into accumulator

    acc.i_valacc.i_val ^ r1.i_val

  • SL

    Shift left integer value of accumulator by integer value of r1 and store result into accumulator

    acc.i_valacc.i_val << r1.i_val

  • SR

    Shift right integer value of accumulator by integer value of r1 and store result into accumulator

    acc.i_valacc.i_val >> r1.i_val

  • ADDF

    Add floating point value of r1 to floating point value of accumulator and store result into accumulator

    acc.f_valacc.f_val + r1.f_val

  • SUBF

    Subtract floating point of r1 from floating point value of accumulator and store result into accumulator

    acc.f_valacc.f_val - r1.f_val

  • MULF

    Multiply floating point value of r1 with floating point value of accumulator and store result into accumulator

    acc.f_valacc.f_val * r1.f_val

  • DIVF

    Divide floating point value of accumulator by floating point value of r1 and store result into accumulator

    acc.f_valacc.f_val / r1.f_val

Type I

Type I layout

It is the type of instructions which need only immediate value. ImmediateIdx is an index of constant in constant pool to be used. Type information is stored in the constant itself. Therefore, these operations support integer, floating point and for some even strings.

  • LOAD_ACCI

    Load immediate value into accumulator (for types integer, floating, string)

    accimm (for integer and floating)

    accimm.p_str (for string)

  • ADDI

    Add immediate value to accumulator and store result into accumulator (for types integer, floating)

    accacc + imm

  • SUBI

    Subtract immediate value from accumulator and store result into accumulator (for types integer, floating)

    accacc - imm

  • MULI

    Multiply immediate value with accumulator and store result into accumulator (for types integer, floating)

    accacc * imm

  • DIVI

    Divide accumulator by immediate value and store result into accumulator (for types integer, floating)

    accacc / imm

  • ANDI

    Bitwise AND of immediate value and integer value of accumulator and store result into accumulator (for type integer)

    acc.i_valacc.i_val & imm

  • ORI

    Bitwise OR of immediate value and integer value of accumulator and store result into accumulator (for type integer)

    acc.i_valacc.i_val | imm

  • XORI

    Bitwise XOR of immediate value and integer value of accumulator and store result into accumulator (for type integer)

    acc.i_valacc.i_val ^ imm

  • SLI

    Shift left integer value of accumulator by immediate value and store result into accumulator (for type integer)

    acc.i_valacc.i_val << imm

  • SRI

    Shift right integer value of accumulator by immediate value and store result into accumulator (for type integer)

    acc.i_valacc.i_val >> imm

  • JMP

    Unconditional jump by given relative offset (for type integer)

    pcpc + imm

Type C

Type C layout

It is the type of instructions which need neither register nor immediate and may possibly work only with accumulator.

  • TO_FLOAT

    Cast integer value of accumulator to floating point and store result into accumulator

    acc.f_val(double)acc.i_val

  • TO_INT

    Cast floating point value of accumulator to integer and store result into accumulator

    acc.i_val(int)acc.f_val

  • NEG

    Negate integer value of accumulator and store result into accumulator

    acc.i_val-acc.i_val

  • NEGF

    Negate floating point value of accumulator and store result into accumulator

    acc.f_val-acc.f_val

  • SIN

    Calculate sine of floating point value of accumulator and store result into accumulator

    acc.f_valsin(acc.f_val)

  • COS

    Calculate cosine of floating point value of accumulator and store result into accumulator

    acc.f_valcos(acc.f_val)

  • SQRT

    Calculate square root of floating point value of accumulator and store result into accumulator

    acc.f_valsqrt(acc.f_val)

  • RET

    Return from function and delete its frame

Type R

Type R layout

It is the type of instructions which need two registers.

  • MV

    Copy value from r2 to r1

    r1r2

Type B

Type B layout

It is the type of instructions which need both register and immediate.

  • MVI

    Copy value of imm to r1

    r1imm

  • LOAD_ACC_MEM

    Load value of memory at position r1 + imm into accumulator

    accmem[r1 + imm]

  • STORE_ACC_MEM

    Store value of accumulator into memory at position r1 + imm

    mem[r1 + imm]acc

  • BEQ

    Branch if values of accumulator and r1 are equal

    pcacc == r1 ? pc + imm : pc

  • BNE

    Branch if values of accumulator and r1 are not equal

    pcacc != r1 ? pc + imm : pc

  • BGE

    Branch if integer value of accumulator greater or equal than value of r1

    pcacc.i_val >= r1.i_val ? pc + imm : pc

  • BGT

    Branch if integer value of accumulator greater than value of r1

    pcacc.i_val > r1.i_val ? pc + imm : pc

  • BLE

    Branch if integer value of accumulator less or equal than value of r1

    pcacc.i_val <= r1.i_val ? pc + imm : pc

  • BLT

    Branch if integer value of accumulator less than value of r1

    pcacc.i_val < r1.i_val ? pc + imm : pc

  • BGEF

    Branch if floating point value of accumulator greater or equal than value of r1

    pcacc.f_val >= r1.f_val ? pc + imm : pc

  • BGTF

    Branch if floating point value of accumulator greater than value of r1

    pcacc.f_val > r1.f_val ? pc + imm : pc

  • BLEF

    Branch if floating point value of accumulator less or equal than value of r1

    pcacc.f_val <= r1.f_val ? pc + imm : pc

  • BLTF

    Branch if floating point value of accumulator less than value of r1

    pcacc.f_val < r1.f_val ? pc + imm : pc

Type N

Type N layout

It is the type of instructions for intrinsics. intrCode is the code of intrinsic. Here is the list of all intrinsics.

  • SCAN

    Read integer value from stdin and store it into accumulator

    acc.i_val(int)stdin

  • PRINT

    Write integer value of accumulator to stdout

    stdoutacc.i_val

  • SCANF

    Read floating point value from stdin and store it into accumulator

    acc.f_val(double)stdin

  • PRINTF

    Write floating point value of accumulator to stdout

    stdoutacc.f_val

  • SCANS

    Read string from stdin, allocate memory for it in string pool and store its address into accumulator

    strstdin

    acc.i_val&str

  • PRINTS

    Write string located in string pool at address stored in accumulator to stdout

    strstrPool[acc.i_val]

    stdoutstr

  • STRLEN

    Calculate length of string located in string pool at address stored in accumulator and store it into accumulator

    strstrPool[acc.i_val]

    acc.i_vallength(str)

  • STRCAT

    Concatenate two strings located in string pool whose addresses stored in x0 and x1 registers and store address of newly allocated string into accumulator

    str0strPool[x0.i_val]

    str1strPool[x1.i_val]

    strstr0 + str1

    acc.i_val&str

  • SUBSTR

    Obtain substring of original string located in string pool at address stored in accumulator by indices stored in x0 and x1 registers and store address of newly allocated string into accumulator

    strstrPool[acc.i_val]

    substrstr[x0.i_val, x1.i_val]

    acc.i_val&substr

Type CALL

Type CALL layout

It is the type of instruction for call a function. numArgs is the number of arguments of the function.

  • CALL

    Call a function

    parentFramecurrFrame

    currFramenew Frame

    currFrame.parentparentFrame

    currFrame.pcparentFrame.pc

    currFrame.regfile[0, numArgs]parentFrame.regfile[0, numArgs]

Type ALLOCA

Type ALLOCA layout

It is the type of instructions for dynamic object allocation at arena. objType is number repressenting basic object type.

  • NEW

    Allocate new object of objType at arena and store its address into accumulator

    accnew objType

  • NEWARRAY

    Allocate new array of objects of objType of size acc.i_val at arena and store its address into accumulator

    accnew objType[acc.i_val]

Frontend for Virtual Machine

High-level languange is very similar to C style, except there are no so much semilocons and one have to explicitly define function by function keyword. Here is quick example:

void function main() {
    prints("Enter a: ")
    float a = scanf()

    prints("Enter b: ")
    float b = scanf()

    prints("Enter c: ")
    float c = scanf()

    float D = b * b - 4.0 * a * c

    if (D < 0.0) {
        prints("No roots\n")
        return
    }

    if (D == 0.0) {
        float x = -b / (2.0 * a)
        prints("One root:\n")
        printf(x)
    }
    else {
        float sqrtD = sqrt(D)
        float x_1 = (-b - sqrtD) / (2.0 * a)
        float x_2 = (-b + sqrtD) / (2.0 * a)

        prints("Two roots:\n")
        printf(x_1)
        printf(x_2)
    }

    return
}

There are two version of Frontend: VMFrontend and VMCustomFrontend. First one using Flex & Bison. One can use it like this being in folder build/frontend:

./VMFrontend ../../frontend/Tests/squareEq.lang

Second version of frontend is self-written lexer + Recursive Descent parser. Its usage is the same:

./VMCustomFrontend ../../frontend/Tests/squareEq.lang

One can append extra -v (verbose) flag for both frontends to view parsed tokens, built AST and generated program

./VMCustomFrontend ../../frontend/Tests/squareEq.lang -v

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published