GOLF reference assembler.

The reference assembler is written in Python 3. It turns a human-readable assembly file into binary code that can run on the GOLF. The format is simple: one instruction per line, with commas separating arguments. Destinations always come before operands. For example, a = b + c is written as:

add a, b, c

The assembler supports a few preprocessor quality-of-life enhancements over just a series of instructions:

# Everything after a pound (#) is a comment.
# All operations are one-per-line. To split up a long line, end it with a
# backslash (\) followed by only whitespace.

# There are 27 pre-initialized variables in GOLF, a-z and data. a-z are
# initialized to register objects. Register objects will be replaced in the
# final binary by a register. You may assign arbitrary Python expressions to
# variables yourself as well:
tmp = a

# A label is an identifier of the form [a-zA-Z_][a-zA-Z0-9_]+ followed by a
# colon. It adds a new variable containing a label object that is replaced
# in the final binary by an offset in bytes from the start of the
# instruction stream to the first instruction after the label.
tolower:
    # Arguments may be full Python expressions, as long as they evaluate to
    # a 64-bit integer, a register or a label.
    lw c, -1
    add tmp, c, ord("a") - ord("A") # Equivalent to add a, c, 32.
    sw -1, tmp
    jmp tolower

Finally, you can put read-only data into the binary and get back its address by calling the data() function. Repeated calls with the same data return the same address:

# Strings get turned into UTF-8 encoded series of bytes followed by a 0.
hello_world = "Hello, world!"

# Bytes are embedded as-is.
rawbytes = b"\x00\x01"

# Lists of integers will become a series of 64-bit little-endian integers in
# the data section.
small_squares = [n*n for n in range(10)]

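    mov a, data(hello_world)
print_loop:
    lbu c, a          # Load the next byte of the string.
    jz  done, c       # Stop at the terminating 0.
    inc a
    sw -1, c          # Write the byte to stdout.
    jmp print_loop
done:
    halt 0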
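
The encoding rules in the comments above can be sketched in Python as follows. This is a minimal illustration, not the reference assembler's code: the names encode_data and DataSection are hypothetical, keying the address cache on the encoded bytes is an assumption, and the base address is the one given in the binary format section.

```python
import struct

def encode_data(value):
    """Encode one data() argument into the bytes placed in the data section."""
    if isinstance(value, str):
        # Strings become UTF-8 bytes followed by a terminating 0.
        return value.encode("utf-8") + b"\x00"
    if isinstance(value, (bytes, bytearray)):
        # Bytes are embedded as-is.
        return bytes(value)
    if isinstance(value, list):
        # Integers become 64-bit little-endian values; masking makes
        # negative numbers wrap to their two's complement form.
        return b"".join(struct.pack("<Q", n & 0xFFFFFFFFFFFFFFFF) for n in value)
    raise TypeError("unsupported data type: %r" % type(value))

class DataSection:
    """Hypothetical helper that hands out addresses for embedded data."""

    def __init__(self, base=0x2000000000000000):
        self.base = base
        self.blob = bytearray()
        self.offsets = {}  # encoded bytes -> offset into the blob (assumed cache key)

    def data(self, value):
        encoded = encode_data(value)
        if encoded not in self.offsets:
            self.offsets[encoded] = len(self.blob)
            self.blob += encoded
        return self.base + self.offsets[encoded]
```

With this sketch, calling data() twice with "Hello, world!" yields the same address, and the next distinct item lands 14 bytes later (13 characters plus the terminating 0).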

GOLF specification.

The CPU has 26 general-purpose registers, a-z. All registers are 64-bit integers - there is no floating point. Signed integers are stored as two's complement. Memory is read and written in little-endian order. The GOLF has no pipeline - every instruction is executed in order, and the previous instruction must always complete before the next one starts.

Every instruction can have output registers and input operands. Operands can be source registers or full 64-bit immediates. Each instruction always takes a constant number of cycles to complete. No two instructions can run at the same time, and all registers and memory are updated immediately after an instruction completes.

There are no flags, exceptions, traps, or any features of that kind. The CPU either runs its program until it halts, or until it encounters an error (division by zero, invalid memory access) - then the CPU halts unconditionally. A virtual machine implementation will likely produce debugging output.

The virtual machine will have a user-defined amount of heap and stack memory available for the GOLF. It is also possible that the memory will grow on demand. The heap starts at memory address 0, the stack starts at memory address 0x1000000000000000, and both grow upwards. All memory is initially zeroed out. There is no implicitly addressed stack pointer in GOLF, but z will always be 0x1000000000000000 at program startup; the other registers will be 0.

Memory address 0xffffffffffffffff is special - stores to this address will be written to the virtual machine's stdout, reads come from stdin. You may only use lw and sw to load/store at this address, and both only store / load the lowest byte of the register. lw gives back -1 on EOF.
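
As a rough illustration of the special I/O address, a virtual machine could route lw and sw on 0xffffffffffffffff as sketched below. The function names and the use of Python byte streams are assumptions made for this example, not part of the reference implementation.

```python
import io

MASK = (1 << 64) - 1
IO_ADDRESS = 0xFFFFFFFFFFFFFFFF  # the special address, also written as -1

def io_load_word(stdin):
    """lw from IO_ADDRESS: one byte from stdin, or -1 (all bits set) on EOF."""
    byte = stdin.read(1)
    if not byte:
        return MASK  # -1 as a 64-bit two's complement value
    return byte[0]

def io_store_word(stdout, value):
    """sw to IO_ADDRESS: only the lowest byte of the register is written."""
    stdout.write(bytes([value & 0xFF]))

# Usage: read "A", add 32, and write the result back out.
stdin, stdout = io.BytesIO(b"A"), io.BytesIO()
io_store_word(stdout, io_load_word(stdin) + 32)
```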

Instructions do not live in regular memory - they're in a separate instruction memory that's neither readable nor writable. This memory starts at address 0. Jump instructions take addresses into this memory. Execution may not go outside of the bounds of this memory - use the halt instruction to stop execution.

Below is a list of all GOLF instructions. Some instructions are pseudo-instructions and should be translated to real instructions by the assembler. For example, mov r, a is the same as add r, a, 0. Pseudo-instructions are marked by an apostrophe (') in the listing.

Here are all the simple register-to-register instructions:

instruction    | cycle | mnemonic               | C syntax (uint64_t a, b)
---------------+-------+------------------------+---------------------------
mov  r, a    ' |    1  | register move          | r = a
not  r, a      |    1  | bitwise not            | r = ~a
or   r, a, b   |    1  | bitwise or             | r = a | b
xor  r, a, b   |    1  | bitwise xor            | r = a ^ b
and  r, a, b   |    1  | bitwise and            | r = a & b
shl  r, a, b   |    1  | logical shift left     | r = a << b (int64_t b)
shr  r, a, b   |    1  | logical shift right    | r = a >> b (int64_t b)
sal  r, a, b   |    1  | arithmetic shift left  | r = a << b (int64_t a, b)
sar  r, a, b   |    1  | arithmetic shift right | r = a >> b (int64_t a, b)
inc  r       ' |    1  | increment              | r++
dec  r       ' |    1  | decrement              | r--
neg  r       ' |    1  | negate                 | r = -r
add  r, a, b   |    1  | addition               | r = a + b
sub  r, a, b   |    1  | subtraction            | r = a - b
cmp  r, a, b   |    1  | equality               | r = a == b
neq  r, a, b   |    1  | inequality             | r = a != b
le   r, a, b   |    1  | less                   | r = a < b  (int64_t a, b)
ge   r, a, b ' |    1  | greater                | r = a > b  (int64_t a, b)
leq  r, a, b   |    1  | less or equal          | r = a <= b (int64_t a, b)
geq  r, a, b ' |    1  | greater or equal       | r = a >= b (int64_t a, b)
leu  r, a, b   |    1  | less                   | r = a < b
geu  r, a, b ' |    1  | greater                | r = a > b
lequ r, a, b   |    1  | less or equal          | r = a <= b
gequ r, a, b ' |    1  | greater or equal       | r = a >= b

A note on the shifts - unlike in C, any value for b is allowed. Shift widths bigger than 64 always result in 0, and negative shift widths flip the shift direction. Comparisons result in 0 for false, 1 for true.
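
The shift rule can be sketched in Python for the two logical shifts. This is only an illustration under one assumption: a width of exactly 64 is treated like larger widths and yields 0, since all bits are shifted out. The arithmetic shifts are left out because the text above does not spell out their result for negative values.

```python
MASK = (1 << 64) - 1

def shl(a, b):
    """Logical shift left of the 64-bit value a by the signed width b."""
    if b < 0:
        return shr(a, -b)  # negative widths flip the direction
    if b >= 64:
        return 0           # assumption: exactly 64 also gives 0
    return (a << b) & MASK

def shr(a, b):
    """Logical shift right of the 64-bit value a by the signed width b."""
    if b < 0:
        return shl(a, -b)
    if b >= 64:
        return 0
    return (a & MASK) >> b
```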

The more complex register-to-register instructions:

instruction     | cycle | mnemonic | description
----------------+-------+----------+----------------------------------------
mul  r, s, a, b |    3  | multiply | Signed multiplication of a and b. Low
                |       |          | 64 bits go into r, high 64 bits into s.
mulu r, s, a, b |    3  | multiply | Unsigned multiplication of a and b. Low
                |       |          | 64 bits go into r, high 64 bits into s.
div  r, s, a, b |   10  | divide   | Signed division of a by b. Quotient
                |       |          | goes in r, remainder in s.
divu r, s, a, b |   10  | divide   | Unsigned division of a by b. Quotient
                |       |          | goes in r, remainder in s.

Signed division and modulo work like they do in Python - division floors towards negative infinity and the sign of the remainder is the same as the sign of the divisor.
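
The multiply and divide semantics can be sketched in Python, whose divmod already floors towards negative infinity. The helper names below are hypothetical; registers are modelled as unsigned 64-bit values, and each function returns the pair (r, s) from the table above.

```python
MASK = (1 << 64) - 1

def to_signed(x):
    """Interpret a 64-bit register value as a two's complement integer."""
    x &= MASK
    return x - (1 << 64) if x >> 63 else x

def mul(a, b):
    """Signed multiply: returns (low 64 bits, high 64 bits)."""
    p = to_signed(a) * to_signed(b)
    return p & MASK, (p >> 64) & MASK

def mulu(a, b):
    """Unsigned multiply: returns (low 64 bits, high 64 bits)."""
    p = (a & MASK) * (b & MASK)
    return p & MASK, p >> 64

def div(a, b):
    """Signed floor division: returns (quotient, remainder).

    Raises ZeroDivisionError where the CPU would halt with an error.
    """
    q, r = divmod(to_signed(a), to_signed(b))
    return q & MASK, r & MASK

def divu(a, b):
    """Unsigned division: returns (quotient, remainder)."""
    return divmod(a & MASK, b & MASK)
```

For example, dividing -7 by 2 gives quotient -4 and remainder 1, while dividing 7 by -2 gives quotient -4 and remainder -1.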

Memory and I/O instructions (a is a memory address for all below instructions):

instruction    | cycle | mnemonic            | description
---------------+-------+---------------------+------------------------------
lb   r, a      |    5  | load byte           | Load and sign-extend
               |       |                     | 8-bit int at a.
lbu  r, a      |    5  | load unsigned byte  | Load 8-bit int at a.
ls   r, a      |    5  | load short          | Load and sign-extend 16-bit
               |       |                     | int at a.
lsu  r, a      |    5  | load unsigned short | Load 16-bit int at a.
li   r, a      |    5  | load int            | Load and sign-extend 32-bit
               |       |                     | int at a.
liu  r, a      |    5  | load unsigned int   | Load 32-bit int at a.
lw   r, a      |    5  | load word           | Load 64-bit int at a.
sb   a, b      |    1  | store byte          | Store 8-bit int b at a.
ss   a, b      |    1  | store short         | Store 16-bit int b at a.
si   a, b      |    1  | store int           | Store 32-bit int b at a.
sw   a, b      |    1  | store word          | Store 64-bit int b at a.
push a, b    ' |    2  | push                | Store 64-bit int b at a
               |       |                     | and increment a by 8.
pop  r, a    ' |    6  | pop                 | Decrement a by 8 and load
               |       |                     | 64-bit int at a.
rand r         |  100  | random              | Put random 64-bit int in r.

Flow control:

instruction    | cycle | mnemonic            | description
---------------+-------+---------------------+------------------------------
call f         |    1  | function call       | Saves all registers except z
               |       |                     | and jumps to f.
ret  ...       |    1  | function return     | Restore all but the given
               |       |                     | registers and jump to caller.
jmp  l       ' |    1  | unconditional jump  | Unconditionally jumps to l.
jz   l, a      |    1  | jump on zero        | Jumps to l if a is zero.
jnz  l, a      |    1  | jump on non-zero    | Jumps to l if a is non-zero.
sz   a, n    ' |    1  | skip on zero        | Skips execution of the next n
               |       |                     | instructions if a is zero. n
               |       |                     | must be constant.
snz  a, n    ' |    1  | skip on non-zero    | Similarly, skips on non-zero.
halt a         |    0  | halt                | Halts the CPU with error code
               |       |                     | a. Error code 0 is success.
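
One way an assembler might lower pseudo-instructions is sketched below. Only the mov expansion is stated in this specification; every other expansion is an assumption chosen to match the listed descriptions and cycle counts (for example, push as sw plus add costs 1 + 1 = 2 cycles, and pop as sub plus lw costs 1 + 5 = 6). sz and snz are omitted because they depend on encoded instruction sizes. The function name expand is hypothetical.

```python
# Cycle counts of the real instructions used by the expansions below.
CYCLES = {"add": 1, "sub": 1, "sw": 1, "lw": 5, "jz": 1,
          "le": 1, "leq": 1, "leu": 1, "lequ": 1}

# Assumed lowering: "greater" comparisons swap the operands of "less" ones.
SWAPPED = {"ge": "le", "geq": "leq", "geu": "leu", "gequ": "lequ"}

def expand(mnemonic, *args):
    """Return a list of (mnemonic, args) real instructions."""
    if mnemonic == "mov":                 # given in the specification
        r, a = args
        return [("add", (r, a, 0))]
    if mnemonic == "inc":
        (r,) = args
        return [("add", (r, r, 1))]
    if mnemonic == "dec":
        (r,) = args
        return [("sub", (r, r, 1))]
    if mnemonic == "neg":
        (r,) = args
        return [("sub", (r, 0, r))]
    if mnemonic in SWAPPED:
        r, a, b = args
        return [(SWAPPED[mnemonic], (r, b, a))]
    if mnemonic == "jmp":                 # jump if the constant 0 is zero
        (label,) = args
        return [("jz", (label, 0))]
    if mnemonic == "push":
        a, b = args
        return [("sw", (a, b)), ("add", (a, a, 8))]
    if mnemonic == "pop":
        r, a = args
        return [("sub", (a, a, 8)), ("lw", (r, a))]
    return [(mnemonic, args)]             # already a real instruction
```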

Everything below this line is only of interest to those looking to implement a GOLF assembler, virtual machine or other tool.

GOLF instruction encoding.

Every instruction in GOLF is 4 + i bytes long, where i is the total size of immediates used in that instruction. The first 7 bits of the instruction form the instruction id.

If the instruction id is 0x7f, the instruction is a ret instruction. The remaining 25 bits each tell you whether or not that register should be restored, with the least significant bit being a, and the most significant bit y.

Otherwise, the instruction id can be found in the table below. The next 5*5 bits tell you what the arguments of the instruction are. Each quintet of bits is to be interpreted as follows:

0    immediate value 0
1    8 bit signed immediate
2    16 bit signed immediate
3    32 bit signed immediate
4    64 bit immediate
5-30 register a-z

Then the non-zero immediate values - if any - come after, in the order they were used in the instruction.
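
A possible encoder for non-ret instructions is sketched below. The bit layout is an assumption, since the text does not pin down bit order: the 4-byte header is treated as a little-endian 32-bit word with the instruction id in the low 7 bits and the k-th argument quintet at bit 7 + 5*k. Picking the smallest immediate size that fits, and the names classify and encode, are likewise choices made for this example.

```python
REGISTERS = "abcdefghijklmnopqrstuvwxyz"
MASK = (1 << 64) - 1

def classify(arg):
    """Return (quintet, immediate bytes) for a register name or an integer."""
    if isinstance(arg, str):
        return 5 + REGISTERS.index(arg), b""   # 5-30 are registers a-z
    if arg == 0:
        return 0, b""                          # immediate value 0, no bytes
    for code, bits in ((1, 8), (2, 16), (3, 32)):
        if -(1 << (bits - 1)) <= arg < (1 << (bits - 1)):
            return code, arg.to_bytes(bits // 8, "little", signed=True)
    return 4, (arg & MASK).to_bytes(8, "little")

def encode(instr_id, args):
    """Encode one instruction: 4 header bytes followed by its immediates."""
    word = instr_id & 0x7F
    immediates = b""
    for k, arg in enumerate(args):
        quintet, imm = classify(arg)
        word |= quintet << (7 + 5 * k)         # assumed bit position
        immediates += imm                      # in order of use
    return word.to_bytes(4, "little") + immediates
```

Under these assumptions, sw -1, c becomes 5 bytes: the header plus a single 0xff byte, because -1 fits in an 8-bit signed immediate.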

All instruction ids can be found in the table below:

id | instr   id | instr   id | instr
---+------   ---+------   ---+------
00 | not     10 | mul     20 | call
01 | or      11 | mulu    21 | jz
02 | xor     12 | div     22 | jnz
03 | and     13 | divu    23 | halt
04 | shl     14 | lb      7f | ret
05 | shr     15 | lbu
06 | sal     16 | ls
07 | sar     17 | lsu
08 | add     18 | li
09 | sub     19 | liu
0a | cmp     1a | lw
0b | neq     1b | sb
0c | le      1c | ss
0d | leq     1d | si
0e | leu     1e | sw
0f | lequ    1f | rand

GOLF binary format.

The first 4 bytes of the binary format are a little-endian 32-bit unsigned number indicating how large the data section is. Then follow that many bytes of data, which will be accessible starting at address 0x2000000000000000. Then the instruction stream begins, going until the end of the file.
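
Reading and writing this container can be sketched in a few lines of Python. The function names are hypothetical, and the error handling for truncated files is an addition made for this example.

```python
import struct

DATA_BASE = 0x2000000000000000  # where the data section is mapped

def build_binary(data, code):
    """Prefix the data section with its 32-bit little-endian size."""
    return struct.pack("<I", len(data)) + data + code

def split_binary(blob):
    """Split a GOLF binary into (data section, instruction stream)."""
    if len(blob) < 4:
        raise ValueError("missing data section size")
    (data_size,) = struct.unpack_from("<I", blob, 0)
    if len(blob) < 4 + data_size:
        raise ValueError("truncated data section")
    data = blob[4:4 + data_size]
    code = blob[4 + data_size:]   # everything up to the end of the file
    return data, code
```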