Commit
Added initial codedrop for the asm package.
Showing 10 changed files with 1,745 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,182 @@ | ||
ASM Utilities | ||
============= | ||
|
||
**Experimental** Do not use this package in production (yet) | ||
|
||
----- | ||
|
||
A semi-advanced EVM assembler. | ||
|
||
**Features** | ||
|
||
- Nested code scoping allows relative jumps | ||
- Execute JavaScript meta-programming inline | ||
- Self-padding data blocks | ||
- TODO: optional Position-Independent-Code | ||
- MIT licensed. | ||
|
||
Command-Line Interface | ||
====================== | ||
|
||
@TODO: Add this to the CLI package. | ||
|
||
``` | ||
/home/ethers> ethers-asm [ --disassemble ] [ FILENAME ] | ||
``` | ||
|
||
Syntax | ||
====== | ||
|
||
Comments | ||
-------- | ||
|
||
Any text that occurs after a semi-colon (i.e. `;`) is treated as a comment. | ||
|
||
``` | ||
; This is a comment. If a comment spans multiple | ||
; lines, it needs multiple semi-colons. | ||
@foobar: ; Here is another comment | ||
``` | ||
|
||
Opcodes | ||
------- | ||
|
||
Each OPCODE may be specified using either the **functional notation** or | ||
the **stack notation**. | ||
|
||
### Functional Notation | ||
|
||
This is the recommended syntax for opcodes, as the assembler will perform | ||
the additional step of verifying that the correct number of operands are passed | ||
in for the given operation. | ||
|
||
``` | ||
blockhash(sub(number, 1)) | ||
``` | ||
|
||
### Stack Notation | ||
|
||
This method is often useful when adapting other existing disassembled | ||
bytecode. | ||
|
||
``` | ||
1 | ||
number | ||
sub | ||
blockhash | ||
``` | ||
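
The relationship between the two notations can be sketched in JavaScript. This is only an illustration, not the assembler's implementation: the node shape (`text`, `operands`) and the `toStackNotation` helper are hypothetical names. It assumes operands are emitted last-to-first, so that the first operand ends up on top of the stack when the opcode executes.

```javascript
// Hypothetical node shape: { text: string, operands: Array<node> }.
// A leaf (a literal or a bare opcode) simply has no operands.
function toStackNotation(node) {
    const lines = [];
    // Emit the operands last-to-first so that the first operand is
    // pushed last and is therefore on top of the stack.
    for (let i = node.operands.length - 1; i >= 0; i--) {
        lines.push(...toStackNotation(node.operands[i]));
    }
    lines.push(node.text);
    return lines;
}

const leaf = (text) => ({ text, operands: [] });

// blockhash(sub(number, 1))
const expr = {
    text: "blockhash",
    operands: [
        { text: "sub", operands: [ leaf("number"), leaf("1") ] }
    ]
};

console.log(toStackNotation(expr).join("\n"));
```

Running the sketch on the functional example prints the same four lines as the stack notation example.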
|
||
Labels | ||
------ | ||
|
||
Labels are used for control flow, by providing an anchor that can be used | ||
by `JUMP` and `JUMPI`. | ||
|
||
A label is relative to its **scope** and cannot be referenced outside of | ||
its exact scope. Each label automatically injects a `JUMPDEST` opcode. | ||
|
||
``` | ||
@top: | ||
jump($top) | ||
``` | ||
|
||
Data Blocks | ||
----------- | ||
|
||
Sometimes verbatim data is desired, for example, embedding strings or | ||
look-up tables. | ||
|
||
This can be any number of **hexstrings**, **decimal bytes** or **evals**. | ||
|
||
A **data block** is automatically padded to ensure that any data that is | ||
coincidentally a PUSH opcode does not impact code or data outside the | ||
data block. | ||
|
||
A **data block** exposes two variables: `$foo`, the offset (in the current | ||
scope), and `#foo`, the length of the data. The offset may only be accessed from an | ||
ancestor scope while the length may be accessed from any scope. | ||
|
||
``` | ||
codecopy(0x20, $foobar, #foobar) ; Copy the data to memory address 32 | ||
@foobar [ | ||
0x1234 ; This is exactly 2 bytes (i.e. 4 nibbles) | ||
42 65 73 ; These are decimal values (3 bytes) | ||
] | ||
``` | ||
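
One way the self-padding could work is sketched below. It is an assumption about the approach, not the package's actual code: `dataToBytes` and `paddingLength` are hypothetical helpers. The idea is to replay the data as if it were bytecode, skipping the immediate bytes of every `PUSH1`..`PUSH32` (`0x60`..`0x7f`), and to pad by however far the last PUSH would overrun the end of the data.

```javascript
// Convert data block items into bytes. An item is either a hexstring
// (e.g. "0x1234") or a decimal byte value (e.g. 42).
function dataToBytes(items) {
    const bytes = [];
    for (const item of items) {
        if (typeof item === "number") {
            if (!Number.isInteger(item) || item < 0 || item > 255) {
                throw new Error("decimal data values must be single bytes");
            }
            bytes.push(item);
        } else {
            if (!/^0x([0-9a-fA-F]{2})*$/.test(item)) {
                throw new Error("invalid hexstring: " + item);
            }
            for (let i = 2; i < item.length; i += 2) {
                bytes.push(parseInt(item.substring(i, i + 2), 16));
            }
        }
    }
    return bytes;
}

// Replay the bytes as bytecode. PUSHn (0x60 + n - 1) consumes the next
// n bytes; any overshoot past the end is the padding required.
function paddingLength(bytes) {
    let i = 0;
    while (i < bytes.length) {
        const opcode = bytes[i++];
        if (opcode >= 0x60 && opcode <= 0x7f) { i += opcode - 0x5f; }
    }
    return i - bytes.length;
}

const foobar = dataToBytes([ "0x1234", 42, 65, 73 ]);
console.log(foobar, paddingLength(foobar));
```

For the `@foobar` block above, none of the five bytes is a PUSH opcode, so under this sketch no padding would be needed. A block ending in `0x61` (`PUSH2`) would need two bytes of padding.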
|
||
Scopes | ||
------ | ||
|
||
A scope is a new frame of reference, which offsets will be based on. This | ||
makes embedding code within code easier, since the jump destinations and | ||
**data blocks** can be accessed relatively. | ||
|
||
The top scope is named `_`. | ||
|
||
``` | ||
; This runs the deployment | ||
sstore(0, {{= toUtf8Bytes("Hello World") }}) | ||
codecopy(0, $contract, #contract) | ||
return (0, #contract) | ||
@contract { | ||
@label: | ||
jump($label) | ||
} | ||
``` | ||
|
||
Evaluation and Execution | ||
------------------------ | ||
|
||
It is often useful to be able to modify a program in more advanced ways | ||
at code generation time. JavaScript code can be executed in a `{{! code }}` | ||
block, which does not place any output in the code but can be used to define | ||
functions and variables. Code can also be evaluated in a `{{= code }}` block, | ||
which places the output of the *code* into the assembled output, following | ||
the same rules as **Data Blocks**. | ||
|
||
``` | ||
{{! | ||
function foo() { return 42; } | ||
}} | ||
{{= foo() }} | ||
1 | ||
add | ||
``` | ||
|
||
Notes | ||
===== | ||
|
||
Because of the nature of script evaluation, it is possible to create | ||
programs which cannot actually be assembled. The assembler will give | ||
up after 512 attempts to find a stable organization of the code. | ||
|
||
For example, this code contains a scope named `junk`, which is a `CALLER` | ||
statement followed by a data block equal to the bytecode of `junk`. Since | ||
this is recursive, there is never any way for this to be satisfied. This is | ||
similar to VHDL programs where it is possible to simulate recursion, but | ||
impossible to synthesize recursive hardware. | ||
|
||
``` | ||
@junk { | ||
caller | ||
@thisIsRecursive[ | ||
{{= junk }} | ||
] | ||
} | ||
``` | ||
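
The "give up after 512 attempts" behaviour can be pictured as a fixed-point search. The sketch below is a simplified model, not the assembler's code: `assembleUntilStable` and the `layout` callback are hypothetical, and a single number stands in for the full set of offsets and lengths that a real pass would recompute.

```javascript
// `layout` is a hypothetical callback: given the size assumed during the
// previous pass, it returns the size actually produced by this pass.
function assembleUntilStable(layout, initial, maxAttempts = 512) {
    let previous = initial;
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
        const next = layout(previous);
        // Stable: this pass produced exactly what the last pass assumed.
        if (next === previous) {
            return { size: next, attempts: attempt + 1 };
        }
        previous = next;
    }
    throw new Error("no stable organization after " + maxAttempts + " attempts");
}

// A program whose size settles (here it is capped at 4 bytes).
const stable = assembleUntilStable((size) => Math.min(size + 1, 4), 0);
console.log(stable);

// A model of `junk`: one CALLER byte plus its own previous bytecode,
// so every pass is one byte longer than the pass before it.
let failed = false;
try {
    assembleUntilStable((size) => size + 1, 0);
} catch (error) {
    failed = true;
}
console.log(failed);
```

In this model the recursive case can never converge, because each pass invalidates the size that the previous pass assumed.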
|
||
Building | ||
======== | ||
|
||
If you make changes to the `grammar.jison` file, make sure to run the `npm run generate` | ||
command to rebuild the AST parser. | ||
|
||
License | ||
======= | ||
|
||
MIT License. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
"use strict"; | ||
|
||
const fs = require("fs"); | ||
|
||
const jison = require("jison"); | ||
|
||
const grammar = fs.readFileSync("grammar.jison").toString(); | ||
|
||
const parser = new jison.Parser(grammar); | ||
|
||
const parserSource = parser.generate({ moduleName: "parser" }); | ||
|
||
fs.writeFileSync("./lib/_parser.js", parserSource); |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,168 @@ | ||
%lex | ||
|
||
%x script | ||
|
||
%% | ||
|
||
// Inline JavaScript State (gobble up all tokens including whitespace) | ||
|
||
"{{=" %{ this.begin("script"); return "SCRIPT_EVAL"; %} | ||
"{{!" %{ this.begin("script"); return "SCRIPT_EXEC"; %} | ||
<script>([^\}]|\n|\}[^}]) return "SCRIPT_TOKEN"; | ||
<script>"}}" this.popState() | ||
|
||
|
||
// Assembler State | ||
|
||
// Ignorables | ||
([;][^\n]*\n) // Ignore comments | ||
(\s+) // Ignore Whitespace | ||
|
||
// Identifiers (and opcodes) | ||
([A-Za-z][A-Za-z0-9]*) return "ID" | ||
|
||
// Lists | ||
"(" return "OPEN_PAREN" | ||
")" return "CLOSE_PAREN" | ||
"," return "COMMA" | ||
|
||
// Labels prefixes | ||
([@][A-Za-z][A-Za-z0-9]*) return "AT_ID" | ||
([#][A-Za-z][A-Za-z0-9]*) return "HASH_ID" | ||
([$][A-Za-z][A-Za-z0-9]*) return "DOLLAR_ID" | ||
|
||
// Scope | ||
"{" return "OPEN_BRACE" | ||
"}" return "CLOSE_BRACE" | ||
":" return "COLON" | ||
|
||
// Data | ||
"[" return "OPEN_BRACKET" | ||
"]" return "CLOSE_BRACKET" | ||
|
||
// Literals | ||
(0x([0-9a-fA-F][0-9a-fA-F])*) return "HEX" | ||
([1-9][0-9]*|0) return "DECIMAL" | ||
//(0b[01]*) return "BINARY" | ||
|
||
// Special | ||
<<EOF>> return "EOF" | ||
return "INVALID" | ||
|
||
/lex | ||
|
||
%start program | ||
|
||
%% | ||
|
||
program | ||
: statement_list EOF | ||
{ return { type: "scope", name: "_", statements: $1, loc: getLoc(yy, null) }; } | ||
; | ||
|
||
javascript | ||
: /* empty */ | ||
{ $$ = ""; } | ||
| SCRIPT_TOKEN javascript | ||
{ $$ = $1 + $2; } | ||
; | ||
|
||
opcode_list | ||
: opcode | ||
{ $$ = [ $1 ]; } | ||
| opcode COMMA opcode_list | ||
{ { | ||
const opcodes = $3.slice(); | ||
opcodes.unshift($1); | ||
$$ = opcodes; | ||
} } | ||
; | ||
|
||
opcode | ||
: ID | ||
{ $$ = { type: "opcode", bare: true, mnemonic: $1, operands: [ ], loc: getLoc(yy, @1) }; } | ||
| ID OPEN_PAREN CLOSE_PAREN | ||
{ $$ = { type: "opcode", mnemonic: $1, operands: [ ], loc: getLoc(yy, @1, @3) }; } | ||
| ID OPEN_PAREN opcode_list CLOSE_PAREN | ||
{ $$ = { type: "opcode", mnemonic: $1, operands: $3, loc: getLoc(yy, @1, @4) }; } | ||
| HASH_ID | ||
{ $$ = { type: "length", label: $1.substring(1), loc: getLoc(yy, @1) }; } | ||
| DOLLAR_ID | ||
{ $$ = { type: "offset", label: $1.substring(1), loc: getLoc(yy, @1) }; } | ||
| HEX | ||
{ $$ = { type: "hex", value: $1, loc: getLoc(yy, @1) }; } | ||
| DECIMAL | ||
{ $$ = { type: "decimal", value: $1, loc: getLoc(yy, @1) }; } | ||
| SCRIPT_EVAL javascript | ||
{ $$ = { type: "eval", script: $2, loc: getLoc(yy, @1, @2) }; } | ||
; | ||
|
||
hex_list | ||
: /* empty */ | ||
{ $$ = [ ]; } | ||
| hex hex_list | ||
{ { | ||
const hexes = $2.slice(); | ||
hexes.unshift($1); | ||
$$ = hexes; | ||
} } | ||
; | ||
|
||
hex | ||
: HEX | ||
{ $$ = { type: "hex", verbatim: true, value: $1, loc: getLoc(yy, @1) }; } | ||
| DECIMAL | ||
{ { | ||
const value = parseInt($1); | ||
if (value >= 256) { throw new Error("decimal data values must be single bytes"); } | ||
$$ = { type: "hex", verbatim: true, value: ("0x" + (value).toString(16).padStart(2, "0")), loc: getLoc(yy, @1) }; | ||
} } | ||
| SCRIPT_EVAL javascript | ||
{ $$ = { type: "eval", verbatim: true, script: $2, loc: getLoc(yy, @1, @2) }; } | ||
; | ||
|
||
statement_list | ||
: /* empty */ | ||
{ $$ = [ ]; } | ||
| statement statement_list | ||
{ { | ||
const statements = $2.slice(); | ||
statements.unshift($1); | ||
$$ = statements; | ||
} } | ||
; | ||
|
||
statement | ||
: opcode | ||
| AT_ID COLON | ||
{ $$ = { type: "label", name: $1.substring(1), loc: getLoc(yy, @1, @2) }; } | ||
| AT_ID OPEN_BRACE statement_list CLOSE_BRACE | ||
{ $$ = { type: "scope", name: $1.substring(1), statements: $3, loc: getLoc(yy, @1, @4) }; } | ||
| AT_ID OPEN_BRACKET hex_list CLOSE_BRACKET | ||
{ $$ = { type: "data", name: $1.substring(1), data: $3, loc: getLoc(yy, @1, @4) }; } | ||
| SCRIPT_EXEC javascript | ||
{ $$ = { type: "exec", script: $2, loc: getLoc(yy, @1, @2) }; } | ||
; | ||
|
||
%% | ||
function getLoc(yy, start, end) { | ||
if (end == null) { end = start; } | ||
let result = null; | ||
if (start) { | ||
result = { | ||
first_line: start.first_line, | ||
first_column: start.first_column, | ||
last_line: end.last_line, | ||
last_column: end.last_column | ||
}; | ||
} | ||
if (yy._ethersLocation) { | ||
return yy._ethersLocation(result); | ||
} | ||
return Object.freeze(result); | ||
} | ||
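
To make the behaviour of `getLoc` concrete, here is a standalone sketch that copies the function from the grammar epilogue and calls it outside the parser. The `at` helper, the `file` field and the `"example.asm"` name are hypothetical, added only for illustration. In the generated parser, `yy` and the `@n` location objects are supplied by Jison.

```javascript
// Copied from the grammar epilogue so the sketch is self-contained.
function getLoc(yy, start, end) {
    if (end == null) { end = start; }
    let result = null;
    if (start) {
        result = {
            first_line: start.first_line,
            first_column: start.first_column,
            last_line: end.last_line,
            last_column: end.last_column
        };
    }
    if (yy._ethersLocation) {
        return yy._ethersLocation(result);
    }
    return Object.freeze(result);
}

// Hypothetical helper that builds a Jison-style location on one line.
const at = (line, first, last) => ({
    first_line: line, first_column: first,
    last_line: line, last_column: last
});

// A span from the first token to the last token of a rule.
const span = getLoc({ }, at(1, 0, 9), at(1, 24, 25));

// No location at all (as used for the top-level `_` scope).
const none = getLoc({ }, null);

// A caller-supplied hook can decorate or replace the location.
const hooked = getLoc(
    { _ethersLocation: (loc) => ({ file: "example.asm", ...loc }) },
    at(2, 0, 4)
);

console.log(span, none, hooked);
```

Without the hook the result is frozen, so downstream code cannot mutate the location of a node. With the hook, whatever the callback returns is used as-is.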