ROM Colors

Copyright 2020 Moddable Tech, Inc.

Revised: August 12, 2020

Opportunity

In XS, instances are mostly linked lists of properties, each with a key and a value. To get a value by key, the runtime iterates the linked list to find a property with a matching key.
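As a rough illustration of that lookup, here is a minimal sketch in JavaScript. The `makeInstance` and `getProperty` helpers and the shape of the slots are hypothetical and only model the idea; they are not the XS data structures.

```javascript
// Hypothetical model: an instance holds a linked list of properties.
function makeInstance(entries) {
	let first = null;
	// Build the list back to front so the order of entries is preserved.
	for (let i = entries.length - 1; i >= 0; i--) {
		const [key, value] = entries[i];
		first = { key, value, next: first };
	}
	return { first };
}

// Walk the list until a property with a matching key is found.
function getProperty(instance, key) {
	for (let property = instance.first; property !== null; property = property.next) {
		if (property.key === key)
			return property;
	}
	return null;
}

const mathLike = makeInstance([["imul", 1], ["max", 2], ["min", 3], ["sign", 4]]);
const found = getProperty(mathLike, "min");
const missing = getProperty(mathLike, "parse");
```

The cost of `getProperty` grows with the position of the property in the list, which is what access by index avoids.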

Of course, XS already implements several optimizations to access properties by index: in array instances, in closures, and in global and local scopes.

The XS linker prepares most classes, objects, prototypes, and functions to be accessed directly from ROM. Such instances and their properties never change. That is an opportunity to access properties by index, especially since iterating a linked list in ROM is not optimal.

Intuitively, indexes could be chosen for keys, then instances could be rearranged so that each property sits at the index of its key. But if indexes are chosen too naively, instances and properties could take a lot of memory. So a technique is necessary to choose indexes while keeping instances and properties as dense as possible.

Graph Coloring

The technique is based on graph coloring. Similar techniques have been applied to object-oriented programming languages for decades. See for instance:

When intrinsics and all modules are preloaded, the XS linker traverses all instances and their properties to build a conflict graph. The nodes are the keys, the edges are the conflicts. Two keys conflict if there is an instance that has properties for both keys. Choosing indexes for keys is equivalent to coloring nodes in the graph.
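The construction of the conflict graph can be sketched as follows. This is an illustration only: `buildConflictGraph` and its input format (one array of keys per instance) are assumptions, not the XS linker's code.

```javascript
// instances: one array of keys per instance (hypothetical input format).
// Returns a Map from each key to the Set of keys it conflicts with.
function buildConflictGraph(instances) {
	const graph = new Map();
	for (const keys of instances) {
		// Every key is a node, even if it never conflicts.
		for (const key of keys) {
			if (!graph.has(key))
				graph.set(key, new Set());
		}
		// Two keys conflict if the same instance has properties for both.
		for (const a of keys) {
			for (const b of keys) {
				if (a !== b)
					graph.get(a).add(b);
			}
		}
	}
	return graph;
}

const conflicts = buildConflictGraph([
	["imul", "max", "min", "sign"],
	["parse", "stringify"],
]);
```

In this small input, `imul` conflicts with `max`, `min` and `sign`, but not with `parse` or `stringify`, so `imul` and `parse` are allowed to share a color.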

Once the graph is colored, each key has a color that can be used to access a property by index.

property = instance[key.color]

Since non-conflicting keys can reuse the same color, the runtime has to check that the property matches the key.

if (property.key == key)
	return property
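Put together, the indexed access and the key check could look like the sketch below. It models ROM as a JavaScript array and an instance as an index into that array; the `rom`, `keys` and `getByColor` names are hypothetical and only illustrate the two steps.

```javascript
// Hypothetical keys with their colors. min and parse share color 4
// because no instance has both.
const keys = {
	max: { name: "max", color: 1 },
	imul: { name: "imul", color: 3 },
	min: { name: "min", color: 4 },
	parse: { name: "parse", color: 4 },
};

// Hypothetical ROM: the instance is at index 0, each of its properties
// is at the index given by the color of its key.
const rom = [
	{ kind: "Instance" },
	{ kind: "Property", key: keys.max },
	undefined, // hole
	{ kind: "Property", key: keys.imul },
	{ kind: "Property", key: keys.min },
];

function getByColor(rom, instance, key) {
	// Access by index.
	const property = rom[instance + key.color];
	// The slot can be empty or belong to another key with the same color.
	if (property && property.key === key)
		return property;
	return null;
}

const hit = getByColor(rom, 0, keys.imul);
const miss = getByColor(rom, 0, keys.parse);
```

Here `keys.parse` leads to the slot of `min`, so without the key check the lookup would return the wrong property.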

Optimal graph coloring could be complex, but coloring from the most to the least conflicted key usually gives good enough results: the number of colors is significantly smaller than the number of keys.
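A greedy coloring in that order can be sketched as follows. The `colorKeys` function is hypothetical, and starting colors at 1 is an assumption made here so that index 0 stays available for the instance itself.

```javascript
// graph: Map from key to the Set of keys it conflicts with.
function colorKeys(graph) {
	// Most conflicted keys first.
	const order = [...graph.keys()].sort(
		(a, b) => graph.get(b).size - graph.get(a).size
	);
	const colors = new Map();
	for (const key of order) {
		// Collect the colors already taken by conflicting keys.
		const used = new Set();
		for (const neighbor of graph.get(key)) {
			if (colors.has(neighbor))
				used.add(colors.get(neighbor));
		}
		// Take the smallest free color (assumption: colors start at 1).
		let color = 1;
		while (used.has(color))
			color++;
		colors.set(key, color);
	}
	return colors;
}

// Hypothetical conflict graph for two instances.
const sampleGraph = new Map([
	["imul", new Set(["max", "min", "sign"])],
	["max", new Set(["imul", "min", "sign"])],
	["min", new Set(["imul", "max", "sign"])],
	["sign", new Set(["imul", "max", "min"])],
	["parse", new Set(["stringify"])],
	["stringify", new Set(["parse"])],
]);
const sampleColors = colorKeys(sampleGraph);
const colorCount = new Set(sampleColors.values()).size;
```

With this input, six keys need only four colors, since `parse` and `stringify` reuse colors already given to keys they do not conflict with.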

Memory Layout

What remains to be done is to reorganize the ROM so properties are where the color of their key wants them to be.

Instances and properties are slots. Initially the XS linker allocates slots in the order instances and properties are created.

For instance, here is part of the Math object with a few properties, followed by the JSON object.

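| Slot | Next | Kind | Key |
| --- | --- | --- | --- |
| 0 | -> 1 | Instance | Math |
| 1 | -> 2 | Property | imul |
| 2 | -> 3 | Property | max |
| 3 | -> 4 | Property | min |
| 4 | NULL | Property | sign |
| 5 | -> 6 | Instance | JSON |
| 6 | -> 7 | Property | parse |
| 7 | NULL | Property | stringify |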

First, keys are colored. Since keys can conflict in several instances, the colors of the keys of one instance are generally not sequential.

| Key | Color |
| --- | --- |
| imul | 3 |
| max | 1 |
| min | 4 |
| parse | 4 |
| sign | 5 |
| stringify | 5 |

Second, the properties are moved to the index given by the color of their key, relative to their instance:

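| Slot | Next | Kind | Key |
| --- | --- | --- | --- |
| 0 | -> 3 | Instance | Math |
| 1 | -> 4 | Property | max |
| 2 | NULL | | |
| 3 | -> 1 | Property | imul |
| 4 | -> 5 | Property | min |
| 5 | NULL | Property | sign |
| 6 | -> 10 | Instance | JSON |
| 7 | NULL | | |
| 8 | NULL | | |
| 9 | NULL | | |
| 10 | -> 11 | Property | parse |
| 11 | NULL | Property | stringify |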

That is enough to access properties by index, but there are holes: in the table above, slots 2, 7, 8, and 9 are unused. Graph coloring minimizes the number of colors, but because of conflicts, some instances can be sparse.

Finally, instances are moved to fill the holes.

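| Slot | Next | Kind | Key |
| --- | --- | --- | --- |
| 0 | -> 3 | Instance | Math |
| 1 | -> 4 | Property | max |
| 2 | -> 6 | Instance | JSON |
| 3 | -> 1 | Property | imul |
| 4 | -> 5 | Property | min |
| 5 | NULL | Property | sign |
| 6 | -> 7 | Property | parse |
| 7 | NULL | Property | stringify |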

Now the two objects are intertwined to reduce the memory footprint. There are usually enough instances without properties or with only a few properties to fill all holes.

Despite all movements, the order of the linked lists is maintained. That is required by several object traversal functions.
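The placement described in this section can be sketched with a simple first-fit strategy. This is an assumption made for illustration, not necessarily what the XS linker does: the hypothetical `layout` function places each instance at the lowest base slot where the instance and all its properties fit, doing in one pass what the text describes as two moves.

```javascript
// instances: [{ name, keys }] in creation order (hypothetical format).
// colors: object mapping each key to its color.
function layout(instances, colors) {
	const rom = [];
	for (const { name, keys } of instances) {
		// Offsets needed relative to the instance: 0 for the instance
		// itself, then the color of each key.
		const offsets = [0, ...keys.map(key => colors[key])];
		// First fit: lowest base where all needed slots are free.
		let base = 0;
		while (offsets.some(offset => rom[base + offset] !== undefined))
			base++;
		rom[base] = { kind: "Instance", key: name, next: null };
		// Link properties in their original order, wherever they land.
		let previous = rom[base];
		for (const key of keys) {
			const slot = base + colors[key];
			rom[slot] = { kind: "Property", key, next: null };
			previous.next = slot;
			previous = rom[slot];
		}
	}
	return rom;
}

const packed = layout(
	[
		{ name: "Math", keys: ["imul", "max", "min", "sign"] },
		{ name: "JSON", keys: ["parse", "stringify"] },
	],
	{ imul: 3, max: 1, min: 4, parse: 4, sign: 5, stringify: 5 }
);
```

With the colors of the example, Math lands at slot 0 and JSON at slot 2, and following the `next` indexes from each instance still visits its properties in creation order.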

Results

Here is a simple test.

const math = Math;
const now = Date.now();
for (let i = 0; i < 100000; i++) {
	math.imul(i, 1)
	math.max(i, 0)
	math.min(i, 0)
	math.sign(i)
}
trace((Date.now() - now) + "\n");

The Math object is in ROM and its imul, max, min and sign properties are each accessed 100000 times. The Math object is cached in a const to avoid the interference of global scope optimizations. The properties are selected to avoid floating point operations.

Here are results without and with the optimization:

| Device | Without (ms) | With (ms) | Gain |
| --- | --- | --- | --- |
| Moddable One | 15066 | 11403 | 24% |
| Moddable Two | 2906 | 2353 | 19% |
| simulator (macOS) | 285 | 260 | 9% |

That is interesting: the less powerful the device, the more significant the gain.
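The gains reported above are consistent with the relative reduction of the elapsed time, rounded to the nearest percent. The short sketch below recomputes them under that assumption; the `gain` function and the `before` and `after` names are hypothetical.

```javascript
// Elapsed times from the results above, without and with the optimization.
const results = [
	{ device: "Moddable One", before: 15066, after: 11403 },
	{ device: "Moddable Two", before: 2906, after: 2353 },
	{ device: "simulator (macOS)", before: 285, after: 260 },
];

// Assumption: gain is the relative reduction of the elapsed time.
function gain(before, after) {
	return Math.round(100 * (before - after) / before);
}

const gains = results.map(({ before, after }) => gain(before, after));
```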