Merge immediates into main working line.
Although the code size is an issue for the ATmega328P, I believe that can best
be worked on through further structural changes, and it's easier to do those
without trying to keep a large changeset in sync.
anarchodin committed Mar 29, 2021
2 parents ccff492 + ec73b9b commit 77fb98a
Showing 46 changed files with 425 additions and 199 deletions.
108 changes: 108 additions & 0 deletions doc/immediates.md
@@ -0,0 +1,108 @@
# Immediate values

Like other Lisps, but unlike languages like C, uLisp attaches type information
to all of its runtime objects. Originally, this was done in a fairly
straightforward and uniform fashion: All objects were represented by two values
of the same size as the underlying machine’s memory pointers. Cons cells, which
always point to other values, had actual pointers in both cells, while other
kinds of objects had a type tag in the `car` cell and a representation of the
object in the `cdr` cell. This representation is simple to understand, but it is
wasteful: To represent an n-bit number we use two n-bit values! There is,
however, a way to make this scheme more efficient in the cases that warrant the
effort, while retaining most of its simplicity in the cases that don’t.

Notice that even in the simple version we need to have an infallible way to
distinguish between type tags and memory pointers. This turns out to be possible
in very common cases due to alignment constraints. To take an example, if we
ensure that our table of uLisp objects _starts_ at an even memory address, no
valid pointer to a uLisp object will ever be an odd number – even on 8-bit
machines, pointers are always at least two bytes. As it happens, uLisp _already_
uses the lowest bit – the one controlling whether a number is even or odd – for
[garbage collection](http://www.ulisp.com/show?1BD3). That’s fine by us:
instead, we ensure that the table starts at a multiple of four. A uLisp object
is two pointers, and therefore at least four bytes, so if the first address is a
multiple of four, all of them are. No valid pointer then has the bit signifying
2 set, which lets us use words with that bit set to represent objects directly,
without allocating any memory. Instead of spending 2n bits on an n-bit number,
we can now store an (n-2)-bit number in n bits. That’s considerably less
wasteful, and all we have to do to know that it _is_ a number is to check if the
bit signifying 2 is set. Great!
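
As a rough sketch of the alignment trick in C, under assumptions of our own: the
names `cell_t`, `workspace`, `cell_address`, `is_immediate` and `is_pointer` are
illustrative and not taken from the uLisp sources, and the table is aligned to
the size of a cell, which is always a multiple of four.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cell layout: two pointer-sized words. */
typedef struct cell {
  uintptr_t car;
  uintptr_t cdr;
} cell_t;

#define WORKSPACE_CELLS 16             /* illustrative size only */
#define GC_MARK_BIT   ((uintptr_t)1)   /* lowest bit, left to the garbage collector */
#define IMMEDIATE_BIT ((uintptr_t)2)   /* never set in an aligned cell address */

/* Aligning the table to the cell size makes every cell address a
   multiple of four (of eight or sixteen on larger machines). */
_Alignas(sizeof(cell_t)) cell_t workspace[WORKSPACE_CELLS];

/* Address of the i-th cell as a plain integer. */
uintptr_t cell_address(size_t i) {
  assert(i < WORKSPACE_CELLS);
  return (uintptr_t)&workspace[i];
}

/* A word is an immediate exactly when the bit signifying 2 is set. */
bool is_immediate(uintptr_t word) { return (word & IMMEDIATE_BIT) != 0; }
bool is_pointer(uintptr_t word) { return !is_immediate(word); }
```

Neither test looks at the lowest bit, so a pointer that the garbage collector
has marked is still recognised as a pointer.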

There is one catch: Remember those type tags from earlier? We still need those
for our remaining boxed values. Therefore, to actually come out ahead, we can’t
just stuff numbers into the n-2 bits. We have to be able, again, to tell type
tags and numbers apart with certainty. We’ll have to sacrifice an additional
bit. Having done that, to figure out whether the value is a number or a type
tag, we need to look at two bits, those signifying 2 and 4, by masking with 6.
Since it’s two bits, there are four possible combinations, but two of those – 0
and 4 – represent a pointer, not an immediate value. Let’s say that if the
masked value is 2, we have a number, and if it’s 6 we have a type tag. Then we
have (n-3)-bit numbers (13, 29 or 61 bits, depending on the pointer size), which
seems reasonable. But now a question arises – do we need that many bits to
distinguish between types?
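
The three-way split described above can be sketched in C as follows. This
covers only the intermediate scheme from this paragraph (pointer, number, type
tag); `KIND_MASK`, `kind_t` and `classify` are names made up for the example.

```c
#include <assert.h>
#include <stdint.h>

#define KIND_MASK 6  /* the bits signifying 2 and 4 */

/* The lowest bit belongs to the garbage collector, so it must not
   take part in the test. */
static_assert((KIND_MASK & 1) == 0, "mask must skip the GC mark bit");

typedef enum { KIND_POINTER, KIND_FIXNUM, KIND_TYPE_TAG } kind_t;

kind_t classify(uintptr_t word) {
  switch (word & KIND_MASK) {
    case 2:  return KIND_FIXNUM;    /* bit 2 set, bit 4 clear */
    case 6:  return KIND_TYPE_TAG;  /* both bits set */
    default: return KIND_POINTER;   /* 0 or 4: bit 2 clear */
  }
}
```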

Well, no. And we might want to stick other kinds of values into immediates –
like, say, character values. If using two 16-bit values to represent a 16-bit
number felt wasteful, using those same 32 bits to represent an 8-bit value is
even worse. So we extend this system: Ignoring the lowest-order bit, an
immediate object’s type is identified by the position of its first unset bit.
This means we can check types with fairly simple bitmasks – all the masks and
tag values used are powers of two, minus two.
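
To make the "first unset bit" rule concrete, here is a small C sketch. It
counts the run of set bits above the lowest-order bit and derives the matching
tag and mask from it. The function names and `WORD_BITS` are hypothetical, and a
real implementation would presumably hard-code the masks instead of computing
them.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define WORD_BITS (sizeof(uintptr_t) * CHAR_BIT)

/* Count the consecutive set bits starting at the bit signifying 2.
   Zero means the word is a pointer. */
unsigned tag_run(uintptr_t word) {
  unsigned run = 0;
  word >>= 1;  /* skip the GC mark bit */
  while (word & 1) {
    run++;
    word >>= 1;
  }
  return run;
}

/* Expected value after masking: `run` set bits above the mark bit. */
uintptr_t tag_for_run(unsigned run) {
  assert(run + 2 < WORD_BITS);
  return ((uintptr_t)1 << (run + 1)) - 2;
}

/* Mask covering the run and the unset bit that ends it. */
uintptr_t mask_for_run(unsigned run) {
  assert(run + 2 < WORD_BITS);
  return ((uintptr_t)1 << (run + 2)) - 2;
}
```

A run of one gives tag 2 under mask 6, and a run of three gives tag 14 under
mask 30, both of the form 2^k - 2.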

## Implemented immediate types

uLisp has implementations that use 16-bit, 32-bit and 64-bit pointers. The
differences in size mean that some aspects of the implementation differ between
platforms. In particular, it is possible to encode user-defined symbols in an
immediate value on the larger machines, which is not reasonable for 16-bit
pointers. For this reason, the immediate types do vary by bit-size. They do not
vary by anything else, however. Certain fundamentals are shared – fixnums always
have the same tag, for example.

### 16-bit

- Fixnums are thirteen-bit signed integers. `(eql (logand fixnum 6) 2)`
- Built-in symbols are eleven-bit values. `(eql (logand symbol 30) 14)`
- Characters are eight-bit unsigned integers. `(eql (logand byte 254) 126)`

### 32-bit

- Fixnums are 29-bit signed integers. `(eql (logand fixnum 6) 2)`
- Symbols are packed into 27 bits. `(eql (logand symbol 30) 14)`
- Characters are 21-bit unsigned integers. `(eql (logand unicode 2046) 1022)`

### 64-bit

For now, 64-bit platforms use the same tags as 32-bit systems.
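
The tests listed above translate directly into C. The sketch below uses the
same masks and tag values. The function names are invented for the example, and
the assumption that a character's payload sits directly above its tag bits is
ours, inferred from the bit counts (8 + 8 = 16 and 21 + 11 = 32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Tests shared by every platform. */
bool is_fixnum(uintptr_t w) { return (w & 6) == 2; }
bool is_symbol(uintptr_t w) { return (w & 30) == 14; }

/* Character tests differ with the pointer size. */
bool is_char16(uint16_t w) { return (w & 254) == 126; }
bool is_char32(uint32_t w) { return (w & 2046) == 1022; }

/* ASSUMPTION: the payload sits directly above the tag bits. */
uint16_t make_char16(uint8_t c) { return (uint16_t)(((uint16_t)c << 8) | 126); }
uint8_t char16_value(uint16_t w) { return (uint8_t)(w >> 8); }

uint32_t make_char32(uint32_t code_point) {
  assert(code_point <= 0x10FFFF);  /* Unicode fits in 21 bits */
  return (code_point << 11) | 1022;
}
uint32_t char32_value(uint32_t w) { return w >> 11; }
```

Because each tag ends in an unset bit at a different position, no word can pass
more than one of these tests.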

## Fixnums

Fixnums are the original impetus for immediates. They get three fewer bits than
platform pointers, and are two’s complement signed integers that do not need to
be allocated from the workspace. This can save significant memory, particularly
when combined with arrays. Aside from the size differences, there are no real
platform specifics.
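
A minimal sketch of packing and unpacking a fixnum, assuming the payload
occupies the bits above the three-bit tag. The names below are hypothetical. The
commit's `number` and `getint` calls presumably play similar roles, but their
definitions are not shown here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FIXNUM_SHIFT 3
#define FIXNUM_TAG   ((uintptr_t)2)
#define FIXNUM_MAX   (INTPTR_MAX >> FIXNUM_SHIFT)
#define FIXNUM_MIN   (-FIXNUM_MAX - 1)

bool fixnum_fits(intptr_t n) { return n >= FIXNUM_MIN && n <= FIXNUM_MAX; }

/* Pack a signed integer into a tagged word; no allocation needed. */
uintptr_t make_fixnum(intptr_t n) {
  assert(fixnum_fits(n));
  return ((uintptr_t)n << FIXNUM_SHIFT) | FIXNUM_TAG;
}

/* Recover the integer. Clearing the low three bits first also drops
   a GC mark bit, and makes the division exact for negative values. */
intptr_t fixnum_value(uintptr_t w) {
  intptr_t scaled = (intptr_t)(w & ~(uintptr_t)7);
  return scaled / 8;
}
```

The sketch sizes fixnums from `intptr_t`, so on a 16-bit target it yields the
thirteen-bit range and on a 32-bit target the 29-bit range listed above.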

## Symbols

The implementation of symbols in uLisp is somewhat unorthodox – there is a
strong distinction between symbols that are built in to uLisp and two types of
user symbols. On 32-bit and 64-bit platforms, all symbols are immediate
values. On smaller platforms, only built-in symbols are immediate values, with
user symbols being boxed values.

Using immediate values for built-in symbols on all platforms ensures that their
exact representation is known at compile-time, which is useful in various parts
of the uLisp internals.

## Possible extensions

### Parametric types

The type tags, especially on the larger platforms, are _much_ larger than they
need to be. It is possible to specify a fixed number of bits to be used for
the type tag itself, and allocate the rest of the bits to some kind of
parameters for the type. A potential use, on platforms that could feasibly
run their own compiler, would be to stash calling convention information about
functions in there.

It might also, by complicating the memory allocation mechanism a little, be used
to carry size information for contiguous allocations larger than a single cell.
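
One possible layout, purely as an illustration: this commit's generator emits
type identifiers as `(index << 4 | 6)`, so the low four bits are fixed. The
sketch below keeps that and assumes, hypothetically, four bits of type index
with everything above them free for a parameter. `TYPE_BITS` and all function
names are invented for the example.

```c
#include <assert.h>
#include <stdint.h>

#define TAG_MARKER  ((uintptr_t)6)  /* low four bits 0110 */
#define TYPE_SHIFT  4
#define TYPE_BITS   4               /* hypothetical: room for 16 types */
#define PARAM_SHIFT (TYPE_SHIFT + TYPE_BITS)
#define TYPE_FIELD  (((uintptr_t)1 << TYPE_BITS) - 1)

/* Build a type word carrying a parameter in its upper bits. */
uintptr_t make_type_word(unsigned type, uintptr_t param) {
  assert(type <= TYPE_FIELD);
  return (param << PARAM_SHIFT) | ((uintptr_t)type << TYPE_SHIFT) | TAG_MARKER;
}

unsigned type_of_word(uintptr_t w) {
  return (unsigned)((w >> TYPE_SHIFT) & TYPE_FIELD);
}

uintptr_t param_of_word(uintptr_t w) { return w >> PARAM_SHIFT; }
```

With a parameter of zero this produces the same values as the current scheme,
so existing type words would not need to change.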
2 changes: 1 addition & 1 deletion functions/arm/restarti2c.c
@@ -8,7 +8,7 @@ object *fn_restarti2c (object *args, object *env) {
I2CCount = 0;
if (args != NULL) {
object *rw = first(args);
-if (integerp(rw)) I2CCount = rw->integer;
+if (intp(rw)) I2CCount = getint(rw);
read = (rw != NULL);
}
int address = stream & 0xFF;
2 changes: 1 addition & 1 deletion functions/arm/withi2c.c
@@ -14,7 +14,7 @@ object *sp_withi2c (object *args, object *env) {
I2CCount = 0;
if (params != NULL) {
object *rw = eval(first(params), env);
-if (integerp(rw)) I2CCount = rw->integer;
+if (intp(rw)) I2CCount = getint(rw);
read = (rw != NULL);
}
// Top bit of address is I2C port
2 changes: 1 addition & 1 deletion functions/restarti2c.c
@@ -8,7 +8,7 @@ object *fn_restarti2c (object *args, object *env) {
I2CCount = 0;
if (args != NULL) {
object *rw = first(args);
-if (integerp(rw)) I2CCount = rw->integer;
+if (intp(rw)) I2CCount = getint(rw);
read = (rw != NULL);
}
int address = stream & 0xFF;
2 changes: 1 addition & 1 deletion functions/withi2c.c
@@ -10,7 +10,7 @@ object *sp_withi2c (object *args, object *env) {
I2CCount = 0;
if (params != NULL) {
object *rw = eval(first(params), env);
-if (integerp(rw)) I2CCount = rw->integer;
+if (intp(rw)) I2CCount = getint(rw);
read = (rw != NULL);
}
I2Cinit(1); // Pullups
8 changes: 4 additions & 4 deletions platforms.lisp
@@ -9,7 +9,7 @@

(defparameter *platforms*
'((:avr
-(:types zzero symbol number stream character string pair)
+(:types zzero symbol number stream string pair)
(:streams serial i2c spi sd)
(:keywords
("CPU_ATmega328P"
@@ -33,7 +33,7 @@
(ANALOGREAD ADC_DAC0 ADC_TEMPERATURE)))
(:features :dacreference))
(:arm
-(:types zzero symbol code number stream character float array string pair)
+(:types zzero code number stream float array string pair)
(:streams serial i2c spi sd string gfx)
(:keywords
("CPU_ATSAMD21"
@@ -71,7 +71,7 @@
(ANALOGREFERENCE DEFAULT EXTERNAL)))
(:features :float :gfx :code :array :stringstream :write-resolution))
(:esp
-(:types zzero symbol number stream character float array string pair)
+(:types zzero number stream float array string pair)
(:streams serial i2c spi sd wifi string gfx)
(:keywords
("ESP8266"
@@ -82,7 +82,7 @@
(PINMODE INPUT INPUT_PULLUP INPUT_PULLDOWN OUTPUT)))
(:features :float :gfx :code :array :stringstream :ethernet))
(:riscv
-(:types zzero symbol code number stream character float array string pair)
+(:types zzero code number stream float array string pair)
(:streams serial i2c spi sd string gfx)
(:keywords
(nil
23 changes: 19 additions & 4 deletions preface.lisp
@@ -5,13 +5,28 @@
;; FIXME: This belongs elsewhere.
(defvar *maximum-trace-count* 3 "The number of functions that can be traced at one time.")

+;; NOTE: Done as CPP macros rather than an enum in preparation for further changes.
+(defun print-tokens (platform &optional (stream *standard-output*))
+"Output token definitions for a given platform."
+(let* ((byte-size (if (eq platform :avr) 16 32))
+(shift-size (if (= byte-size 16) 12 28))
+(base-num (if (= byte-size 16) #x7FE #x7FFFFFE))
+(num-length (if (= byte-size 16) 4 8)))
+(loop for token in '(:BRA :KET :QUO :DOT)
+for i from 0
+do (format stream "#define ~a 0x~v,'0x~%"
+token num-length
+(logior (ash i shift-size)
+base-num)))))

(defun print-types (typelist &optional (stream *standard-output*))
"Output type definitions for the given types."
-(format stream "~&~%// Types~%")
+(format stream "~&~%// Type identifiers. Four last bits fixed at 6.~%")
(let ((value -1))
(dolist (type typelist)
-(format stream "#define ~a ~d~%" type (ash (incf value) 1))))
+(format stream "#define ~a ~d // (~d << 4 | 6)~%"
+type
+(logior (ash (incf value) 4) 6)
+value)))
(terpri stream))

(defun print-streams (streamlist &optional (stream *standard-output*) (margin 90))
@@ -29,6 +44,6 @@
(format stream "~&~%// Constants~%~%")
(format stream "const int TRACEMAX = ~d; // Number of traced functions~%" *maximum-trace-count*)
(print-types (get-types platform) stream)
-(write-line "enum token { UNUSED, BRA, KET, QUO, DOT };" stream)
+(print-tokens platform stream)
(print-streams (get-streams platform) stream)
(terpri stream))
1 change: 0 additions & 1 deletion sections/arm/setup.c
@@ -12,7 +12,6 @@ void initgfx () {

void initenv () {
GlobalEnv = NULL;
-tee = symbol(TEE);
}

void setup () {
16 changes: 8 additions & 8 deletions sections/array.c
@@ -28,13 +28,13 @@ object *makearray (symbol_t name, object *dims, object *def, bool bitp) {
int size = 1;
object *dimensions = dims;
while (dims != NULL) {
-int d = car(dims)->integer;
+int d = getint(car(dims));
if (d < 0) error2(MAKEARRAY, PSTR("dimension can't be negative"));
size = size * d;
dims = cdr(dims);
}
// Bit array identified by making first dimension negative
-if (bitp) { size = (size + 31)/32; car(dimensions) = number(-(car(dimensions)->integer)); }
+if (bitp) { size = (size + 31)/32; car(dimensions) = number(-getint(car(dimensions))); }
object *ptr = myalloc();
ptr->type = ARRAY;
object *tree = nil;
@@ -66,7 +66,7 @@ object **getarray (symbol_t name, object *array, object *subs, object *env, int
bool bitp = false;
object *dims = cddr(array);
while (dims != NULL && subs != NULL) {
-int d = car(dims)->integer;
+int d = getint(car(dims));
if (d < 0) { d = -d; bitp = true; }
if (env) s = checkinteger(name, eval(car(subs), env)); else s = checkinteger(name, car(subs));
if (s < 0 || s >= d) error(name, PSTR("subscript out of range"), car(subs));
@@ -87,7 +87,7 @@ object **getarray (symbol_t name, object *array, object *subs, object *env, int
rslice - reads a slice of an array recursively
*/
void rslice (object *array, int size, int slice, object *dims, object *args) {
-int d = first(dims)->integer;
+int d = getint(first(dims));
for (int i = 0; i < d; i++) {
int index = slice * d + i;
if (!consp(args)) error2(0, PSTR("initial contents don't match array type"));
@@ -144,7 +144,7 @@ object *readbitarray (gfun_t gfun) {
while (head != NULL) {
object **loc = arrayref(array, index>>5, size);
int bit = index & 0x1F;
-*loc = number((((*loc)->integer) & ~(1<<bit)) | (car(head)->integer)<<bit);
+*loc = number(((getint(*loc)) & ~(1<<bit)) | (getint(car(head)))<<bit);
index++;
head = cdr(head);
}
@@ -157,13 +157,13 @@ object *readbitarray (gfun_t gfun) {
void pslice (object *array, int size, int slice, object *dims, pfun_t pfun, bool bitp) {
bool spaces = true;
if (slice == -1) { spaces = false; slice = 0; }
-int d = first(dims)->integer;
+int d = getint(first(dims));
if (d < 0) d = -d;
for (int i = 0; i < d; i++) {
if (i && spaces) pfun(' ');
int index = slice * d + i;
if (cdr(dims) == NULL) {
-if (bitp) pint(((*arrayref(array, index>>5, size))->integer)>>(index & 0x1f) & 1, pfun);
+if (bitp) pint((getint(*arrayref(array, index>>5, size)))>>(index & 0x1f) & 1, pfun);
else printobject(*arrayref(array, index, size), pfun);
} else { pfun('('); pslice(array, size, index, cdr(dims), pfun, bitp); pfun(')'); }
}
@@ -178,7 +178,7 @@ void printarray (object *array, pfun_t pfun) {
bool bitp = false;
int size = 1, n = 0;
while (dims != NULL) {
-int d = car(dims)->integer;
+int d = getint(car(dims));
if (d < 0) { bitp = true; d = -d; }
size = size * d;
dims = cdr(dims); n++;
1 change: 0 additions & 1 deletion sections/avr/setup.c
@@ -2,7 +2,6 @@

void initenv () {
GlobalEnv = NULL;
-tee = symbol(TEE);
}

void setup () {
12 changes: 6 additions & 6 deletions sections/closure.c
@@ -3,21 +3,21 @@
object *value (symbol_t n, object *env) {
while (env != NULL) {
object *pair = car(env);
-if (pair != NULL && car(pair)->name == n) return pair;
+if (pair != NULL && getname(car(pair)) == n) return pair;
env = cdr(env);
}
return nil;
}

bool boundp (object *var, object *env) {
-symbol_t varname = var->name;
+symbol_t varname = getname(var);
if (value(varname, env) != NULL) return true;
if (value(varname, GlobalEnv) != NULL) return true;
return false;
}

object *findvalue (object *var, object *env) {
-symbol_t varname = var->name;
+symbol_t varname = getname(var);
object *pair = value(varname, env);
if (pair == NULL) pair = value(varname, GlobalEnv);
if (pair == NULL) error(0, PSTR("unknown variable"), var);
@@ -55,7 +55,7 @@ object *closure (int tc, symbol_t name, object *state, object *function, object
while (params != NULL) {
object *value;
object *var = first(params);
-if (symbolp(var) && var->name == OPTIONAL) optional = true;
+if (symbolp(var) && getname(var) == OPTIONAL) optional = true;
else {
if (consp(var)) {
if (!optional) error(name, PSTR("invalid default value"), var);
@@ -65,7 +65,7 @@ object *closure (int tc, symbol_t name, object *state, object *function, object
if (!symbolp(var)) error(name, PSTR("illegal optional parameter"), var);
} else if (!symbolp(var)) {
error2(name, PSTR("illegal function parameter"));
-} else if (var->name == AMPREST) {
+} else if (getname(var) == AMPREST) {
params = cdr(params);
var = first(params);
value = args;
@@ -90,7 +90,7 @@ object *closure (int tc, symbol_t name, object *state, object *function, object

object *apply (symbol_t name, object *function, object *args, object *env) {
if (symbolp(function)) {
-symbol_t fname = function->name;
+symbol_t fname = getname(function);
if (fname < ENDKEYWORDS) {
uint8_t callc = getcallc(fname);
if (callc < 0x80) { // High bit not set, so normal function.
1 change: 0 additions & 1 deletion sections/compactimage.c
@@ -28,7 +28,6 @@ void movepointer (object *from, object *to) {
}

uintptr_t compactimage (object **arg) {
-markobject(tee);
markobject(GlobalEnv);
markobject(GCStack);
object *firstfree = Workspace;
1 change: 0 additions & 1 deletion sections/esp/setup.c
@@ -11,7 +11,6 @@ void initgfx () {

void initenv () {
GlobalEnv = NULL;
-tee = symbol(TEE);
}

void setup () {