sql/ir: teach the generator about field packing and primitive types
There are four complementary parts in this patch:

- the generator now predefines the common Go primitive types for every input definition file (bool, string, int64, etc.)
- the code generator is extended to take a few options on the command line that control its behavior. The main() function is refactored to make it more readable.
- the code generator is taught about field packing, i.e. storing multiple small numeric values in a variable of a larger size. See a copy of the explanation below.
- the Makefile is modified to use multiple configurations to generate test IR environments. This is used to exercise different combinations of the code generation parameters, to ensure that all of them produce valid code.

A copy of the explanatory comment, which outlines the allocation of memory slots for IR struct types, follows.

“The interesting part of code generation is fitting structs into nodes. Each [IR] node consists of slots, i.e. spaces in memory where to put values. The goal of slot allocation is to decide which struct field goes to which slot.

There are four kinds of slots: numeric, string, reference and extra. The numeric, string and reference slots are called "dedicated". Dedicated slots come in a finite amount! For example, in the default configuration, there are 2 numeric slots, 1 string slot and 2 reference slots.

In general, we prefer a dedicated slot. When dedicated slots are exhausted for a particular type (e.g. when encountering the 3rd numeric field in a struct in the default configuration), we spill to the extra slots. Extra slots expand on demand without limit.

We support two modes: packed and unpacked.

Understanding unpacked mode can serve as a foundation to better understand packed mode. In unpacked mode, each numeric field uses one numeric slot; each reference to a struct uses one reference slot; and each reference to a sum uses both a numeric slot (for the tag) and a reference slot (for the value). Every other type uses an extra slot.
When dedicated slots are exhausted, an extra slot is also used. For example:

```go
type BinExprValue struct {
	Left  Expr
	Op    BinOp
	Right Expr
}

type BinExpr struct{ *node }

//// Packing with 3 numeric slots:

func (x BinExprValue) R(a Allocator) BinExpr {
	node := a.new()
	node.nums[0] = numvalslot(x.Left.tag)
	node.nums[1] = numvalslot(x.Op)
	node.nums[2] = numvalslot(x.Right.tag)
	node.refs[0] = x.Left.ref
	node.refs[1] = x.Right.ref
	return BinExpr{node}
}

func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.node.nums[0]), x.node.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.node.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{ExprTag(x.node.nums[2]), x.node.refs[1]} }

//// Packing with just 2 numeric slots, like in the default configuration:

type extraBinExpr struct {
	Right__Tag ExprTag
}

func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = numvalslot(x.Left.tag)
	ref.nums[1] = numvalslot(x.Op)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	// The 3rd numeric value (the tag of Right) spills to the extra slot.
	extra := &extraBinExpr{}
	extra.Right__Tag = x.Right.tag
	ref.extra = extra
	return BinExpr{ref}
}

func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.ref.nums[0]), x.ref.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.ref.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{x.ref.extra.(*extraBinExpr).Right__Tag, x.ref.refs[1]} }
```

In packed mode, the general idea of filling dedicated slots until they are exhausted, and then spilling to extra slots, remains. What is different is that the algorithm now tries to fit multiple numeric fields in the same numeric slot, to conserve memory. The algorithm starts with the largest fields first, to reduce fragmentation. This incidentally implies that the fields are not necessarily stored in declaration order. For example:

```go
//// Observe how all 3 numeric values are now packed in a single slot!

func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Left__Tag_ValueMask << BinExpr_Slot_Left__Tag_BitOffset)) | (numvalslot(x.Left.tag) << BinExpr_Slot_Left__Tag_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Op_ValueMask << BinExpr_Slot_Op_BitOffset)) | (numvalslot(x.Op) << BinExpr_Slot_Op_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Right__Tag_ValueMask << BinExpr_Slot_Right__Tag_BitOffset)) | (numvalslot(x.Right.tag) << BinExpr_Slot_Right__Tag_BitOffset)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	return BinExpr{ref}
}

//// Note: the size in bits for sum types is computed automatically
//// depending on the number of variants.

const BinExpr_Slot_Left__Tag_BitOffset = 0
const BinExpr_Slot_Left__Tag_ValueMask = 0x3
const BinExpr_Slot_Op_BitOffset = 2
const BinExpr_Slot_Op_ValueMask = 0x3
const BinExpr_Slot_Right__Tag_BitOffset = 4
const BinExpr_Slot_Right__Tag_ValueMask = 0x3

func (x BinExpr) Left() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Left__Tag_BitOffset) & BinExpr_Slot_Left__Tag_ValueMask), x.ref.refs[0]}
}

func (x BinExpr) Op() BinOp {
	return BinOp((x.ref.nums[0] >> BinExpr_Slot_Op_BitOffset) & BinExpr_Slot_Op_ValueMask)
}

func (x BinExpr) Right() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Right__Tag_BitOffset) & BinExpr_Slot_Right__Tag_ValueMask), x.ref.refs[1]}
}
```

(see `irgen/codegen/codegen.go` for the rest of the code)
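For readers who want a feel for how the bit offsets and masks above could be derived, here is a rough, standalone sketch of a largest-first packing pass. It is not the generator's actual code. The `field` and `placement` types, the `pack` and `tagBits` helpers, the 64-bit slot width, the first-fit strategy and the "4 variants" input are all assumptions made for illustration. Only the largest-first ordering, the spill to extra slots and the idea that a sum tag's width follows from its number of variants come from the explanation above.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// slotBits is the assumed width of one numeric slot (hypothetical).
const slotBits = 64

// field is a hypothetical description of one numeric field to place.
type field struct {
	name  string
	width int // size in bits
}

// placement records where a field ended up.
type placement struct {
	name      string
	slot      int // numeric slot index, or -1 if spilled to an extra slot
	bitOffset int
	mask      uint64
}

// tagBits returns how many bits are needed to tell n variants apart,
// assuming tags are numbered 0..n-1 (an assumption of this sketch).
func tagBits(n int) int {
	if n <= 1 {
		return 1
	}
	return bits.Len(uint(n - 1))
}

// pack places fields largest-first, first-fit, into numSlots numeric
// slots. A field that fits in no slot is marked as spilled (slot -1).
func pack(fields []field, numSlots int) []placement {
	sorted := append([]field(nil), fields...)
	// Stable sort: fields of equal width keep their declaration order.
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].width > sorted[j].width
	})
	used := make([]int, numSlots) // bits already taken in each slot
	out := make([]placement, 0, len(sorted))
	for _, f := range sorted {
		p := placement{name: f.name, slot: -1}
		for s := range used {
			if used[s]+f.width <= slotBits {
				p.slot = s
				p.bitOffset = used[s]
				p.mask = uint64(1)<<uint(f.width) - 1
				used[s] += f.width
				break
			}
		}
		out = append(out, p)
	}
	return out
}

func main() {
	// Assumes Expr and BinOp each have 4 variants, hence 2-bit fields.
	fields := []field{
		{"Left__Tag", tagBits(4)},
		{"Op", tagBits(4)},
		{"Right__Tag", tagBits(4)},
	}
	for _, p := range pack(fields, 2) {
		fmt.Printf("%s: slot=%d offset=%d mask=%#x\n", p.name, p.slot, p.bitOffset, p.mask)
	}
}
```

Under these assumptions, all three fields land in slot 0 at offsets 0, 2 and 4 with mask 0x3, which matches the constants in the packed example. A field wider than the remaining space in every slot would come back with `slot == -1`, standing in for the spill to an extra slot.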