sql/ir: teach the generator about field packing and primitive types #17758
There are groups of files that really need to be reviewed:
I'm naming specific people to avoid inaction due to the bystander effect. Please work with me here: David E is not among us any more, and we need to get each other up to speed with this code somehow. This review should be a smooth learning curve.
cc @petermattis you may be interested in the packing technique here.
Incidental note related to other discussions: I am also working on using these IR data structures to describe logical SQL plans. This will be applied for plan caching, and, I foresee, rule-based rewrites. There the packing techniques are very useful to ensure that we can have more plans cached in memory within the same memory budget.
Do you really need both safe and unsafe modes for float values? I think you can use […]. In practice, we'll always be using the packed mode, right?
pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 56 at r1 (raw file):
You need to make
Regarding the comment on unsafe / floats: unfortunately what the Go stdlib has to offer is not sufficient, because it does not enable packing two float32's inside a single uint64 without using (again) the unsafe package. So I'll keep the code for now.
pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 56 at r1 (raw file): Previously, petermattis (Peter Mattis) wrote…
Done.
https://play.golang.org/p/2p4rGXzKQJ
Also, even if the support in the stdlib isn't sufficient, I'm not seeing why you'd want to ever use the safe mode.
pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 57 at r3 (raw file):
You could get rid of both `MakeAllocator` and `NewAllocator` by making the zero-value of `Allocator` useful. The only purpose of `MakeAllocator` right now is to init `nodes` to 16 entries. That could be accomplished on the first call to `new` by an additional field indicating if it was the first call. Not sure if this is worthwhile, just pointing it out.
Okay I'll do that, just because float32 is rather uncommon.
However, to shave the last hair on this yak, in terms of generated code
this is somewhat inferior. Your solution makes the compiler emit a 64-bit
memory load followed by a shift immediately dependent on the load; mine
makes the compiler emit an add (to offset the address) followed by a
32-bit load, where the result of the load may only be used much later.
That's lighter on the cache and easier on the pipeline.
> Also, even if the support in the stdlib isn't sufficient, I'm not seeing
> why you'd want to ever use the safe mode.
Because Go's manual is somewhat scary about the restrictions on
performing pointer arithmetic in the uintptr type, and if there's ever a
hard-to-find bug in the future, anyone working on this will want the
ability to exclude any doubt about the memory semantics from this code.
> a := MakeAllocator()
> return &a
> }
> You could get rid of both `MakeAllocator` and `NewAllocator` by making
> the zero-value of `Allocator` useful. The only purpose of
> `MakeAllocator` right now is to init `nodes` to 16 entries. That could
> be accomplished on the first call to `new` by an additional field
> indicating if it was the first call. Not sure if this is worthwhile,
> just pointing it out.
Yeah this looks like a good idea. Will look into it.
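For readers following along, the zero-value suggestion quoted above could look roughly like the sketch below. It is not the generated code: the `node` layout, the `slabSize` constant and the slab-refill behavior are assumptions made for illustration, and instead of the extra "first call" field mentioned in the comment it tests whether the slab is empty, which covers the first call as well as later refills.

```go
package main

import "fmt"

// node is a stand-in for the generated IR node type (hypothetical layout).
type node struct {
	nums [2]uint64
}

// slabSize mirrors the "16 entries" mentioned in the review comment.
const slabSize = 16

// Allocator hands out nodes from a slab. Its zero value is ready to use,
// so no MakeAllocator/NewAllocator constructor is needed.
type Allocator struct {
	nodes []node
}

// new returns a pointer to a fresh zeroed node, allocating a slab lazily.
// The len check is the conditional on the hot path discussed in the thread.
func (a *Allocator) new() *node {
	if len(a.nodes) == 0 {
		a.nodes = make([]node, slabSize)
	}
	n := &a.nodes[0]
	a.nodes = a.nodes[1:]
	return n
}

func main() {
	var a Allocator // zero value, no constructor call
	n1 := a.new()
	n2 := a.new()
	n1.nums[0] = 42
	fmt.Println(n1 != n2, n1.nums[0], n2.nums[0])
}
```

The trade-off is the one raised later in the conversation: the constructor disappears, at the price of one extra branch on every allocation.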
--
Raphael 'kena' Poss
Okay I removed the unsafe mode by using the math float/int conversions. Regarding NewAllocator/MakeAllocator I wasn't too happy about adding a conditional on the hot path, so I'm leaving that out for now.
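As an illustration of the "math float/int conversions" route, here is a minimal sketch of packing two float32 values into one uint64 without the unsafe package. The helper names are hypothetical and the playground snippet linked earlier is not reproduced here; only `math.Float32bits` and `math.Float32frombits` are real standard-library calls.

```go
package main

import (
	"fmt"
	"math"
)

// packFloat32s stores hi in the upper 32 bits and lo in the lower 32 bits
// of a single 64-bit slot. (Hypothetical helper, not from the patch.)
func packFloat32s(hi, lo float32) uint64 {
	return uint64(math.Float32bits(hi))<<32 | uint64(math.Float32bits(lo))
}

// unpackHi reads the upper half: a 64-bit load followed by a shift.
func unpackHi(slot uint64) float32 {
	return math.Float32frombits(uint32(slot >> 32))
}

// unpackLo reads the lower half by truncating to 32 bits.
func unpackLo(slot uint64) float32 {
	return math.Float32frombits(uint32(slot))
}

func main() {
	slot := packFloat32s(1.5, -2.25)
	fmt.Println(unpackHi(slot), unpackLo(slot)) // prints: 1.5 -2.25
}
```

Reading the upper half this way is the load-then-shift pattern that the earlier reply compares against an offset 32-bit load.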
Suggested by @petermattis.
There are four complementary parts in this patch:

- the generator now predefines the common Go primitive types for every input definition file (bool, string, int64, etc.)
- the code generator is extended to take a few options on the command line that control its behavior. The main() function is refactored to make it more readable.
- the code generator is taught about field packing, i.e. storing multiple small numeric values in a variable of a larger size. See a copy of the explanation below.
- the Makefile is modified to use multiple configurations to generate test IR environments. This is used to exercise different combinations of the code generation parameters, to ensure that all of them produce valid code.

A copy of the explanatory comment, that outlines the allocation of memory slots for IR struct types, follows.

“The interesting part of code generation is fitting structs into nodes. Each [IR] node consists of slots, i.e. spaces in memory where to put values. The goal of slot allocation is to decide which struct field goes to which slot.

Overview

There are four kinds of slots: numeric, string, reference and extra. The numeric, string and reference slots are called "dedicated". Dedicated slots come in a finite amount! For example, in the default configuration, there are 2 numeric slots, 1 string slot and 2 reference slots.

In general, we prefer a dedicated slot. When dedicated slots are exhausted for a particular type (e.g. when encountering the 3rd numeric field in a struct in the default configuration), we spill to the extra slots. Extra slots expand on demand without limit.

We support two modes: packed and unpacked. Understanding unpacked mode can serve as foundation to better understand packed mode.

How unpacked mode works

In unpacked mode, each numeric field uses one numeric slot; each reference to a struct uses one reference slot; and each reference to a sum uses both a numeric slot (for the tag) and a reference slot (for the value). Every other type uses an extra slot. When dedicated slots are exhausted, an extra slot is also used. For example:

```go
type BinExprValue struct {
	Left  Expr
	Op    BinOp
	Right Expr
}

type BinExpr struct{ *node }

//// Packing with 3 numeric slots:

func (x BinExprValue) R(a Allocator) BinExpr {
	node := a.new()
	node.nums[0] = numvalslot(x.Left.tag)
	node.nums[1] = numvalslot(x.Op)
	node.nums[2] = numvalslot(x.Right.tag)
	node.refs[0] = x.Left.ref
	node.refs[1] = x.Right.ref
	return BinExpr{node}
}
func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.node.nums[0]), x.node.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.node.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{ExprTag(x.node.nums[2]), x.node.refs[1]} }

//// Packing with just 2 numeric slots, like in the default configuration:

type extraBinExpr struct {
	Right__Tag ExprTag
}

func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = numvalslot(x.Left.tag)
	ref.nums[1] = numvalslot(x.Op)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	// The third numeric value (the tag of Right) spills to the extra slot.
	extra := &extraBinExpr{}
	extra.Right__Tag = x.Right.tag
	ref.extra = extra
	return BinExpr{ref}
}
func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.ref.nums[0]), x.ref.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.ref.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{x.ref.extra.(*extraBinExpr).Right__Tag, x.ref.refs[1]} }
```

How packed mode works

The general idea of packing dedicated slots until they are exhausted, and then spilling to extra slots, remains. What is different is that the algorithm now tries to fit multiple numeric fields in the same numeric slot, to conserve memory. The algorithm starts with the largest fields first, to reduce fragmentation. This incidentally implies that the fields are not stored in declaration order. For example:

```go
//// Observe how all 3 numeric values are now packed in a single slot!
func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Left__Tag_ValueMask << BinExpr_Slot_Left__Tag_BitOffset)) | (numvalslot(x.Left.tag) << BinExpr_Slot_Left__Tag_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Op_ValueMask << BinExpr_Slot_Op_BitOffset)) | (numvalslot(x.Op) << BinExpr_Slot_Op_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Right__Tag_ValueMask << BinExpr_Slot_Right__Tag_BitOffset)) | (numvalslot(x.Right.tag) << BinExpr_Slot_Right__Tag_BitOffset)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	return BinExpr{ref}
}

//// Note: the size in bits for sum types is computed automatically
//// depending on the number of variants.
const BinExpr_Slot_Left__Tag_BitOffset = 0
const BinExpr_Slot_Left__Tag_ValueMask = 0x3
const BinExpr_Slot_Op_BitOffset = 2
const BinExpr_Slot_Op_ValueMask = 0x3
const BinExpr_Slot_Right__Tag_BitOffset = 4
const BinExpr_Slot_Right__Tag_ValueMask = 0x3

func (x BinExpr) Left() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Left__Tag_BitOffset) & BinExpr_Slot_Left__Tag_ValueMask), x.ref.refs[0]}
}
func (x BinExpr) Op() BinOp {
	return BinOp((x.ref.nums[0] >> BinExpr_Slot_Op_BitOffset) & BinExpr_Slot_Op_ValueMask)
}
func (x BinExpr) Right() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Right__Tag_BitOffset) & BinExpr_Slot_Right__Tag_ValueMask), x.ref.refs[1]}
}
```

(see `irgen/codegen/codegen.go` for the rest of the code)
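To make the packed-mode description concrete, here is a small standalone sketch of a largest-first slot allocator. It is not the code in `irgen/codegen/codegen.go`: the `field` and `placement` types, the first-fit strategy, the 64-bit slot width and the `tagWidth` rule are all assumptions chosen so that the BinExpr example yields the same offsets and masks as the constants quoted in the commit message.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// slotBits is the assumed width of one numeric slot.
const slotBits = 64

// field is a numeric struct field and its width in bits (hypothetical type).
type field struct {
	name  string
	width uint
}

// placement records where a field landed; slot == -1 means "extra slot".
type placement struct {
	name   string
	slot   int
	offset uint
	mask   uint64
}

// tagWidth returns the bits needed to tell n sum-type variants apart.
// Assumption: at least 1 bit, and no value is reserved for "invalid".
func tagWidth(variants uint) uint {
	if variants <= 2 {
		return 1
	}
	return uint(bits.Len(variants - 1))
}

// pack places fields largest-first into numSlots numeric slots (first fit)
// and spills whatever does not fit to the extra slot.
func pack(fields []field, numSlots int) []placement {
	sorted := append([]field(nil), fields...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].width > sorted[j].width
	})
	used := make([]uint, numSlots) // bits already taken in each slot
	out := make([]placement, 0, len(sorted))
	for _, f := range sorted {
		p := placement{name: f.name, slot: -1}
		for s := range used {
			if used[s]+f.width <= slotBits {
				p.slot, p.offset = s, used[s]
				p.mask = uint64(1)<<f.width - 1
				used[s] += f.width
				break
			}
		}
		out = append(out, p)
	}
	return out
}

func main() {
	// Assume Expr has 3 variants and BinOp needs 2 bits.
	fields := []field{
		{"Left__Tag", tagWidth(3)},
		{"Op", 2},
		{"Right__Tag", tagWidth(3)},
	}
	for _, p := range pack(fields, 2) {
		fmt.Printf("%s: slot=%d offset=%d mask=%#x\n", p.name, p.slot, p.offset, p.mask)
	}
}
```

Because the sort is stable, fields of equal width keep their declaration order, which is why the three 2-bit fields land at offsets 0, 2 and 4 of the first slot. Fields of differing widths would be reordered, as the explanation notes.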
Superseded by #19135.
Note to reviewers: the patch appears large, but most of it is auto-generated code. Don't be intimidated!