sql/ir: teach the generator about field packing and primitive types #17758

Closed
wants to merge 3 commits into from

Conversation

knz
Contributor

@knz knz commented Aug 19, 2017

Note to reviewers: the patch appears large, but most of it is auto-generated code. Don't be intimidated!

There are four complementary parts in this patch:

  • the generator now predefines the common Go primitive types for
    every input definition file (bool, string, int64, etc.)

  • the code generator is extended to take a few options on the command
    line that control its behavior. The main() function is refactored to
    make it more readable.

  • the code generator is taught about field packing, i.e. storing
    multiple small numeric values in a variable of a larger size. See a copy
    of the explanation below.

  • the Makefile is modified to use multiple configurations to generate
    test IR environments. This is used to exercise different
    combinations of the code generation parameters, to ensure that all
    of them produce valid code.

A copy of the explanatory comment that outlines the allocation of
memory slots for IR struct types follows.

“The interesting part of code generation is fitting structs into nodes.

Each [IR] node consists of slots, i.e. spaces in memory in which to put
values. The goal of slot allocation is to decide which struct field goes
into which slot.

Overview

There are four kinds of slots: numeric, string, reference and extra.
The numeric, string and reference slots are called "dedicated".
Dedicated slots come in a finite amount!
For example, in the default configuration, there are 2 numeric slots,
1 string slot and 2 reference slots.

In general, we prefer a dedicated slot. When dedicated slots are
exhausted for a particular type (e.g. when encountering the 3rd
numeric field in a struct in the default configuration), we spill
to the extra slots. Extra slots expand on demand without limit.
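
To make the four kinds of slots concrete, here is a minimal sketch of what a node could look like in the default configuration (2 numeric, 1 string, 2 reference slots). The type and constant names are assumptions made for illustration; the actual generated layout may differ.

```go
package main

import "fmt"

// Slot counts taken from the default configuration described above.
const (
	numNumericSlots = 2
	numStringSlots  = 1
	numRefSlots     = 2
)

// node is a hypothetical layout, not the generator's actual output.
type node struct {
	nums  [numNumericSlots]uint64 // dedicated numeric slots
	strs  [numStringSlots]string  // dedicated string slots
	refs  [numRefSlots]*node      // dedicated reference slots
	extra interface{}             // spill area, set only when needed
}

func main() {
	var n node
	n.nums[0] = 42
	n.strs[0] = "hello"
	n.refs[0] = &node{}
	// The dedicated slots have a fixed size; extra stays nil until a spill.
	fmt.Println(len(n.nums), len(n.strs), len(n.refs), n.extra == nil)
}
```

The fixed-size arrays are what make dedicated slots finite, while the single `extra` field can point at a struct of any size.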

We support two modes: packed and unpacked.

Understanding unpacked mode first provides a foundation for
understanding packed mode.

How unpacked mode works

In unpacked mode, each numeric field uses one numeric slot; each
reference to a struct uses one reference slot; and each reference
to a sum uses both a numeric slot (for the tag) and a reference
slot (for the value). Every other type uses an extra
slot. When dedicated slots are exhausted, an extra slot is also
used. For example:

type BinExprValue struct {
	Left  Expr
	Op    BinOp
	Right Expr
}
type BinExpr struct { *node }

//// Packing with 3 numeric slots:
func (x BinExprValue) R(a Allocator) BinExpr {
	node := a.new()
	node.nums[0] = numvalslot(x.Left.tag)
	node.nums[1] = numvalslot(x.Op)
	node.nums[2] = numvalslot(x.Right.tag)
	node.refs[0] = x.Left.ref
	node.refs[1] = x.Right.ref
	return BinExpr{node}
}
func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.node.nums[0]), x.node.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.node.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{ExprTag(x.node.nums[2]), x.node.refs[1]} }

//// Packing with just 2 numeric slots, like in the default configuration:
type extraBinExpr struct {
	Right__Tag ExprTag
}
func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = numvalslot(x.Left.tag)
	ref.nums[1] = numvalslot(x.Op)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	extra := &extraBinExpr{Right__Tag: x.Right.tag}
	ref.extra = extra
	return BinExpr{ref}
}
func (x BinExpr) Left() Expr  { return Expr{ExprTag(x.ref.nums[0]), x.ref.refs[0]} }
func (x BinExpr) Op() BinOp   { return BinOp(x.ref.nums[1]) }
func (x BinExpr) Right() Expr { return Expr{x.ref.extra.(*extraBinExpr).Right__Tag, x.ref.refs[1]} }
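
The unpacked rules above can be sketched as a small slot-assignment routine. This is only an illustration of the rules as described, not the code in the generator; the `kind`, `field` and `assign` names are hypothetical.

```go
package main

import "fmt"

type kind int

const (
	numeric   kind = iota // uses one numeric slot
	structRef             // uses one reference slot
	sumRef                // uses a numeric slot (tag) and a reference slot (value)
	other                 // always goes to the extra slots
)

type field struct {
	name string
	kind kind
}

// assign walks the fields in declaration order and hands out dedicated
// slots until they run out, then spills to "extra".
func assign(fields []field, numSlots, refSlots int) []string {
	var out []string
	nextNum, nextRef := 0, 0
	putNum := func(name string) {
		if nextNum < numSlots {
			out = append(out, fmt.Sprintf("%s -> nums[%d]", name, nextNum))
			nextNum++
			return
		}
		out = append(out, name+" -> extra")
	}
	putRef := func(name string) {
		if nextRef < refSlots {
			out = append(out, fmt.Sprintf("%s -> refs[%d]", name, nextRef))
			nextRef++
			return
		}
		out = append(out, name+" -> extra")
	}
	for _, f := range fields {
		switch f.kind {
		case numeric:
			putNum(f.name)
		case structRef:
			putRef(f.name)
		case sumRef:
			putNum(f.name + "__Tag")
			putRef(f.name)
		default:
			out = append(out, f.name+" -> extra")
		}
	}
	return out
}

func main() {
	binExpr := []field{{"Left", sumRef}, {"Op", numeric}, {"Right", sumRef}}
	// With 2 numeric slots, Right__Tag no longer fits and is spilled.
	for _, line := range assign(binExpr, 2, 2) {
		fmt.Println(line)
	}
}
```

With 3 numeric slots the same input places `Right__Tag` in `nums[2]`, matching the first example above.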

How packed mode works

The general idea of filling dedicated slots until they are
exhausted, and then spilling to extra slots, remains. What is
different is that the algorithm now tries to fit multiple numeric
fields in the same numeric slot, to conserve memory. The algorithm
starts with the largest fields first, to reduce fragmentation. This
incidentally implies that the fields are not stored in declaration
order.
For example:

//// Observe how all 3 numeric values are now packed in a single slot!
func (x BinExprValue) R(a Allocator) BinExpr {
	ref := a.new()
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Left__Tag_ValueMask  << BinExpr_Slot_Left__Tag_BitOffset))  | (numvalslot(x.Left.tag)  << BinExpr_Slot_Left__Tag_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Op_ValueMask         << BinExpr_Slot_Op_BitOffset))         | (numvalslot(x.Op)        << BinExpr_Slot_Op_BitOffset)
	ref.nums[0] = (ref.nums[0] &^ (BinExpr_Slot_Right__Tag_ValueMask << BinExpr_Slot_Right__Tag_BitOffset)) | (numvalslot(x.Right.tag) << BinExpr_Slot_Right__Tag_BitOffset)
	ref.refs[0] = x.Left.ref
	ref.refs[1] = x.Right.ref
	return BinExpr{ref}
}

//// Note: the size in bits for sum types is computed automatically
//// depending on the number of variants.
const BinExpr_Slot_Left__Tag_BitOffset = 0
const BinExpr_Slot_Left__Tag_ValueMask = 0x3
const BinExpr_Slot_Op_BitOffset = 2
const BinExpr_Slot_Op_ValueMask = 0x3
const BinExpr_Slot_Right__Tag_BitOffset = 4
const BinExpr_Slot_Right__Tag_ValueMask = 0x3

func (x BinExpr) Left() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Left__Tag_BitOffset) & BinExpr_Slot_Left__Tag_ValueMask), x.ref.refs[0]}
}
func (x BinExpr) Op() BinOp {
	return BinOp((x.ref.nums[0] >> BinExpr_Slot_Op_BitOffset) & BinExpr_Slot_Op_ValueMask)
}
func (x BinExpr) Right() Expr {
	return Expr{ExprTag((x.ref.nums[0] >> BinExpr_Slot_Right__Tag_BitOffset) & BinExpr_Slot_Right__Tag_ValueMask), x.ref.refs[1]}
}

(see irgen/codegen/codegen.go for the rest of the code)
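
As a rough illustration of the packed allocation described above, the following sketch sorts fields by decreasing bit width and places each one in the first numeric slot that still has room. The first-fit placement, the 64-bit slot size, the tag numbering starting at 0, and all names (`field`, `placement`, `tagWidth`, `pack`) are assumptions made for this example; the real algorithm may differ in its details.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

const slotBits = 64 // assumed width of one numeric slot

type field struct {
	name  string
	width int // size in bits
}

type placement struct {
	name   string
	slot   int // -1 means spilled to the extra slots
	offset int
	mask   uint64
}

// tagWidth returns the number of bits needed to tell apart the given
// number of sum variants, assuming tags are numbered from 0.
// As a simplification it always reserves at least one bit.
func tagWidth(variants int) int {
	if variants <= 1 {
		return 1
	}
	return bits.Len(uint(variants - 1))
}

// pack places the largest fields first, each in the first slot with room.
func pack(fields []field, numSlots int) []placement {
	sorted := append([]field(nil), fields...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return sorted[i].width > sorted[j].width
	})
	used := make([]int, numSlots)
	var out []placement
	for _, f := range sorted {
		p := placement{name: f.name, slot: -1}
		for s := range used {
			if used[s]+f.width <= slotBits {
				p.slot, p.offset = s, used[s]
				p.mask = uint64(1)<<uint(f.width) - 1
				used[s] += f.width
				break
			}
		}
		out = append(out, p)
	}
	return out
}

func main() {
	// Assume Expr has 4 variants and BinOp needs 2 bits.
	fields := []field{
		{"Left__Tag", tagWidth(4)},
		{"Op", 2},
		{"Right__Tag", tagWidth(4)},
	}
	for _, p := range pack(fields, 2) {
		fmt.Printf("%s: slot %d, offset %d, mask %#x\n", p.name, p.slot, p.offset, p.mask)
	}
}
```

Under these assumptions all three fields land in slot 0 at offsets 0, 2 and 4 with mask `0x3`, which lines up with the constants shown in the example above.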

@knz knz requested review from a team August 19, 2017 00:54
@cockroach-teamcity
Member

This change is Reviewable

@knz
Contributor Author

knz commented Aug 19, 2017

There are groups of files that really need to be reviewed:

  • irgen/main.go: the top-level program. @justinj @m-schneider want to have a go?
  • Makefile: how things are built. Perhaps @benesch you have an opinion?
  • irgen/codegen/codegen.go: the packing algorithm. @jordanlewis you told me you were curious, so you could perhaps have a look? I'd be curious about what @nvanbenschoten thinks too.

I'm naming specific people to avoid inaction due to the bystander effect. Please work with me here: David E is not among us any more, and we need to get each other up to speed with this code somehow. This review should be a smooth learning curve.

@knz
Contributor Author

knz commented Aug 19, 2017

cc @petermattis you may be interested in the packing technique here.

Incidental note related to other discussions: I am also working on using these IR data structures to describe logical SQL plans. This will be applied for plan caching, and, I foresee, rule-based rewrites. There the packing techniques are very useful to ensure that we can have more plans cached in memory within the same memory budget.

@knz knz requested review from a team and removed request for a team August 19, 2017 01:07
@knz knz force-pushed the 20170819-ir-types branch 3 times, most recently from d456e95 to 137d54e Compare August 21, 2017 13:05
@petermattis
Collaborator

Do you really need both safe and unsafe modes for float values? I think you can use Float64bits and Float64frombits which internally use unsafe, but are blessed by the runtime.

In practice, we'll always be using the packed mode, right?


Review status: 0 of 40 files reviewed at latest revision, 1 unresolved discussion.


pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 56 at r1 (raw file):

// R() method, which is type safe.
func (a Allocator) new() *node {
	nodes := *a.nodes

You need to make a.nodes a pointer to a slice in order to be able to mutate the slice. I think it would be more natural, and the same number of allocations, for these methods to be defined on *Allocator; then nodes can be a regular slice (not a pointer to one):

type Allocator struct {
  nodes []node
}

func (a *Allocator) new() *node {
  if len(a.nodes) == 0 {
    a.nodes = make([]node, 256)
  }
  x := &a.nodes[0]
  a.nodes = a.nodes[1:]
  return x
}

Comments from Reviewable

@CLAassistant

CLAassistant commented Aug 21, 2017

CLA assistant check
All committers have signed the CLA.

@knz knz force-pushed the 20170819-ir-types branch from 137d54e to c626425 Compare September 4, 2017 14:21
@knz
Contributor Author

knz commented Sep 4, 2017

Regarding the comment on unsafe / floats: unfortunately what the Go stdlib has to offer is not sufficient, because it does not enable packing two float32's inside a single uint64 without using (again) the unsafe package. So I'll keep the code for now.


Review status: 0 of 40 files reviewed at latest revision, 1 unresolved discussion, some commit checks pending.


pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 56 at r1 (raw file):

Previously, petermattis (Peter Mattis) wrote…

You need to make a.nodes a pointer to a slice in order to be able to mutate the slice. I think it would be more natural, and the same number of allocations, for these methods to be defined on *Allocator; then nodes can be a regular slice (not a pointer to one):

type Allocator struct {
  nodes []node
}

func (a *Allocator) new() *node {
  if len(a.nodes) == 0 {
    a.nodes = make([]node, 256)
  }
  x := &a.nodes[0]
  a.nodes = a.nodes[1:]
  return x
}

Done.



@petermattis
Collaborator

Regarding the comment on unsafe / floats: unfortunately what the Go stdlib has to offer is not sufficient, because it does not enable packing two float32's inside a single uint64 without using (again) the unsafe package. So I'll keep the code for now.

https://play.golang.org/p/2p4rGXzKQJ

Also, even if the support in the stdlib isn't sufficient, I'm not seeing why you'd want to ever use the safe mode.
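
For reference, one way to pack two float32 values into a single uint64 without the unsafe package is to go through `math.Float32bits` and `math.Float32frombits`. This is a minimal sketch of that idea, not a copy of the playground snippet linked above; the helper names are made up for illustration.

```go
package main

import (
	"fmt"
	"math"
)

// packFloat32s stores lo in the low 32 bits and hi in the high 32 bits.
func packFloat32s(lo, hi float32) uint64 {
	return uint64(math.Float32bits(lo)) | uint64(math.Float32bits(hi))<<32
}

// unpackFloat32s reverses packFloat32s.
func unpackFloat32s(slot uint64) (lo, hi float32) {
	lo = math.Float32frombits(uint32(slot))
	hi = math.Float32frombits(uint32(slot >> 32))
	return lo, hi
}

func main() {
	slot := packFloat32s(1.5, -2.25)
	lo, hi := unpackFloat32s(slot)
	fmt.Println(lo, hi) // prints "1.5 -2.25"
}
```

The conversions only reinterpret the IEEE 754 bit pattern, so the round trip preserves the bits exactly.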


Review status: 0 of 40 files reviewed at latest revision, 2 unresolved discussions, all commit checks successful.


pkg/sql/ir/tests/defcfg/prims/base/base.ir.go, line 57 at r3 (raw file):

	a := MakeAllocator()
	return &a
}

You could get rid of both MakeAllocator and NewAllocator by making the zero-value of Allocator useful. The only purpose of MakeAllocator right now is to init nodes to 16 entries. That could be accomplished on the first call to new by an additional field indicating if it was the first call. Not sure if this is worthwhile, just pointing it out.
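
A minimal sketch of an Allocator whose zero value is usable, assuming the "out of nodes" check that refills the slice can double as initialization (so no separate first-call field is needed). The chunk size of 16 comes from the comment above; the `node` type here is a stand-in, and none of this is the generated code.

```go
package main

import "fmt"

// node is a stand-in for the generated node type.
type node struct {
	nums [2]uint64
}

// Allocator hands out nodes from a pre-allocated chunk.
// Its zero value is ready to use.
type Allocator struct {
	nodes []node
}

const chunkSize = 16

func (a *Allocator) new() *node {
	// A nil slice has length 0, so the first call allocates a chunk too.
	if len(a.nodes) == 0 {
		a.nodes = make([]node, chunkSize)
	}
	x := &a.nodes[0]
	a.nodes = a.nodes[1:]
	return x
}

func main() {
	var a Allocator // no MakeAllocator or NewAllocator needed
	n1 := a.new()
	n2 := a.new()
	n1.nums[0] = 7
	fmt.Println(n1 != n2, n2.nums[0]) // prints "true 0"
}
```

The trade-off is that the length check runs on every call to `new`.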



@knz
Contributor Author

knz commented Sep 6, 2017 via email

@knz knz force-pushed the 20170819-ir-types branch from 84f438a to 5715634 Compare September 8, 2017 22:19
@knz
Contributor Author

knz commented Sep 8, 2017

Okay, I removed the unsafe mode by using the math float/int conversions.

Regarding NewAllocator/MakeAllocator, I wasn't too happy about adding a conditional on the hot path, so I'm leaving that out for now.

@knz knz force-pushed the 20170819-ir-types branch 2 times, most recently from 9b7cf03 to 7b12dd5 Compare September 9, 2017 15:05
knz added 3 commits September 9, 2017 21:14
@knz knz force-pushed the 20170819-ir-types branch from 7b12dd5 to 298beba Compare September 9, 2017 19:27
@knz
Contributor Author

knz commented Jan 25, 2018

Superseded by #19135.

@knz knz closed this Jan 25, 2018
@knz knz deleted the 20170819-ir-types branch April 27, 2018 18:38
4 participants