Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

opt: Make additional perf improvements #28609

Merged
merged 5 commits into from
Aug 16, 2018
Merged

Conversation

andy-kimball
Copy link
Contributor

Make several more fixes:

  1. Do not qualify column names in metadata, since that
    requires expensive string formatting up-front (also cleanup
    the factoring of this code, which had gotten messy).
  2. Inline Metadata into Memo.
  3. Inline logicalPropsBuilder into the Memo.

Together, these changes improve KV perf from:

Phases/kv-read/OptBuild  18.4µs ± 1%

to:

Phases/kv-read/OptBuild  17.8µs ± 1%

@andy-kimball andy-kimball requested a review from a team as a code owner August 15, 2018 07:58
@andy-kimball andy-kimball requested a review from a team August 15, 2018 07:58
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, 7 of 7 files at r2, 2 of 2 files at r3, 45 of 45 files at r4, 83 of 83 files at r5.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/bench/out, line 0 at r5 (raw file):
I think this file was added by accident


pkg/sql/opt/exec/execbuilder/testdata/join, line 185 at r4 (raw file):

----
render               ·         ·                  (x, two, plus1)     ·
 │                   render 0  COALESCE(x, x)     ·                   ·

I thought in this case the name would be qualified since x is ambiguous...


pkg/sql/opt/memo/expr_view_format.go, line 546 at r5 (raw file):

// as "a.x + b.x" to "x". It is better to err on the side of not shortening
// than to incorrectly shorten a column name representing an expression.
func isSimpleColumnName(label string) bool {

The comment for this function needs to be updated since now we're only using it to decide whether to wrap an expression in quotation marks. Alternatively, maybe we can just delete it since column names are no longer expressions. (Unnamed columns are now ?column?).


pkg/sql/opt/memo/memo_format.go, line 244 at r4 (raw file):

}

// forEachDependency runs f for each child group of g.

update comment - f doesn't exist anymore

Copy link
Collaborator

@petermattis petermattis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

r2 looks good to me. The rest is beyond me.

Reviewed 2 of 2 files at r1, 7 of 7 files at r2, 2 of 2 files at r3, 45 of 45 files at r4.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained

Inlining logicalPropsBuilder allows us to reuse the keyBuffer when
computing statistics.

Release note: None
Inline Metadata struct into Memo struct, rather than creating
Metadat on the heap. This gives a small perf boost.

Release note: None
Move ExprView formatting-related methods into new expr_view_format.go
file.

Release note: None
The current code builds column labels that are qualified with their owner
table name (e.g. a.x). This has resulted in several problems:

  1. Sometimes we want the unqualified column name, and extracting this
     has resulted in the addition of some awkward parsing code.
  2. Constructing the qualified column names adds measurable time to
     simple queries, even though the only time we use it is for debug
     and explain display. It's better to build qualified names only when
     we need them rather than up-front.

The trouble is that switching to store unqualified column labels has a
big effect on code all over the opt packages. This commit tries to minimize
the impact on tests, to make reviews easier. It contains a hack to keep
output as close as it can to the old output. In the next commit, the hack is
removed, which triggers a major update of tests across opt.

As part of the changes required for reworking column labeling, I refactored
the ExprView and Memo formatting code so that it's more consolidated and
consistent. Formatting state is now always passed via ExprFmtCtx, which has
been expanded to include a scratch buffer as well as the Memo (rather than
just Metadata).

Release note: None
This removes the hack added in the previous commit so that the new
column formatting rules can work. We now only qualify column names if
it was explicitly requested via a formatting flag, or else if not
qualifying would result in ambiguous column names, such as in this
case:

  select a.x, b.x from a, b where a.x=b.x

Release note: None
Copy link
Contributor Author

@andy-kimball andy-kimball left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/sql/opt/exec/execbuilder/testdata/join, line 185 at r4 (raw file):

Previously, rytaft wrote…

I thought in this case the name would be qualified since x is ambiguous...

The execbuilder is not using the qualified names code. It sometimes uses ColumnLabel, if there happen to be at least one synthesized columns in the projection list. But if there are only passthrough columns, it does not even use ColumnLabel. Therefore, I thought it best just to make it consistent for now, in that it will never use qualified names. As an example of a test case that doesn't use qualified names, see this one above in the file:

EXPLAIN (VERBOSE) SELECT * FROM twocolumn AS a JOIN twocolumn AS b ON a.x = 44

I also spoke to Radu, and he's fine with changing this behavior for this edge case, since column names below the top-level are only used for debugging.


pkg/sql/opt/memo/expr_view_format.go, line 546 at r5 (raw file):

Previously, rytaft wrote…

The comment for this function needs to be updated since now we're only using it to decide whether to wrap an expression in quotation marks. Alternatively, maybe we can just delete it since column names are no longer expressions. (Unnamed columns are now ?column?).

I updated the comment, since it's still useful for column names that have spaces, etc.


pkg/sql/opt/memo/memo_format.go, line 244 at r4 (raw file):

Previously, rytaft wrote…

update comment - f doesn't exist anymore

Done.


pkg/sql/opt/bench/out, line at r5 (raw file):

Previously, rytaft wrote…

I think this file was added by accident

Done.

Copy link
Collaborator

@rytaft rytaft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 1 of 114 files at r6, 1 of 7 files at r7, 29 of 44 files at r9, 83 of 83 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)

Copy link
Member

@RaduBerinde RaduBerinde left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 2 stale)

@andy-kimball
Copy link
Contributor Author

bors r+

craig bot pushed a commit that referenced this pull request Aug 16, 2018
23885: kv: evict leaseholder on RPC error r=solongoron a=tschottdorf

This addresses a situation in which we would not evict a stale
leaseholder for a long time. Consider the replicas [s1,s2,s3] and
s1 is down but is the cached leaseholder, while s2 is the actual lease
holder. The RPC layer will try s1, get an RPC error, try s2 and succeed.
Since there is no NotLeaseHolderError involved, the cache would not get
updated, and so every request pays the overhead of trying s1 first.

WIP because needs testing.

Touches #23601.

Release note (bug fix): Improve request routing during node outages.

28609: opt: Make additional perf improvements r=andy-kimball a=andy-kimball

Make several more fixes:

1. Do not qualify column names in metadata, since that
requires expensive string formatting up-front (also cleanup
the factoring of this code, which had gotten messy).
2. Inline Metadata into Memo.
3. Inline logicalPropsBuilder into the Memo.

Together, these changes improve KV perf from:
```
Phases/kv-read/OptBuild  18.4µs ± 1%
```
to:
```
Phases/kv-read/OptBuild  17.8µs ± 1%
```

28661: storage: don't include RHS data in merge trigger r=bdarnell,tschottdorf a=benesch

Now that we require ranges to be collocated during a merge and the RHS
replicas to be up-to-date before the merge commits, we no longer need to
include a snapshot of the RHS in the merge trigger. We know that the
copy of the data that already exists in the local store is perfectly
up-to-date.

So, stop sending the data in the merge trigger.

Release note: None

28689: sqlbase: avoid using SERIAL in system tables r=knz a=knz

Needed for  #28575.

We'll soon want special behavior for SERIAL. We can't afford the
definition of system tables to be subject to a discussion about what
SERIAL means. So this patch ensures system tables don't use SERIAL.

Release note: None

Co-authored-by: Tobias Schottdorf <[email protected]>
Co-authored-by: Andrew Kimball <[email protected]>
Co-authored-by: Nikhil Benesch <[email protected]>
Co-authored-by: Raphael 'kena' Poss <[email protected]>
@craig
Copy link
Contributor

craig bot commented Aug 16, 2018

Build succeeded

@craig craig bot merged commit b5b7340 into cockroachdb:master Aug 16, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants