Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importccl: support unique_rowid() as default expression for IMPORT INTO #50922

Merged
merged 1 commit into from
Jul 28, 2020

Conversation

Anzoteh96
Copy link

@Anzoteh96 Anzoteh96 commented Jul 2, 2020

The PR #50295 supports non-targeted columns with constant expression. This PR is a follow up to that in adding support to unique_rowid().

Previously, the only support given to rowid as a default expression is for hidden column, which is a function of timestamp, row number, and source ID (the ID of processor). To accommodate for more usage of unique_rowid(), this PR modifies the unique_rowid function by making unique_rowid as a function of:

  1. timestamp;
  2. row number;
  3. source ID;
  4. the total occurrences of unique_rowid in the table schema;
  5. instances of each unique_rowid within each row.

In addition, this PR also modifies the visitor method #51390 by adding override methods for volatile methods like unique_rowid. Annotations containing the total occurrences of unique_rowid and unique_rowid instances within a row are stored inside evalCtx, which will be read and updated when visitor walks through the default expression at the sanitization stage, and when default expression is evaluated at each row.

Partially addresses #48253

Release note (general change): IMPORT INTO now supports unique_rowid() as a default expression.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@Anzoteh96 Anzoteh96 force-pushed the import-rowid branch 2 times, most recently from 114a25d to 4e7f8b4 Compare July 2, 2020 15:43
@Anzoteh96 Anzoteh96 changed the title importccl: [WIP] support unique_rowid() as default expression importccl: support unique_rowid() as default expression for IMPORT INTO Jul 2, 2020
@Anzoteh96 Anzoteh96 marked this pull request as ready for review July 2, 2020 15:47
@Anzoteh96 Anzoteh96 requested review from a team, dt, pbardea and miretskiy and removed request for a team July 2, 2020 15:47
if err != nil {
return errors.Wrapf(err, "error evaluating default expression for IMPORT INTO")
if col.DefaultExprStr() == "unique_rowid()" {
if c.hidden < 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is quite right? Not all tables into which we'll use IMPORT INTO with unique_rowid() will also be using hidding PK cols, right?

also, I think this would currently be putting the same value in the hidden pk col and in the explicit default col, but that seems wrong -- each call to unique_rowid is supposed to return a different value, even if they're in the same row. Hmm.

Copy link
Author

@Anzoteh96 Anzoteh96 Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, thanks for catching that. Turns out I omitted the case where Primary Key is specified in the column specification.

For the second part of your comment, this would be a problem when we have more than a column with default expression being unique_rowid() right? Would it make sense to add a constraint that we only support if there's <= 1 column with unique_rowid?

Edit: just realized that it might be possible to get around by changing (uint64(sourceID) << rowIDBits) ^ uint64(rowIndex) to (uint64(sourceID) << rowIDBits) ^ uint64(c * rowIndex + r) with c being the number of columns with unique_rowid() (possibly including the hidden column) and r=0, 1, ..., c-1 being the actual column positions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, I thought I left a comment here... but I guess GitHub swallowed it?

Changing the method to (uint64(sourceID) << rowIDBits) ^ uint64(c * rowIndex + r) would work, but since the rows in a CSV import also add the timestamp to their rowID, it would also increase the rate at which we consume time. Right now, every row consumes 10 microseconds of time - with the modified expression, it would consume c*10 microseconds. Extrapolating from the comment in the code:

100m row file would use 10µs/row or ~17min worth of IDs

Then importing a (10 column x 100m) row file would use nearly 3 hours of IDs. That seems quite expensive... Perhaps we can keep track of only the columns which have a default value of unique_rowid. This would reduce the rate at which we consume keys to u*10 microseconds/row, where u is the number of columns with a uniquie_rowid default value.

But perhaps limiting 1 default unique_rowid per import is reasonable.

Copy link
Author

@Anzoteh96 Anzoteh96 Jul 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this concern makes sense. Maybe roll back to supporting only 1 default column with unique_rowid until we have a better scheme for avoiding collision?

Edit: the updated implementation has the value c being the number of columns with unique_rowid(), including the hidden column but not visible and targeted columns (these columns should use the values from the CSVs instead). As @dt mentioned (below) this should work for wide columns that has (at most) two unique_rowid() columns.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd have a slight preference for counting how many we need as previously suggested, instead of total columns, since a very wide table is pretty common but a wide table of uniquely-generated-otherwise-meaningless integers is not. In practice, I'd guess it is one most often (pk/hidden pk) but wouldn't be surprised by two: implicit pk and then explicit unique rowid (where they probably should have used that as an explicit PK instead but didn't)

@Anzoteh96 Anzoteh96 changed the title importccl: support unique_rowid() as default expression for IMPORT INTO importccl: [WIP again] support unique_rowid() as default expression for IMPORT INTO Jul 4, 2020
@Anzoteh96 Anzoteh96 marked this pull request as draft July 4, 2020 01:05
@Anzoteh96 Anzoteh96 changed the title importccl: [WIP again] support unique_rowid() as default expression for IMPORT INTO importccl: [WIP while addressing reviews] support unique_rowid() as default expression for IMPORT INTO Jul 4, 2020
@Anzoteh96 Anzoteh96 changed the title importccl: [WIP while addressing reviews] support unique_rowid() as default expression for IMPORT INTO importccl: support unique_rowid() as default expression for IMPORT INTO Jul 6, 2020
@Anzoteh96 Anzoteh96 marked this pull request as ready for review July 6, 2020 14:17
@Anzoteh96 Anzoteh96 force-pushed the import-rowid branch 2 times, most recently from 7447d41 to e215947 Compare July 6, 2020 17:14
@Anzoteh96
Copy link
Author

Added TODO to address the potential problem with IDs with multiple unique_rowid default columns. RFAR!

@maddyblue
Copy link
Contributor

@Anzoteh96
Copy link
Author

Anzoteh96 commented Jul 6, 2020

Can you remove the SQL comments from https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/importccl/testdata/pgdump/geo.sql too?

(Add on to the problem below: it seems like the target columns are not properly populated during the makeInputCoverter stage).

Just found a problem: right now the implementation of import for pgdump seemed to assume that the statements in INSERT INTO has its column in the exact order as the order when they are created. This means non-targeted column, even with default values, wouldn't work as it should be. (Unfortunately, this problem escaped my notice when I implemented IMPORT DEFAULT for constant expressions, because all test cases are in CSV and changing order of columns will still return the results correctly for CSV).

E.g. the test file has
CREATE TABLE "nyc_census_blocks" (
"blkid" varchar(15),
"popn_total" float8,
"popn_white" float8,
"popn_black" float8,
"popn_nativ" float8,
"popn_asian" float8,
"popn_other" float8,
"boroname" varchar(32)
);
SELECT AddGeometryColumn('','nyc_census_blocks','geom','26918','MULTIPOLYGON',2);
INSERT INTO "nyc_census_blocks" ("blkid","popn_total","popn_white","popn_black","popn_nativ","popn_asian","popn_other","boroname",geom) VALUES blah blah blah

Consider, now, changing the column declaration to (basically change the last two)
CREATE TABLE "nyc_census_blocks" (
"blkid" varchar(15),
"popn_total" float8,
"popn_white" float8,
"popn_black" float8,
"popn_nativ" float8,
"popn_asian" float8,
"boroname" varchar(32),
"popn_other" float8
);
with others the same: INSERT INTO still has the order
("blkid","popn_total","popn_white","popn_black","popn_nativ","popn_asian","popn_other","boroname",geom)
This will give the error could not parse "Staten Island" as type float, as it's supposed to be in the boroname column but the implementation thinks it's for the popn_other column.

@Anzoteh96
Copy link
Author

Can you remove the SQL comments from https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/importccl/testdata/pgdump/geo.sql too?

@mjibson I'm on my way in #51065 to address the problem of pgdump columns not being populated correctly so that the comments in geo.sql can be safely removed. Would you prefer to have this PR done without the pgdump fix (in which case the geospatial team will have to test the new change in CSV, or wait for the pgdump fix to be done before coming back to this PR again?

@maddyblue
Copy link
Contributor

Do whatever makes sense to you. If separate PRs is easier, do that.

@Anzoteh96
Copy link
Author

Can you remove the SQL comments from https://github.com/cockroachdb/cockroach/blob/master/pkg/ccl/importccl/testdata/pgdump/geo.sql too?

Just did that after closing #51065 (along with modification on SHOW CREATE check and checking that the rowids are valid). RFAR!

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96, @miretskiy, and @pbardea)


pkg/ccl/importccl/import_stmt_test.go, line 68 at r2 (raw file):

// This checks that the selected columns of a query string have
// all unique elements. It's useful for checking unique_rowid.
func checkUnique(allStr [][]string, inds []int) bool {

Can we rename the first argument to something like rows or data and the second to colIndexes. It's not immediately clear to me what these are supposed to be without looking at the callsite.


pkg/ccl/importccl/import_stmt_test.go, line 85 at r3 (raw file):

// This checks that the selected columns of a query string have
// no "NULL" elements. It's useful for checking unique_rowid.
func checkNoNull(allStr [][]string, inds []int) bool {

nit: I think containsNoNull may be a better name for this since it doesn't do any actual checking.


pkg/sql/row/row_converter.go, line 201 at r3 (raw file):

	// The rest of these are derived from tableDesc, just cached here.
	rowIDs                []int

This name seems a bit too generic for what it really is. Perhaps defaultUniqueCols would work?


pkg/sql/row/row_converter.go, line 346 at r3 (raw file):

// if necessary.
func (c *DatumRowConverter) Row(ctx context.Context, sourceID int32, rowIndex int64) error {
	for i, pos := range c.rowIDs {

I'd prefer pos to be renamed to something like colIdx.


pkg/sql/row/row_converter.go, line 347 at r3 (raw file):

func (c *DatumRowConverter) Row(ctx context.Context, sourceID int32, rowIndex int64) error {
	for i, pos := range c.rowIDs {
		// We don't want to call unique_rowid() for columns with such default expressions because it

nit: I think that this comment needs wrapping again after updating. The lower bits here aren't strictly the row IDs since they now also depend on the number of non-targeted rows with a unique_rowid default val.


pkg/sql/row/row_converter.go, line 352 at r3 (raw file):

		// low-bits. Instead, make our own IDs that attempt to keep each generator
		// (sourceID) writing to its own key-space with sequential rowIndexes
		// mapping to sequential unique IDs, by putting the rowID in the lower

This comment probably also needs updates, since the lower bits are no longer strictly just the


pkg/sql/row/row_converter.go, line 378 at r3 (raw file):

		// too close to each other. It will therefore be nice to fix this problem.
		avoidCollisionsWithSQLsIDs := uint64(1 << 63)
		shiftedRowIndex := int64(len(c.rowIDs))*rowIndex + int64(i)

A comment explaining this change would be useful.


pkg/sql/row/row_converter.go, line 390 at r3 (raw file):

		col := &c.cols[i]
		if !isTargetCol(i) && !col.Hidden && col.DefaultExpr != nil {
			if !(*col.DefaultExpr == "unique_rowid()" || tree.IsConst(c.EvalCtx, c.defaultExprs[i])) {

Could be bring this out to a helpers that determines if the expr is supported by IMPORT?

Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96 and @miretskiy)


pkg/sql/row/row_converter.go, line 113 at r8 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

One more tiny readability request:

Let's add

useDefaultBuiltin *customFunc

Use this constant instead of nil here, and when walking below...

I meant:

var useDefaultBuiltin *customFunc

@miretskiy miretskiy self-requested a review July 27, 2020 17:08
Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96, @miretskiy, and @pbardea)


pkg/ccl/importccl/import_stmt_test.go, line 3268 at r7 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Can we add this as a comment in the test?

Done.


pkg/ccl/importccl/import_stmt_test.go, line 4439 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

what's a gid?

It's a column name of serial type (which basically means unique_rowid() in the pgdump file geo.sql. But anyways, this comment has been removed since we no longer check the validity of gid (as unique_rowid) here (only the comparison of the other corresponding columns, like what the test initially had before this change), because this check has been handled in the TestImportDefault test.


pkg/ccl/importccl/import_stmt_test.go, line 4433 at r8 (raw file):

Previously, pbardea (Paul Bardea) wrote…

nit: "due to that rowid shows up"/"due to rowid showing up"

Done.


pkg/ccl/importccl/read_import_base.go, line 411 at r9 (raw file):

) (*row.DatumRowConverter, error) {
	conv, err := row.NewDatumRowConverter(
		ctx, importCtx.tableDesc, importCtx.targetCols, importCtx.evalCtx, importCtx.kvCh)

Note: copy removed from here.


pkg/ccl/importccl/read_import_pgdump.go, line 511 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Yeah -- sorry -- that's exaclty what I meant -- we should make a copy in new datum converter; and cleanup other callers (like read_import_base)

Done. This means:

  1. Creating a copy in the new datum converter;
  2. Remove the extra copy from read_import_base.go.

pkg/sql/row/row_converter.go, line 42 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

nit: let's not abbreviate? cellInfoAnnotation is just a bit longer.

Done.


pkg/sql/row/row_converter.go, line 45 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

nit: uniqueRowIDInstance?

Done.


pkg/sql/row/row_converter.go, line 85 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Do we need to dereference here? Looks like Annotations has a pointer receive;
I think you should be able to just say evalCtx.Annotations.Get()...?

nit: Also, consider adding a small helper for readability (getCellInfoAnnotation):

Done (and yes we don't have to deference, thanks for catching).


pkg/sql/row/row_converter.go, line 124 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…
			c := annot.Get(cellInfoAddr).(cellInfoAnnot)
			c.uniqueRowIDTotal++
			annot.Set(cellInfoAddr, c)

It would be better if you added a helper function (getCellInfoAnnotation) as I suggest above,
and, instead of storing a struct in the annotation, store the struct pointer (and have the helper return the pointer instead.

So, this code could become:

  getCellInfoAnnotation(annot).uniqueRowIDTotal++

Done.


pkg/sql/row/row_converter.go, line 185 at r7 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Can we also move this comment inside the if statement?

Done.


pkg/sql/row/row_converter.go, line 194 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I would probably make side effect optional (if sideEffect != nil )....?

Also, do you think it would make sense to rename it to
"visitorSideEffect" to make it explicit that this is accessed during visit only?

Done. (For both)


pkg/sql/row/row_converter.go, line 216 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Lets add

type overrideVolatility bool
const overrideImmutable overrideVolatility = false
const overrideVolatile overrideVolatility = true

And use those constants instead of true/false in this method.

Looks like if we used the three lines and do v.volatile = overrideVolatile the compiler will give an error "cannot use overrideVolatile (type overrideVolatility) as bool, so I've changed that to

const overrideImmutable = false
const overrideVolatile = true

pkg/sql/row/row_converter.go, line 587 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

note: you won't need to Set again if getCellInfoAnnotation() helper returns *cellInfoAnnot.

Also, let's add a method to the struct:

func (c *cellInfoAnnot) Reset(source, row) {...}

getCellInfoAnnotation(c.EvalCtx).Reset(sourceID, rowIndex)

Done.


pkg/sql/row/row_converter.go, line 113 at r8 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I meant:

var useDefaultBuiltin *customFunc

Done.


pkg/sql/row/row_converter.go, line 163 at r8 (raw file):

Previously, pbardea (Paul Bardea) wrote…

nit: let's also rename this annotations

Done.


pkg/sql/row/row_converter.go, line 165 at r8 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Can we also add a flag here explaining when we expect this flag to be set?

Done.


pkg/sql/row/row_converter.go, line 469 at r9 (raw file):

		tableDesc: immutDesc,
		KvCh:      kvCh,
		EvalCtx:   evalCtx.Copy(),

Note: copy added here.

Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96, @miretskiy, and @pbardea)


pkg/sql/row/row_converter.go, line 531 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…
		if col.Hidden {
			if col.DefaultExpr == nil || *col.DefaultExpr != "unique_rowid()" || hidden != -1 {
				return nil, errors.New("unexpected hidden column")
			}
			hidden = i
		}

Do we still need this branch, now that we know how to evaluation unique row id?

This branch is here even before the support for default expressions started. I think the purpose is to check that an import schema has at most one hidden column, and that the hidden column (if any) must have unique_rowid() as default expression. I'm not sure what was the original intention of this part of the code -- let me see if I can do git blame for backtracking.

Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 531 at r7 (raw file):

Previously, Anzoteh96 (Anzo Teh) wrote…

This branch is here even before the support for default expressions started. I think the purpose is to check that an import schema has at most one hidden column, and that the hidden column (if any) must have unique_rowid() as default expression. I'm not sure what was the original intention of this part of the code -- let me see if I can do git blame for backtracking.

So I noticed that the earliest time when this was implemented (later versions were mainly refactoring) was at #22542 when hidden column with primary key was first added for support, so I guess it made sense to only restrict to one hidden column for defensive checking. My take is that this isn't really necessary now: @dt @mjibson what are your thoughts?

TL;DR previously we had the condition check where each table can have at most one hidden column, and that this hidden column must have unique_rowid as default expression. Is this check still necessary, now that we have unique_rowid supported?

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 216 at r7 (raw file):

Previously, Anzoteh96 (Anzo Teh) wrote…

Looks like if we used the three lines and do v.volatile = overrideVolatile the compiler will give an error "cannot use overrideVolatile (type overrideVolatility) as bool, so I've changed that to

const overrideImmutable = false
const overrideVolatile = true

It sounds like you need to update the type of v.volatile to be overrideVolatility instead of bool. We'd also probably need to change the !volatile check to volatility == overrideImmutable.


pkg/sql/row/row_converter.go, line 603 at r10 (raw file):

		return ok
	}
	getCellInfoAnnotation(c.EvalCtx.Annotations).Reset(sourceID, rowIndex)

I quite like where this ended up!

@miretskiy miretskiy requested review from dt, maddyblue and pbardea July 27, 2020 19:50
Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Few super minor nits, but it's :lgtm: from me. Perhaps wait for @pbardea and @dt to comment as well.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 216 at r7 (raw file):

Previously, pbardea (Paul Bardea) wrote…

It sounds like you need to update the type of v.volatile to be overrideVolatility instead of bool. We'd also probably need to change the !volatile check to volatility == overrideImmutable.

Yes, that would be nice. "if volatility == overrideImmutable" is more readable than if !volatile


pkg/sql/row/row_converter.go, line 531 at r7 (raw file):

Previously, Anzoteh96 (Anzo Teh) wrote…

So I noticed that the earliest time when this was implemented (later versions were mainly refactoring) was at #22542 when hidden column with primary key was first added for support, so I guess it made sense to only restrict to one hidden column for defensive checking. My take is that this isn't really necessary now: @dt @mjibson what are your thoughts?

TL;DR previously we had the condition check where each table can have at most one hidden column, and that this hidden column must have unique_rowid as default expression. Is this check still necessary, now that we have unique_rowid supported?

I'm pretty sure we can remove this branch; but maybe @dt has a different take.


pkg/sql/row/row_converter.go, line 209 at r10 (raw file):

				// relevant counter (e.g. the total number of occurrences of the
				// unique_rowid function in an expression).
				v.volatile = overrideVolatile

Perhaps rename volatile field to volatility?


pkg/sql/row/row_converter.go, line 566 at r10 (raw file):

				c.defaultCache[i] = typedExpr
				// This default expression isn't volatile, so we can evaluate once here
				// and memoize it.

nit: move the comment inside if statement (and wrap it if needed)

@miretskiy miretskiy self-requested a review July 27, 2020 19:50
Copy link
Contributor

@miretskiy miretskiy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)

@miretskiy miretskiy self-requested a review July 27, 2020 19:51
Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM mod small remaining threads

Reviewed 1 of 1 files at r10.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)

Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 216 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Yes, that would be nice. "if volatility == overrideImmutable" is more readable than if !volatile

Done. (Also, moved those to the list of constants declared in the file).


pkg/sql/row/row_converter.go, line 209 at r10 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

Perhaps rename volatile field to volatility?

Done.


pkg/sql/row/row_converter.go, line 566 at r10 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

nit: move the comment inside if statement (and wrap it if needed)

Done.


pkg/sql/row/row_converter.go, line 48 at r12 (raw file):

	overrideErrorTerm overrideVolatility = false
	overrideImmutable overrideVolatility = false
	overrideVolatile  overrideVolatility = true

Note: the overrideVolatility constants moved here (see my next comments below for the use of overrideErrorTerm).


pkg/sql/row/row_converter.go, line 256 at r12 (raw file):

		return nil, overrideErrorTerm,
			unsafeExpressionError(err, "type checking error", expr.String())
	}

This SanitizeExprsForImport function has overrideVolatility as one of its returns, so something needs to be returned too even though this SanitizeExprsForImport somehow produces an error. For this reason, the term overrideErrorTerm is used, though it functions like a no-op (just like how the nil typedExpr functions).

Copy link
Contributor

@pbardea pbardea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm_strong:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 238 at r13 (raw file):

// SanitizeExprsForImport checks whether default expressions are supported
// for import.
func SanitizeExprsForImport(

Last nit: I don't think this needs to be exported - this should also make CI happy.

…INTO

The PR cockroachdb#50295 supports non-targeted columns with constant expression.
This PR is a follow up to that in adding support to `unique_rowid()`.

Previously, the only support given to rowid as a default expression is for
hidden column, which is a function of timestamp, row number, and source
ID (the ID of processor). To accommodate for more usage of unique_rowid(),
this PR modifies the unique_rowid function by making unique_rowid as a
function of timestamp, row number, source ID, the total occurrences of
unique_rowid in the table schema, and instances of each unique_rowid within
each row.

In addition, this PR also modifies the visitor method cockroachdb#51390 by adding
override methods for volatile methods like unique_rowid. Annotations
containing the total occurrences of unique_rowid and unique_rowid instances
within a row are stored inside evalCtx, which will be read and updated
when visitor walks through the default expression at the sanitization
stage, and when default expression is evaluated at each row.

Partially addresses cockroachdb#48253

Release note (general change): IMPORT INTO now supports `unique_rowid()`
as a default expression.
Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)


pkg/sql/row/row_converter.go, line 531 at r7 (raw file):

Previously, miretskiy (Yevgeniy Miretskiy) wrote…

I'm pretty sure we can remove this branch; but maybe @dt has a different take.

I thought about it, and removed this branch and it passed the tests when I tested locally. Let's see what will happen when CI runs it.


pkg/sql/row/row_converter.go, line 238 at r13 (raw file):

Previously, pbardea (Paul Bardea) wrote…

Last nit: I don't think this needs to be exported - this should also make CI happy.

Done.

Copy link
Author

@Anzoteh96 Anzoteh96 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also revised the PR description to talk about overrides.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @Anzoteh96, @dt, @miretskiy, @mjibson, and @pbardea)

@Anzoteh96
Copy link
Author

Decided that the check for only 1 hidden column in row_converter.go can be omitted (and thus removed from this PR) as it passes CI tests (and because multiple unique_rowid columns are now allowed), so I'm closing this now. TFTR!

bors r+

@craig
Copy link
Contributor

craig bot commented Jul 28, 2020

Build succeeded:

@craig craig bot merged commit 4de665c into cockroachdb:master Jul 28, 2020
@Anzoteh96 Anzoteh96 deleted the import-rowid branch July 28, 2020 21:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants