-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rfc: expand partitioning concepts #58440
Conversation
413ee90
to
aa480db
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note at time of writing I've skimmed this -- I'm on board with this proposal's benefits but I think it encounters additional work that will sacrifice efforts in other parts of multiregion in v21.1. In particular, I'm trying to think of the actual code that needs changing, and the amount of fallout that will come out as a result.
I'm not familiar with everything so I'll label out what I'm not sure about / this is the work I think is required:
- Requires changing of UPDATE / INSERT code to be table-level partitioning aware when encoding PKs. Probably some execution stuff with regards to reading PKs/keys back out. Some work with ALTER PRIMARY KEY as well.
- Optimiser changes which I'm not familiar with, but judging by UNIQUE WITHOUT INDEX work that may be trivial since it's already done, we're just rebranding it with this RFC. Also not quite sure how teaching the execution side about ignoring these implicit region column would work.
- Unknown unknowns, which is likely to fall out. I'm also not confident of the existing safeguards we have on validating these new conditions (i.e. randomised tests / environments) which makes me a little more nervous. Comparatively, the adding-column-in-front-of-each-index technique we have pencilled in today is more predictable.
I've tried to justify these changes by also seeing if it reduces work elsewhere. On the surface, it also does not reduce the amount of work required to change from/to a REGIONAL BY ROW table, which is another big unknown we have right now. We would still have to rewrite individual rows.
I'm a little pessimistic given the remaining time til stability of getting this proposal and all the bells and whistles of REGIONAL BY ROW done. So it feels we'll need to make sacrifices to the multiregion effort to get there. Or we raise an army of developers in the next couple of weeks, though it may be too late anyway.
I'm also curious if we can go through our current approach at the moment and invisibly "migrate" to this approach in v21.2. Though again it's a lot of design to suss out in the next couple of weeks to make sure.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 632 at r1 (raw file):
- Should we retro-fit the PK-partitioning syntax to mean table-level partitioning? - What's the name of the column is created auto-magically by `CREATE TABLE..
with this i'd be in on board with region
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still on board with this, though I mirror Oliver's concern about it being light on implementation specifics (by no fault of its own).
Who else do you think should look at this? @knz has already reviewed the earlier doc and it sounds like he's 👍 on the ideas.
Reviewed 1 of 1 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 23 at r1 (raw file):
This all started when trying to specify the PK of row-level regional tables. The fact that these tables are partitioned (in particular, their primary index is partition) couple with the fact that we want to optionally hide the partitioning
"partitioned) coupled"
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 64 at r1 (raw file):
order). As spelled out later, this has a role in refusing to create non-partitioned indexes in some tables, and also in deciding what index gets created automatically for unique constraints on such tables.
It also eases the backward incompatibility concern by failing loudly instead of silently when a user mixes table-level partitioning and secondary-index level partitioning.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 194 at r1 (raw file):
want tight control over your rows’ homing. There’s another implication of a PK, that similarly needs to be clarified. The
This is another interesting case where which exact columns are in the PK makes an important difference. I'm glad you found it.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 231 at r1 (raw file):
versus “hidden from `SELECT *`” concepts, and it would also cause some references to fail. What to do? Should we strive to pretend that the PK is the one the user indicated, or should we expose `crdb_region` more?
This seems like the wrong question to conclude on. Wasn't the entire point of the last section to drive home the idea that adding crdb_region to the PK changes DML semantics and so it, therefore, should not be part of the PK? So why phrase it as "pretend that the PK is the one that the user indicated"? Doing so isn't pretending.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 309 at r1 (raw file):
We’d introduce three improvements here: 1. Don’t require partitioning columns to be explicitly included in an index.
One question that implicit partitioning columns brings up is how to deal with columns that already exist in the index, but not in the prefix. For instance, INDEX idx (x, region) PARTITION BY LIST (region)
. Currently, we don't allow an index to contain the same column twice. We have a few options here:
- we can reject this and say that the partitioning columns don't need to exist in the index, but that if they do, they must be in the prefix
- we can move the
region
column up to the prefix position. So the physical index would look like(region, x)
- we could lift the restriction on an index containing the same column multiple times. PG doesn't have this restriction. So the physical index would look like
(region [hidden], x, region)
My preference would be option 3, but that would break code that assumes that indexes contain a column only once. I don't know whether such code exists, but I suspect that we have at least one case where we generate an inverted index over each of an index's ordinal positions.
EDIT: you discuss this below. It would be worth mentioning that we could lift this restriction now or in the future.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 416 at r1 (raw file):
We could maintain the ability to partition only the PK by allowing the `PARTITION BY` clause on the `PRIMARY KEY` specification.
Do we already allow this?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 419 at r1 (raw file):
mistake; it's entirely confusing that indexes are not automatically partitioned by the PK-partitioning clause which currently appears as a table-level construct in the syntax.
We've seen this cause a number of support issues as a result.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 631 at r1 (raw file):
- Should we retro-fit the PK-partitioning syntax to mean table-level partitioning?
I'm leaning towards yes, but we'd need to think through the migration. How does this change impact existing DDL? How does this change impact existing databases with partitioned tables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 406 at r1 (raw file):
PARTITION ALL BY LIST(region) …The keyword
ALL
would make the partitioning apply to all indexes (present and future).
does that mean users cannot specify an index to be partitioned differently? maybe PARTITION DEFAULT BY LIST?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 406 at r1 (raw file): Previously, otan (Oliver Tan) wrote…
(e.g. CREATE TABLE tbl (a int, b int) PRIMARY KEY a PARTITION DEFAULT BY LIST(region), but maybe someone also wants |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like this proposal a lot. Thank you very much for putting it together and your input-seeking effort in the previous version of this doc. 💯
Oliver:
I've tried to justify these changes by also seeing if it reduces work elsewhere. On the surface, it also does not reduce the amount of work required to change from/to a REGIONAL BY ROW table
It does seriously reduce the amount of customer churn and support headaches. This should not be ignored just because they're not on the engineering plate.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 35 at r1 (raw file):
to introduce that concept as a stand-alone feature. We also discuss fixing a dubious choice in our partitioning syntax - that of using syntax that appears to be table-level for partitioning a single index (the PK).
I think you can also mention here that table-level partitioning is absolutely needed once we start targeting data sovereignty -- we need a framework where there's no way to mistakenly forget to partition an index when partitioning is required on the entire table.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 240 at r1 (raw file):
> A primary key constraint indicates that a column, or group of columns, can be used as a unique identifier for rows in the table.
nit: to format a quote properly, every line needs to be prefixed by >
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 308 at r1 (raw file):
We’d introduce three improvements here:
nit: please review how this list looks in the rendered HTML -- I think the spacing and indentation through the list make the end-result hard to read.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 561 at r1 (raw file):
not identical, to the computed crdb_region column described in [Extension syntax for controlling the partitioning](https://docs.google.com/document/d/1j4XxQtLJIYIsxmj6m3DkMlUsC1XyVHXN_d32YWSKydM/edit#bookmark=kix.c2otft71wvgg): - When the partitioning column is computed, the user cannot update it in order to
nit: maybe add a space below each -
otherwise the bullet list is not easy to recognize
One way to clarify the implementation cost (and determine whether Oliver concerns are real or imagined) is to do some additional work and spell out what needs to be changed. For this you may need to gather two or three TLs together and hash it out (e.g. in a separate shared doc, as a bullet list of items). Maybe someone from SQL optimizer (Radu?), experience (Oliver?), execution (Jordan?) |
Not disagreeing here, this is all in the context of getting this along with multiregion in 21.1 with <7 weeks left until stability. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requires changing of UPDATE / INSERT code to be table-level partitioning aware when encoding PKs. Some work with ALTER PRIMARY KEY as well
I'm hoping not much (or anything) about update/insert needs to change. I guess the way I'd try to implement this is to still have all the columns in the index descriptors, but separately have a per-index and per-table list of partitioning columns used for FK references and for unique constraints.
I've tried to justify these changes by also seeing if it reduces work elsewhere.
The key question here is what a coherent alternative is. How else do we get row-level regional tables working? What is their primary key? How do we get FK references to the primary key working? How can one ask for uniqueness of (region,id)
vs the stricter uniqueness of (id)
? How do we get a round-tripable output from SHOW CREATE TABLE
? To a lesser degree, what exactly triggers the optimizer's locality-optimized search?
So let me turn the question back to you - what here is not strictly necessary / what else can we do? Cause a couple of us did try a bunch of things until getting here, and... this is where we ended up.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, and @rytaft)
We could continue to do the hacky thing of inserting the column name in front of the column definition, but when declaring a FK references a global table, we know the "region" column is implicit during the schema declaration, so we pass that check. When validating the FK on row execution, the optimiser should then be aware of this and do the same global/local checks it does today. (Isn't this what UNIQUE WITHOUT INDEX is for?) For the upsert case, we similarly know from the table descriptor it is regional by row during the conflict check, and again I was assuming the optimiser would then be able to look through each region to find the row. (Same UNIQUE WITHOUT INDEX item) It's not great, but it's "less work" by confining the crappy bits to the optimiser. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could continue to do the hacky thing
What do you mean by "continue to"? Anything we do here is brand new, isn't it?
we know the "region" column is implicit during the schema declaration
So the PK of the table is just (id)
? Would show create
present the crdb_region
at all? If yes, does it roundtrip and would the column also be presented it as being part of the PK? Can one express that the pair (crdb_region, id)
is unique (and so there's no unique constraint on (id) and so fan-outs are not necessary when validating uniqueness for an insertion)?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, and @rytaft)
Sorry, I mean the solution we had pre-this RFC. Assuming we use the solution we had in place on the OG spec:
We can allow it to roundtrip in either case , that doesn't seem difficult.
I thought that was what
Would be rewritten as
Which accomplishes the same thing. If we had
This would be referenced using the UNIQUE WITHOUT INDEX rather than the PK. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can allow it to roundtrip in either case , that doesn't seem difficult.
Can you be more precise, please? What are the cases in "either case"? We could do many things, but not all of them equally coherent, so the devil is in the details.
Can one express that the pair (crdb_region, id) is unique (and so there's no unique constraint on (id) and so fan-outs are not necessary when validating uniqueness for an insertion)?
I thought that was what UNIQUE WITH INDEX (id) was for, e.g.
You seem to be telling me how we'd make (id)
be unique, but I'm asking you how one would specify that (id)
by itself is not unique, and only the (crdb_region, id)
is. That's the case when there's no UNIQUE WITHOUT INDEX
constraint needed.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, and @rytaft)
Something that would help with the discussion is the section in the rfc that explains "what we lose if we do not take action". |
So assuming like We have
For this, it would seems like we need to the same lot of hacking with hiding column names in this case in SHOW CREATE TABLE (and I'm sure other consequences). FK and unique related code needs updated (optimiser/execution effects). Other consequences too? Unless you mean storing them the other way around:
in which case we need to change update/insert code (and I'm sure other places too?).
Whether or not we show the crdb_region column, I would think they both should be "round-trip"able in the sense using REGIONAL BY ROW in the CREATE TABLE syntax fills in any gaps that may be missed and/or ignore any table statements that may have be added.
I missed the subtlety here. We could have I do agree that your solution makes this a lot cleaner for this case, I'm playing devil's advocate here with the existing solution which arguably we have a larger chance of getting out the door in 21.1. how is this easier than gdocs? hard to follow comment threads |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So assuming like CREATE INDEX idx ON tbl(id):
message IndexDescriptor {
repeated string column_names = 4; // crdb_region, id
repeated string partitioned_column_names = N; // id
}
Note that an IndexDescriptor
already knows how many of its prefix columns are partitioning columns - see the Partitioning
field. We'd only need to add information about what columns from the partitioning set were explicitly added to the index, and which ones weren't.
For this, it would seems like we need to the same lot of hacking with hiding column names in this case in SHOW CREATE TABLE (and I'm sure other consequences). FK and unique related code needs updated (optimiser/execution effects). Other consequences too?
What you call "a lot of hacking" I call a generally useful feature. Table-level partitioning is useful independent of row-level regional tables. As the text tries to explain, this is the feature that a bunch of other databases have, and we lack. For example, partitioned uniqueness checks is a generally useful feature that would fall naturally from these partitioned tables. "A lot of hacking" would probably be an accurate description of what we'd be doing if we'd do something custom for row-level regional tables.
Whether or not we show the crdb_region column, I would think they both should be "round-trip"able in the sense using REGIONAL BY ROW in the CREATE TABLE syntax fills in any gaps that may be missed and/or ignore any table statements that may have be added.
Yeah, I agree with this. But personally I need to see it spelled out about what parts are implied and how they can look when they're spelled out explicitly. Cause you might find that when think about what you should allow to be spelled out, you end up with a lot of the things in this RFC - a customizable column name, the ability to customize the default expression for the partitioning column, the ability to make the partitioning column computed, the ability to define the PK as either (id)
or (crdb_region,id)
(with different semantics).
I'm asking you how one would specify that (id) by itself is not unique, and only the (crdb_region, id) is. That's the case when there's no UNIQUE WITHOUT INDEX constraint needed.
I missed the subtlety here. We could have UNIQUE INDEX(crdb_region,id) WITH (ignore_rbr=true) (or some other optional specifier in this field). Or maybe we don't support it, and your workaround is to create a computed column on crdb_region.
Well but isn't it a thing worth supporting in a natural way? I think we could either try to hide the crdb_region
column completely (which would be a bad idea IMO) or, if we don't hide it, I want to be free to use it like any other column - include it or not include it in the PK as I see fit. Trying to walk the line between not hiding it completely, but also having it be somewhat magic, is where I've discovered to be dragons after... trying it repeatedly.
I do agree that your solution makes this a lot cleaner for this case, I'm playing devil's advocate here with the existing solution which arguably we have a larger chance of getting out the door in 21.1.
I think we disagree on the degree to which there is an "existing solution". You can get some code to compile, but that doesn't mean you've got a "solution". For example the "OG spec" is silent at best and contradictory at worst on the topics discussed here. To go full pompous, when we're playing with the SQL language, every decision we make can either box us in or, to the contrary, be then leveraged for untold benefits in the future. An inspiration for me was a feature Rebecca implemented in the optimizer for the shareded index feature: computing the values of computed columns at query optimization time, and using the computed columns in index selection. That's now useful for the locality-optimized search in multi-region. And it's useful because it was built in a general way.
So, as far as 21.1 goes, I'd rather not cut too many corners in the name of shipping at all costs. I personally don't care that much what part of row-level regional tables makes it in. A project at this magnitude deserves the proper amount of thinking. At the limit, I think we can ship just table-level regional and global tables, and it'd be fine.
Now, I also don't care that much what we implement and ship, as long as it doesn't prevent us from doing something that feels more right later - in this particular case, that is table-level partitioning. So, as long as, say, it would seem that we can introduce table-level partitioning in the next release and transparently migrate existing row-level regional tables to that in a transparent way, I'd be happy. I'm not sure how to do that other than build the whole thing, but maybe you do.
Something that would help with the discussion is the section in the rfc that explains "what we lose if we do not take action".
Well but you see, this also assumes that "not taking action" is an option. We need to do something for row-level regional tables. I could invent some strawmen and argue against them but... it's not an easy exercise.
As it pertains to row-level regional tables, the section that reads "This proposal provides a lot of expressivity" lists some of the flexibility that other schemes would perhaps not afford. When it comes to table-level partitioning more broadly, there's other text that says why it'd be a good thing.
So I guess I'd say that "I hear you", but I'm unfortunately not sure what to do concretely.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, and @rytaft)
Gotcha! So we're still inserting the column in the index descriptor at the end of the day. But we're hiding it from everyone to see when using PARTITION ALL.
I agree with these statements, but I guess the next step is knowing the costs of these and what we're actually doing over the next few weeks. Also ultimately these decisions will be above my powers :P I also think that any solution using the spec implementation (or rather my understanding of it) would require a major migration to move to this solution anyway, and deletion anyway. I rather not implement something we'll scrap quickly also. As for whether this needs to be out 21.1, I'd personally be a fan of REGIONAL BY ROW not being shipped if it means we focus on what this RFC proposes first. Do it once, do it right but do it late |
id INT, | ||
x INT, | ||
PRIMARY KEY (region, id), | ||
INDEX(region, x) PARTITION BY LIST(region) (partition p1 values in (‘us-west’)... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in theory, this isn't limited to just the region ENUM but could be any type like PARTITON BY(int) (or even a set of values like PARTITION BY (int, string)) right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd echo others' statements above that this approach presents a cleaner experience for users and also allows for greater extensibility down the road.
To @nvanbenschoten's point about it being light on implementation details, I'd echo that concern. It would be nice to merge in some of @otan's more recent thoughts on implementation details for this RFC to be more comprehensive.
As for staging, my thinking is that we should see if this fits into 21.1, and at what cost to remaining SQL Experience (and perhaps Optimizer) items. One thing that might help would be to not expose table partitioning directly in 21.1, but only use it for REGIONAL BY ROW
tables. This may result is a bit of strangeness (particularly the fact that SHOW CREATE TABLE will expose the new syntax which isn't otherwise available to tables ("NOT VISIBLE", but at least this doesn't expose the TABLE PARTITIONING internals), but it will help to limit the testing effort.
Other items that we may be able to defer from 21.1 include:
- Allowing the region column to be computed. I know that this severely limits the use of REGIONAL BY ROW tables (e.g. in TPC-C), but at least it allows us to get REGIONAL BY ROW tables out the door in a way that is usable for some use cases.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, @nvanbenschoten, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 15 at r1 (raw file):
row-level regional tables in terms of partitioned tables. The RFC also proposes a backwards-incompatible change: reusing the existing
You mention that this is a backwards-incompatible change, but I think it may be possible to make this backwards compatible by exposing the internal syntax PARTITION ALL BY
instead of changing the meaning of the existing syntax PARTITION BY
. While it may be fairly contrived, I could imagine cases where users want only some of their indexes partitioned. I'm not convinced that exposing this new syntax is justified, but I always worry when changing semantics of DDL. Do we have any telemetry data which would indicate if this is a pattern use by some of our customers? If so, analyzing that data could help alleviate any concerns.
The other thing I worry about with this backward incompatible change is that it implies that we'd need to migrate all existing PARTITION BY tables to obey the new semantics. While not only breaking the semantics, this implies a migration hit to populate the tables/indexes with the new meta-data, and also potentially partition some indexes.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 67 at r1 (raw file):
- **Index-level implicit columns.** These are partitioning columns that are part of the (prefix of) an index only because they are partitioning columns for the
Nit: "an" should be in the ()
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 68 at r1 (raw file):
- **Index-level implicit columns.** These are partitioning columns that are part of the (prefix of) an index only because they are partitioning columns for the respective index, not because the user has asked for them to be there. This has
respective table?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 70 at r1 (raw file):
respective index, not because the user has asked for them to be there. This has implications for how the index gets presented through introspection, what the semantics of unique indexes are, and what happens when the index partitioning is
index partitioning -> table partitioning? The index partitioning can't be removed on its own, correct?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 100 at r1 (raw file):
2. Don't require unique constraints (and, by extension, the PK) to include all partitioning columns. 3. Introduce table-level partitioning: a set of column that partition all of the
Nit: column -> columns
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 309 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
One question that implicit partitioning columns brings up is how to deal with columns that already exist in the index, but not in the prefix. For instance,
INDEX idx (x, region) PARTITION BY LIST (region)
. Currently, we don't allow an index to contain the same column twice. We have a few options here:
- we can reject this and say that the partitioning columns don't need to exist in the index, but that if they do, they must be in the prefix
- we can move the
region
column up to the prefix position. So the physical index would look like(region, x)
- we could lift the restriction on an index containing the same column multiple times. PG doesn't have this restriction. So the physical index would look like
(region [hidden], x, region)
My preference would be option 3, but that would break code that assumes that indexes contain a column only once. I don't know whether such code exists, but I suspect that we have at least one case where we generate an inverted index over each of an index's ordinal positions.
EDIT: you discuss this below. It would be worth mentioning that we could lift this restriction now or in the future.
What would the plan be until we lift the restriction (which likely isn't to be in 21.1). Do we need to do (2) as an intermediate solution? Do we block the creation of the above?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 422 at r1 (raw file):
A change like this would also be motivated by the fact that, once we have locality-optimized searches, the uses of non-partitioned indexes on otherwise partitioned tables is pretty diminished.
I worry about the migration hit to doing this. See my note near the top.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 489 at r1 (raw file):
> CREATE TABLE t ( id INT NOT NULL, crdb_region ENUM DEFAULT gateway_region() NOT VISIBLE,
To make this round-trip we'd have some validation that there's a single column with ENUM DEFAULT gateway_region()
? If not we'd add one?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
This proposal provides a lot of expressivity: - The ability to choose the name of the partitioning column (and it have it be named region instead of crdb_region by default).
After giving this more thought, and discussing it with @nvanbenschoten and @awoods187, I'm leaning towards not going back to region
and sticking with crdb_region
. I've come to this for the following reasons (not all of which were my invention):
- It makes the code simpler because we can always assume that crdb_region is present on REGIONAL BY ROW tables
- It signals to the customer that this is not a traditional column, and should be investigated before manupulated
- It is easier to search for in the documentation when it comes to performing that investigation
- It isn't much of an uplift for the application developer to use crdb_region instead of region
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @bdarnell, @knz, @nvanbenschoten, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 32 at r1 (raw file):
that support partitioning, although the passage of time made it clear that table-level partitioning is the more useful concept. In particular, because row-level regional tables are table-leveled partitioned, we take the opportunity
table-level partitioned*
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 35 at r1 (raw file):
Previously, knz (kena) wrote…
I think you can also mention here that table-level partitioning is absolutely needed once we start targeting data sovereignty -- we need a framework where there's no way to mistakenly forget to partition an index when partitioning is required on the entire table.
+1, good insight
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 66 at r1 (raw file):
created automatically for unique constraints on such tables. - **Index-level implicit columns.** These are partitioning columns that are part of
Can we call these something less ambiguous? We already have a concept of an "implicit" and "extra" and "hidden" column. Let's call these "Index-level implicit partitioning columns", or something. Sorry for the bikeshed but I think this will save us confusion later.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 97 at r1 (raw file):
A summary of user-level visible changes: 1. Don't require partitioning columns to be explicitly included in an index.
Would this change the original partitioning implementation too?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 340 at r1 (raw file):
allowing both single and double-column foreign keys, but it seems kinda wasteful. 3. The primary key would similarly not need to specify the implicit columns. For
s/implicit/partitioning/
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 345 at r1 (raw file):
We would not support placing partitioning columns in any other place than the indexes prefix, and in any other order than the partitioning order. So, the
s/indexes/index's/
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 416 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
We could maintain the ability to partition only the PK by allowing the `PARTITION BY` clause on the `PRIMARY KEY` specification.
Do we already allow this?
I don't like this idea to change the default, especially not if it's in a single release. An alternative I'd suggest is to introduce the new behavior as ALL
as you suggest, make sure there's another way to specify the existing PRIMARY KEY
partitioning besides the top level PARTITION BY
, add a setting that changes the behavior defaulting to the old behavior, and add a warning/notice when users run top-level PARTITION BY
with the setting off that it's being deprecated and they should start using the PRIMARY KEY
-only partitioning method.
Then, we can over time make a transition once we see whether this effort is successful or not.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 631 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
I'm leaning towards yes, but we'd need to think through the migration. How does this change impact existing DDL? How does this change impact existing databases with partitioned tables?
I don't know, I think this will introduce more problems than it's worth. Thinking through the migration is a big task and it will put a lot of required brakes on this process. There are a lot of customers that use partitioning, so we'd have to consider each of their cases, issue guidance for how to upgrade their tables, etc.
I would prefer new syntax like your PARTITION ALL BY
, I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei, @awoods187, @bdarnell, @knz, @nvanbenschoten, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
It makes the code simpler because we can always assume that crdb_region is present on REGIONAL BY ROW tables
But we can't - at least not according to this proposal. The name of the partitioning column is customizable in the set locality regional by row as <column name>
. Does this change your mind?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 509 at r3 (raw file):
UNIQUE (x) [implcitly PARTITIONED] requires the new optimizer feature of separate uniqueness checks for x, correct?
Yeah that's right. But when you say UNIQUE WITHOUT INDEX
and you typeset it like that, it looks like you're asking about the syntax :P. I'll clarify the text.
Btw, there's a separate thread over here about whether the UNIQUE WITHOUT INDEX
syntax should be exposed to users, separately from these partitioned tables. I personally like it, but others apparently not so much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well I think the partitions "show up" enough I think, in the sense that every index gets a partitions with the same names.
So, if you do show create
and see something like
PARTITION ALL BY LIST (x) (
PARTITION one VALUES IN ((1))
)
then you know that you can do ALTER PARTITION one OF INDEX <idx name>
for any .
To me, that seems clear enough.
What is now bothering me, however, is that the syntax for changing the partition of a primary key is ALTER PARTITION <name> OF TABLE
. This "OF TABLE" would indeed me quite misleading... Ugh, the sins of the past.
What I would suggest, if you like it, is to introduce ALTER PARTITION <name> OF PRIMARY KEY
, and reject ALTER PARTITION OF TABLE
for partitioned tables.
I would also be fine with what you're suggesting - rejecting all ALTER PARTITION OF INDEX
for partitioned tables. Except that I think I would implement it by separately going and editing the respective zone config for every index, not by introducing a new type of inheritance. But that's only because I think the latter would be more work... What do you think?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
do we see a use case of wanting a different zone config for indexes vs tables? if so I can buy this approach. it's by far the simplest :P
my preference if we don't see a use case for the above would be rejecting INDEX as you say, but having the inheritance work (instead of having to cascade zone config changes down from the TABLE). alternatively, have a CASCADE ZONE CONFIG style SQL command control this to do so. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
my preference if we don't see a use case for the above would be rejecting INDEX as you say, but having the inheritance work (instead of having to cascade zone config changes down from the TABLE).
+1 for rejecting, until we find a use case that supports this (and for 21.1, my preference would be to reject, as we don't need to do this for REGIONAL BY ROW tables). I'm also good with Andrei's proposal to support this by manually writing zone configs for every index as opposed to relying on KV logic to implement this.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 309 at r1 (raw file):
I think that's fine, although it's one case where backwards compatibility would be affected if we go with repurposing the existing syntax to mean table-level partitioning: indexes that used to be valid aren't any longer.
Yeah, I worry about this too. OK, last comment about migration outside of the thread below.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 416 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
make sure there's another way to specify the existing PRIMARY KEY partitioning besides the top level PARTITION BY
So this part we're in agreement with - that's what the sentece we're commenting on is saying.
The rest, responding in the bottom thread https://reviewable.io/reviews/cockroachdb/cockroach/58440#-MQFQkb6AYn9J3w4Cbby
+1 for liking @jordanlewis's proposal.
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 489 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
The rules I'm imagining are:
- If
AS <col>
is specified, then the table definition must include<col>
, and the column must have the right type, otherwise error.- If
AS <col>
is not specified, thencrdb_region
must not be defined. If it does, error.My original proposal suggested that, if
AS <col>
is specified and<col>
is not defined, then we could create the column with that name (and default expression, etc). But Nathan didn't like it much and I don't care; seems like extra work anyway.As we were discussing, none of these rules says anything about not having other columns with the respective enum type.
makes sense
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
@ajstorm generally we want + need AS or another keyword in syntax every time we expect an identifier. This is because otherwise we wouldn't be able to add a new variant or additional option in the syntax later, because it would clash with possible identifiers.
Sorry, @knz my comment wasn't about the syntax, but rather, the ability for the user to specify their own name for the column.
Now that we're back to crdb_region
as the default name for the column, I'm of the opinion that we should not provide the as <column_name>
option in 21.1 and see if anyone complains. My feeling is that (a) the feature will be usable without this customization, and (b) it limits the amount of testing we have to do for 21.1, which is already showing some signs of stress from a delivery perspective.
If customers complain, I'm all in favour of adding this in 21.2 (or beyond).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we'd reject ALTER PARTITION OF INDEX
then we'd also have to reject ALTER INDEX CONFIGURE
, right (including ALTER INDEX t@primary CONFIGURE
)? So neither partitions of indexes, nor the indexes themselves can have zone configs.
I'm a bit concerned because personally I'm not sure of all the places where the mapping between keys and the narrowest zone configs are encoded, and separately the rules for the inheritance. Everything about how we've implemented inheritance is such crap; for example, zone configs for partitions are stored as Subzones
in the parent zone config.
Maybe by know you know these things better than me. I think that if you'd change DecodeKeyIntoZoneIDAndSuffix
to translate index ids to the id of the primary key of the respective table, then GetSubzoneForKeySuffix()
would just work in returning the right subzone for an index partition. Please make sure to add tests to the kvserver/reports
package.
I'm also thinking that the cleanest thing to do might be to store partition zoneconfigs directly in the table zoneconfigs, instead of storing them in the PK's zone config. So we'd basically prohibit both indexes and the PK from having zone configs. This would require generalizing how the subzone_spans
field is interpreted.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file): Previously, ajstorm wrote…
What about providing the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
Previously, rytaft (Rebecca Taft) wrote…
What about providing the
AS <computed column expression>
option? Or is that orthogonal to this discussion? I feel like there are many workloads that would be excluded without support forAS <computed column expression>
, and that would be unfortunate.
I would say that's orthogonal, and that we should implement that before we implement as <column_name>
. I agree with you that many workloads would benefit from support for AS <computed column expression>
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, one thing that just came to my attention is the fact that we already support a way for a user to edit partitions of all the indexes at once: alter partition of index t@* ...
- note the star.
For me this makes the "do nothing and continue to allow individual configuration of each partition in each index" a little bit more palatable - albeit not much.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @bdarnell, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
Previously, ajstorm wrote…
I would say that's orthogonal, and that we should implement that before we implement
as <column_name>
. I agree with you that many workloads would benefit from support forAS <computed column expression>
.
I've discussed with @ajstorm exactly what to do for 21.1, given that we want to allow users to define the crdb_region
column - with its default value, VISIBLE/NOT VISIBLE, and even as a computed column.
There's two possibilities, so we said we'd kick it to @otan for an opinion:
- We could accept literally
AS crdb_region
(so, not any other column name). The rule would be that, if you say this, then you have to have provided a definition for the column. If you don't useAS crdb_region
, then the column is created for you. - Don't accept
AS crdb_region
and, instead, look for a column namedcrdb_region
and, if found don't create another one.
Btw, @rytaft , the fact that you can define the column as computed probably alleviates some of what you were thinking about the need for the general AS <expr>
, right?
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 589 at r1 (raw file):
Yea, I think either of those two options would work. I just want to make sure there is some way for users to specify a computed region. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for rejecting, until we find a use case that supports this (and for 21.1, my preference would be to reject, as we don't need to do this for REGIONAL BY ROW tables).
I agree.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 631 at r1 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
OK, so let's not change what
CREATE TABLE PARTITION BY... / ALTER TABLE PARTITION BY...
do in the next release. You have to useALL
.
But a release after that (or even further in the future), would you Ben be OK with rejecting table-levelPARTITION BY
without theALL
? I'm surprised, cause that seems like the biggest compatibility breakage to me. It's a loud one, not a more silent/subtle one, but I'm still surprised.
Yes, when we must break backwards compatibility, it's much better to do so loudly than silently (but it's best, of course, to not break back-compat at all).
docs/RFCS/20210104_table_level_partitioning_and_regional_by_row_spec.md, line 567 at r3 (raw file):
Previously, andreimatei (Andrei Matei) wrote…
I realize we already lose information about the original CREATE TABLE statement in various ways
That's the reason - that I don't think it's worth preserving enough information about the original form and teaching
show create
to use it.
A secondary reason is that I, for one, like the column damn it, and I want more people to be aware of it. It's useful to teach people about it in whatever way possible so that they realize they have the option to use it.I think it's true that the alternative you're suggesting would take away the need for NOT VISIBLE. But we wanted
NOT VISIBLE
anyway - #53428
What are you thinking can-of-worms-wise?
The can of worms is just that people will start to use NOT VISIBLE for their own purposes and ask for things like ALTER TABLE statements to turn visibility on and off to control the output of SELECT *
. If we want those things for their own sake as in #53428 then it's fine, it just seems like a lot to take on as an incidental part of partitioning changes.
58923: sql: support NOT VISIBLE in CREATE TABLE for user-defined hidden columns r=lucy-zhang,knz a=Heoric Informs #53428 Informs #58440 Release note (sql change): It is now possible to use the `NOT VISIBLE` qualifier for new column definitions in CREATE TABLE. This causes the column to become 'hidden'. Hiddens columns are not considered when using `*` in SELECT clauses. (Note that CockroachDB already supported hidden columns previously, such as `rowid` which is added automatically when a table definition has no PRIMARY KEY constraint. This change merely adds the opportunity for end-users to define their own hidden columns.) This feature is intended for use in combination with other features related to geo-partitioning introduced in v21.1, to offer more control about how geo-partitioning keys get exposed to client ORMs and other automated tools that inspect the SQL schema. Co-authored-by: leoric <[email protected]>
Maybe this RFC can be merged, even if the corresponding work has not been completed? We have the status "accepted" for that. |
This RFC wants more partitioning features and specifies row-level regional tables on top of partitioned tables. In particular, the RFC introduces the concept of partitioned tables, in contrast to the existing partitioned indexes. Release note: None
23ef316
to
4b4b89d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, merging as is
bors r+
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajstorm, @andreimatei, @awoods187, @jordanlewis, @knz, @nvanbenschoten, @otan, and @rytaft)
Build failed (retrying...): |
Build succeeded: |
This RFC wants more partitioning features and specifies row-level
regional tables on top of partitioned tables.
In particular, the RFC introduces the concept of partitioned tables, in
contrast to the existing partitioned indexes.
Release note: None