Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

catalog: add implicit record types for all tables #70100

Merged
merged 3 commits into from
Oct 15, 2021

Conversation

jordanlewis
Copy link
Member

@jordanlewis jordanlewis commented Sep 13, 2021

Fixes #70036.

This commit adds a resolvable type with the same name as all tables. The
type is treated as a tuple of all of the non-hidden, non-inaccessible
column types of the input table, in creation order.

The table record type is implemented with a virtual type descriptor that
doesn't have an embedded protobuf type descriptor at all.

The new record types are not usable in table definitions for now, but
there's nothing theoretical stopping this from happening. They get
serialized with the @<int> format, so it's "just" a matter of teaching
the validators to deal with UDTs sometimes being physical (a real
TypeDescriptor in system.descriptor for the OID - 100000) and sometimes
virtual (a TableDescriptor for that OID).

Release note (sql change): introduce an implicitly-defined type for
every table, which resolves to a tuple type that contains all of the
columns in the table.

@jordanlewis jordanlewis requested review from a team September 13, 2021 03:34
@jordanlewis jordanlewis requested a review from a team as a code owner September 13, 2021 03:34
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a neat prototype. I want to make sure we aren't painting ourselves into a corner here. I'd like us to actually make a plan for how we deal with the IDs. For example, for the enums we allocate a distinct oid for the array alias.

I also have some concerns with hydration and leasing interactions. I have existing concerns about the array alias hydration being wrong. I'd like to see testing to ensure that when we have a tuple type with an enum type and the version on the enum type changes that we properly detect that the query is stale and same goes for the array alias.

pkg/sql/catalog/typedesc/type_desc.go Outdated Show resolved Hide resolved
@jordanlewis
Copy link
Member Author

I think the difference between this and for enums is that we do actually allocate a real ID for this too. It's the table ID itself.

It's true that the "array aliases" are separately allocated... but the ID of the enum itself is in the descriptor ID space, whereas its OID is in the custom OID space. Here, the ID of the type itself is in the descriptor ID space (it's just the table ID), and it's OID is in the custom OID space as well. I think it's analogous.

I'm realizing that this does not allocate a separate ID for the "array alias" of the table's UDT though. There is no way to solve that without making every new table request 2 new descriptors, which feels wrong. I'd prefer to just fail compatibility with the UDT's array alias. We can see what SQL-E says. Note that we already fail to create that UDT array alias even though we fake compatibility in the pg_type table for the table UDTs themselves.

@ajwerner
Copy link
Contributor

making every new table request 2 new descriptors, which feels wrong.

Hard disagree. It should allocate 3 IDs, one for its table, one for its type and one for its type alias. We need to actually design record types and think through these things rather than merging a PR that did something easy at the time and pay for it for the next N years. Let's think through what the design should be and weigh tradeoffs like we usually do for projects.

@jordanlewis jordanlewis changed the title catalog: add implicit record types for all tables [dnm] catalog: add implicit record types for all tables Sep 13, 2021
@jordanlewis
Copy link
Member Author

As discussed offline, not intending to urgently merge this. We can consider it a prototype for discussion.

I'm not sure if it's necessary to think through persisted record types from end to end to get this in, but maybe we do?

@ajwerner
Copy link
Contributor

I'm not sure if it's necessary to think through persisted record types from end to end to get this in, but maybe we do?

I'm inclined to disagree until I can be convinced that what's happening here is both a good idea and something that fits with where we'd want to go.

I'm also fearful of use of these tuple types in views and all of the column expressions. We should test that heavily.

@jordanlewis
Copy link
Member Author

I'm inclined to disagree until I can be convinced that what's happening here is both a good idea and something that fits with where we'd want to go.

Fair enough. I did think through this approach and didn't see anything that would block any future plans. I can write some paragraphs on alternatives if you think it would help.

I'm also fearful of use of these tuple types in views and all of the column expressions. We should test that heavily.

Agreed. There are some light tests for that in the PR and I'll add some more.

I'd like to see testing to ensure that when we have a tuple type with an enum type and the version on the enum type changes that we properly detect that the query is stale and same goes for the array alias.

Is there a test that's doing something similar I could crib? Any pointers? Agreed this seems like a ripe area to probe and I'm happy to do so.

@ajwerner
Copy link
Contributor

Postgres allocates distinct oids for the record types. This is important at least in part because you can grant privileges to the things separately. Also, the usage and reference rules might be nice to separate.

I see the argument for extreme pragmatism to get compatibility with the tools. In particular, if these things never get serialized anywhere, then maybe it's okay that the implementation is simplistic. Later in the cycle I might be persuaded to take on that debt in the name of progress. Even then, I think we'd want to really carve out code concepts regarding synthetic record types; this here is way implicit in a way that makes me uncomfortable. I'd imagine we'd want a different implementation of the type descriptor interface as opposed to synthesizing an actually bogus type descriptor with a bogus ID. We're also going to have to feel our way around the whole collision at 100000 thing that makes me rather uneasy, especially in a world where creating descriptors becomes as cheap as I think it will be in 22.1.

As it stands, my preference would be to just decide we want to do record types and do them. Even that project has stages. It does lead me to wonder (yet again) whether the namespace table is structured as we would want it to be. I wonder often about creating a new new namespace table that has a relation type in it so that we could find things like types, functions, and indexes more easily. The last thing I'm worried about it allocating IDs for those things.

@postamar
Copy link
Contributor

Completely agree with @ajwerner. Let's do these record types properly if we actually need them:

  1. We're going to want to have them soon enough anyway, might as well do it right the first time around if this isn't urgent.
  2. We're paying down similar tech debt right now with giving the public schema an explicit descriptor and the cost for that is quite high, as it turns out.
  3. Types have enough existing tech debt as it is, see tabledesc: fix the referencing_descriptor_ids field in type descriptors #63161 among others. Implementing record types will force some much-needed cleanups so that we don't end up painting ourselves into a corner..

@jordanlewis
Copy link
Member Author

jordanlewis commented Sep 15, 2021

I pushed an update to this to resolve the outstanding comments from Andrew, add more tests, and remove the issue with collision at 100000 by having the descriptor have the same ID as the table ID.

I'd be fine not merging this, FWIW - I'm learning by doing, and I hope it's not too annoying to talk about different approaches in this way.

I think from a prioritization perspective, if we can make time to do record types end to end somehow in the next release, I'd buy that it'd be better to do that, just as you say. If we can't, maybe there would be a way that this approach could be cleaned up sufficiently to be palatable and not too debt-incurring, by taking some of the suggestions above.

The key thing that makes it even potentially palatable is that these descriptors will never be persisted. If they were persisted, I agree that this would be a very bad idea without a full, end to end design of all record types.

I worry a little bit that it might be very hard to get record types end to end done, because of the idea of doing another new namespace table and another migration. Judging from past experience, that migration could be very heavyweight, difficult and painful... that being said, the lack of it, on its own, is Postgres incompatible of course because we just have the 1 giant namespace and Postgres has 1 per relation kind.

Anyway, just food for thought! Thanks for letting me experiment...

@jordanlewis
Copy link
Member Author

(btw, since I didn't mention this in the PR description, this is all in pursuit of helping resolve #69010)

@jordanlewis
Copy link
Member Author

I've redone the PR to have a "virtual descriptor" with no backing protobuf type desc. For now, it's in a separate commit for ease of reviewing; I'll squash once we get closer to deciding what method we want to use here.

This commit updates the table record type prototype to use a virtual
type descriptor that doesn't have an embedded protobuf type descriptor
at all.

There are a few TODOs that I'd like review input on.

The main decision is whether or not we should be embedding the
catalog.TableDescriptor that the virtual type descriptor is aliasing.
The commit currently does that, but I could see arguments for why we'd
want to denormalize the fields into the virtual type descriptor struct.

Not sure what we want to do philosophically, so leaving it as an open
question.

I'd like to see testing to ensure that when we have a tuple type with an enum type and the version on the enum type changes that we properly detect that the query is stale and same goes for the array alias.

Testing this, it seems like it doesn't even work for ordinary prepared statements against tables with enum types, so I've filed a separate issue for that: #70378

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks much for working through this! I'll give it a deeper review another time with fresh eyes. On the whole, this feels close.

Reviewed 1 of 6 files at r3.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/catalog/descs/object.go, line 109 at r2 (raw file):

			// created record type for the table that we found.
			if flags.RequireMutable {
				desc = typedesc.CreateMutableFromTableDesc(tableDesc)

Should this return an error if the client requested that the type be mutable here? We do it one level up on the ByID path but I think you have the right info here for this path.


pkg/sql/catalog/typedesc/type_desc.go, line 1027 at r3 (raw file):

// casted to a table type. This is important for Postgres compatibility, which
// demands that every table have such a corresponding type.
func descBuilderFromTableDesc(descriptor catalog.TableDescriptor) TypeDescriptorBuilder {

should we assert that this table descriptor is indeed already hydrated somehow? We really want it to be by the time we're here, no?

Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/catalog/typedesc/type_desc.go, line 1027 at r3 (raw file):

Previously, ajwerner wrote…

should we assert that this table descriptor is indeed already hydrated somehow? We really want it to be by the time we're here, no?

This method isn't used any longer and I deleted it, but your comment still applies to the new method below. That being said I'm not really sure


pkg/sql/catalog/typedesc/type_desc.go, line 1077 at r3 (raw file):

type VirtualTableRecordType struct {
	// desc is the TableDescriptor that this virtual record type is created from.
	// TODO(jordan): does it make more sense to eagerly pop out all the fields

Thoughts on this from reviewers?


pkg/sql/catalog/typedesc/type_desc.go, line 1085 at r3 (raw file):

}

// TODO(jordan): if we keep the embedded catalog.TableDescriptor, do we want

Thoughts on this from reviewers? There's a ton of "pass through" delegation in this file, so it's certainly fewer LOC to embed, but slightly nervous about bugs from not considering cases if we do that?

Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/catalog/typedesc/type_desc.go, line 1027 at r3 (raw file):

Previously, jordanlewis (Jordan Lewis) wrote…

This method isn't used any longer and I deleted it, but your comment still applies to the new method below. That being said I'm not really sure

Whoops, unfinished comment - I added the validation.

@jordanlewis jordanlewis force-pushed the implicit-record-types branch 2 times, most recently from 4227647 to a9fac81 Compare September 27, 2021 03:38
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this needs some logic test changes to deal with error message changes. This feels pretty close.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/catalog/typedesc/type_desc.go, line 1027 at r6 (raw file):

	// rather than have all of the methods delegate?
	desc catalog.TableDescriptor

I have a slight preference to pop the fields out in the constructor. What do you think?


pkg/sql/catalog/typedesc/type_desc.go, line 1029 at r6 (raw file):

Quoted 12 lines of code…
// VirtualTableRecordType is an implementation of catalog.TypeDescriptor that
// represents a record type for a particular table: meaning, the composite type
// that contains, in order, all of the visible columns for the table.
type VirtualTableRecordType struct {
	// desc is the TableDescriptor that this virtual record type is created from.
	// TODO(jordan): does it make more sense to eagerly pop out all the fields
	// that we care about from this backing TableDescriptor into the struct,
	// rather than have all of the methods delegate?
	desc catalog.TableDescriptor

	typ *types.T
}

nit: pull this struct, its constructor and all of its logic into a separate file


pkg/sql/catalog/typedesc/type_desc.go, line 1034 at r6 (raw file):

Quoted 4 lines of code…
// TODO(jordan): if we keep the embedded catalog.TableDescriptor, do we want
// to just use Go embedding to avoid writing all of these delegate calls? For now,
// I like keeping them explicit, since it forces us to think about the implementations
// separately from the table descriptor itself.

I like this.


pkg/sql/catalog/typedesc/type_desc.go, line 1071 at r6 (raw file):

	// TODO(jordan): we can't drop these virtual types, so is it weird to say
	// that we have drop permission on this thing even if we don't?
	return v.desc.GetPrivileges()

Should we mask this to just the type privileges (which I think is just usage?) Should we instead map select to usage?


pkg/sql/catalog/typedesc/type_desc.go, line 1088 at r6 (raw file):

// Adding implements the Descriptor interface.
func (v VirtualTableRecordType) Adding() bool { panic("can't call adding on VirtualTableRecordType") }

thoughts on adding a method that these string panics delegate to which add info about the descriptor itself and wraps it in an assertion failed error before panicking?


pkg/sql/catalog/typedesc/type_desc.go, line 1246 at r6 (raw file):

// the implicit record type for a table, which has 1 field for every visible
// column in the table.
func CreateVirtualRecordTypeFromTableDesc(

nit: pull the constructor up above the methods.

@jordanlewis jordanlewis force-pushed the implicit-record-types branch 2 times, most recently from e69b3ca to 752faa0 Compare September 28, 2021 14:32
Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner and @jordanlewis)


pkg/sql/catalog/typedesc/type_desc.go, line 1034 at r6 (raw file):

Previously, ajwerner wrote…
// TODO(jordan): if we keep the embedded catalog.TableDescriptor, do we want
// to just use Go embedding to avoid writing all of these delegate calls? For now,
// I like keeping them explicit, since it forces us to think about the implementations
// separately from the table descriptor itself.

I like this.

Not sure which alternative you are referring to with "this" - I'm guessing you're saying you like the way it is?


pkg/sql/catalog/typedesc/type_desc.go, line 1071 at r6 (raw file):

Previously, ajwerner wrote…

Should we mask this to just the type privileges (which I think is just usage?) Should we instead map select to usage?

Done.


pkg/sql/catalog/typedesc/type_desc.go, line 1088 at r6 (raw file):

Previously, ajwerner wrote…

thoughts on adding a method that these string panics delegate to which add info about the descriptor itself and wraps it in an assertion failed error before panicking?

Done.

Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner)


pkg/sql/catalog/typedesc/type_desc.go, line 1027 at r6 (raw file):

Previously, ajwerner wrote…

I have a slight preference to pop the fields out in the constructor. What do you think?

I guess it feels duplicative to pop the fields out in a way that slightly scares me - is there a chance of code migrations not hitting the set of fields we pop out, or something? I dunno.

@jordanlewis jordanlewis changed the title [dnm] catalog: add implicit record types for all tables catalog: add implicit record types for all tables Sep 28, 2021
@jordanlewis
Copy link
Member Author

Sending a heads up to @rafiss to please take a look at this, since I know you've been looking at tuple type related things.

@rafiss
Copy link
Collaborator

rafiss commented Oct 1, 2021

just saw that this landed in PostgreSQL 14. probably not relevant for what we want to do right now, but sharing for awareness

Create composite array types for system catalogs (Wenjing Zeng)

User-defined relations have long had composite types associated with them, and also array types over those composite types. System catalogs now do as well. This change also fixes an inconsistency that creating a user-defined table in single-user mode would fail to create a composite array type.

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @rafiss)


pkg/sql/catalog/descpb/structured.proto, line 1371 at r13 (raw file):

    // kind of TypeDescriptor is *never* persisted to disk! If you are here,
    // thinking about using or persisting this value, you should *not* do that!
    VIRTUAL_TABLE_RECORD_TYPE = 3;

this reads like it is the "record type for virtual tables"

would it be more accurate to name this TABLE_VIRTUAL_RECORD_TYPE or TABLE_RECORD_VIRTUAL_TYPE?


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 36 at r13 (raw file):

	typ   *types.T
	privs *descpb.PrivilegeDescriptor

maybe add a comment saying what privileges are added?


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 38 at r13 (raw file):

	privs *descpb.PrivilegeDescriptor
}

what would we need to handle the array type for this virtual record type?


pkg/sql/logictest/testdata/logic_test/pg_catalog, line 3408 at r13 (raw file):

SELECT oid
FROM pg_catalog.pg_type
WHERE typname = 'source_table'

why use let here instead of combining it into one

select ... FROM pg_type where typname = 'source_table'


pkg/sql/sem/tree/casts.go, line 375 at r13 (raw file):

	// Casts to TupleFamily.
	{from: types.UnknownFamily, to: types.TupleFamily, volatility: VolatilityImmutable},
	{from: types.TupleFamily, to: types.TupleFamily, volatility: VolatilityImmutable},

hm, is it really immutable? does it show up in the postgresql pg_cast table?

what about casting a tuple containing a timestamptz to a tuple containing a string? that can't be immutable right?

Copy link
Collaborator

@rafiss rafiss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, and @rafiss)


pkg/sql/sem/tree/casts.go, line 1449 at r13 (raw file):

			return ParseDOid(ctx, string(*v), t)
		}
	case types.TupleFamily:

i feel like we'd want handling for casting to arrays of tuples too. or at least tests for it, to make sure the error message is OK

@rafiss rafiss requested a review from otan October 1, 2021 20:40
@rafiss
Copy link
Collaborator

rafiss commented Oct 1, 2021

cc @otan as well (in case you're interested) since the story of getting the right IDs for this stuff was previously explored in #66815

@otan otan removed their request for review October 4, 2021 20:02
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a few little things to complain about but I'm generally happy with how this is turning out.

for i := range tablePrivs.Users {
newPrivs[i].UserProto = tablePrivs.Users[i].UserProto
// An table's record type has USAGE privs if a user has SELECT on the table.
if privilege.SELECT.Mask()&tablePrivs.Users[i].Privileges != 0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if privilege.SELECT.IsSetIn(tablePrivs.Users[i].Privileges) {

newPrivs[i].UserProto = tablePrivs.Users[i].UserProto
// An table's record type has USAGE privs if a user has SELECT on the table.
if privilege.SELECT.Mask()&tablePrivs.Users[i].Privileges != 0 {
newPrivs[i].Privileges = newPrivs[i].Privileges | privilege.USAGE.Mask()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

newPrivs[i].Privileges = privilege.USAGE.Mask()

// Represents the virtual record type created on behalf of each table. This
// kind of TypeDescriptor is *never* persisted to disk! If you are here,
// thinking about using or persisting this value, you should *not* do that!
VIRTUAL_TABLE_RECORD_TYPE = 3;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't hate it, but "virtual" may be a bit overloaded. What about "transient" instead? or "implicit"?

@@ -513,6 +513,8 @@ func (desc *immutable) ValidateSelf(vea catalog.ValidationErrorAccumulator) {
if desc.GetArrayTypeID() != descpb.InvalidID {
vea.Report(errors.AssertionFailedf("ALIAS type desc has array type ID %d", desc.GetArrayTypeID()))
}
case descpb.TypeDescriptor_VIRTUAL_TABLE_RECORD_TYPE:
vea.Report(errors.AssertionFailedf("cannot persist type descriptor kind %s", desc.Kind.String()))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find this message confusing. For a start we shouldn't assume that validation only happens during descriptor reads and writes. Really if we believe that a type descriptor of this kind should never even be validated in the first place then that's what the message should read, accompanied with an explanation as to its transient nature.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also prefer for this validation failure to be exercised in a test.

// desc is the TableDescriptor that this virtual record type is created from.
// TODO(jordan): does it make more sense to eagerly pop out all the fields
// that we care about from this backing TableDescriptor into the struct,
// rather than have all of the methods delegate?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My personal answer to that question would be: no, keep on delegating. From the point of view of the lifecycle of a record type descriptor, it's clear that its underlying the table descriptor will never change for the duration of its existence. Plus they're so tightly coupled.

// VirtualTableRecordType is an implementation of catalog.TypeDescriptor that
// represents a record type for a particular table: meaning, the composite type
// that contains, in order, all of the visible columns for the table.
type VirtualTableRecordType struct {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason why this isn't package-private?

func (v VirtualTableRecordType) GetDrainingNames() []descpb.NameInfo {
// TODO(jordan): I don't think we need to return anything here, because this
// descriptor doesn't really "own" any names on top of what the table already
// does.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is correct plus we're getting rid of draining names.


// GetIDClosure implements the TypeDescriptor interface.
func (v VirtualTableRecordType) GetIDClosure() (map[descpb.ID]struct{}, error) {
ret := make(map[descpb.ID]struct{})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should v.GetID() be included in the ID closure? I'm guessing perhaps not, but I'm asking to make sure that this has been considered.


// GetReferencedDescIDs implements the Descriptor interface.
func (v VirtualTableRecordType) GetReferencedDescIDs() (catalog.DescriptorIDSet, error) {
return v.desc.GetReferencedDescIDs()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this matters much yet, but this isn't accurate. Presently this will include sequence references, foreign key references, etc. Instead this should be GetIDClosure() supplemented with v.GetID(), v.GetParentID() and v.GetParentSchemaID().

return AsTypeDescriptor(desc)
if _, ok := desc.(TableDescriptor); ok {
return nil, unimplemented.NewWithIssuef(70099, "table record type %q not allowed in definition",
desc.GetName())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes me uneasy. We should not be assuming that the descriptor holding the reference intended that ID to be to a record type descriptor. It could be the result of just about any kind of descriptor corruption, we don't know.

Unless this serves a purpose I'm not clued in to, I'd prefer to remove this if-block entirely.

@jordanlewis jordanlewis force-pushed the implicit-record-types branch from ed9e867 to 78227ea Compare October 8, 2021 22:14
Copy link
Member Author

@jordanlewis jordanlewis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs, this is RFAL!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @postamar, and @rafiss)


pkg/sql/catalog/validate.go, line 401 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

This makes me uneasy. We should not be assuming that the descriptor holding the reference intended that ID to be to a record type descriptor. It could be the result of just about any kind of descriptor corruption, we don't know.

Unless this serves a purpose I'm not clued in to, I'd prefer to remove this if-block entirely.

This is a good point. It's defensive programming. If we had an issue, and we did have some kind of weird corruption, we'd get this unimplemented error instead of a crash. But, we'd probably prefer a crash here, I agree. Done.


pkg/sql/catalog/descpb/structured.proto, line 1371 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

this reads like it is the "record type for virtual tables"

would it be more accurate to name this TABLE_VIRTUAL_RECORD_TYPE or TABLE_RECORD_VIRTUAL_TYPE?

I've renamed to TABLE_IMPLICIT_RECORD_TYPE.


pkg/sql/catalog/descpb/structured.proto, line 1371 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

I don't hate it, but "virtual" may be a bit overloaded. What about "transient" instead? or "implicit"?

I've renamed to TABLE_IMPLICIT_RECORD_TYPE.


pkg/sql/catalog/typedesc/type_desc.go, line 517 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

I would also prefer for this validation failure to be exercised in a test.

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 36 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

maybe add a comment saying what privileges are added?

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 38 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

what would we need to handle the array type for this virtual record type?

The reason we're not doing this here is that we'd have to allocate a new descriptor ID to keep "safe" from any new schema objects being created.

Let's say we were wanting to keep the array type oids next to the type oids. Then, for a table id 52, for which we have a corresponding type id 100052, we'd need a corresponding array type id 100053, or something, but we have no way to reserve that ID right now. It's a relatively straightforward extension, but I figured we could get to it later since this PR doesn't close any doors that prevent us from getting there.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 28 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

Is there a reason why this isn't package-private?

Yes, we have a couple of type-checks for it in the resolution path.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 32 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

My personal answer to that question would be: no, keep on delegating. From the point of view of the lifecycle of a record type descriptor, it's clear that its underlying the table descriptor will never change for the duration of its existence. Plus they're so tightly coupled.

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 81 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

if privilege.SELECT.IsSetIn(tablePrivs.Users[i].Privileges) {

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 82 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

newPrivs[i].Privileges = privilege.USAGE.Mask()

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 124 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

This is correct plus we're getting rid of draining names.

Done.


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 178 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

I don't think this matters much yet, but this isn't accurate. Presently this will include sequence references, foreign key references, etc. Instead this should be GetIDClosure() supplemented with v.GetID(), v.GetParentID() and v.GetParentSchemaID().

Done. I agree it doesn't matter but I'll keep it correct.

On second thought I've made this return an error; it shouldn't be called from anywhere.jkj


pkg/sql/catalog/typedesc/virtual_table_record_type.go, line 238 at r15 (raw file):

Previously, postamar (Marius Posta) wrote…

Should v.GetID() be included in the ID closure? I'm guessing perhaps not, but I'm asking to make sure that this has been considered.

Yes, it should, you're right, done. I think ultimately this doesn't matter much because we cannot store implicit record table types in any actual serialized tables or types, but again, good to get it correct.

On second thought I've made this return an error; it shouldn't be called from anywhere.


pkg/sql/logictest/testdata/logic_test/pg_catalog, line 3408 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

why use let here instead of combining it into one

select ... FROM pg_type where typname = 'source_table'

The reason for this was that I wanted to exercise the pg_type virtual index as well. Added a comment to explain why it's done this way, as well as another test that's similar just for the non-virtual index case.


pkg/sql/sem/tree/casts.go, line 375 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

hm, is it really immutable? does it show up in the postgresql pg_cast table?

what about casting a tuple containing a timestamptz to a tuple containing a string? that can't be immutable right?

There's actually no cases for 2249 (record) in pg_cast... I'm not sure what that would mean? Clearly we can cast tuple to tuple...

Nevertheless you're right, I've made it VolatilityStable which I think will fix these issues.


pkg/sql/sem/tree/casts.go, line 1449 at r13 (raw file):

Previously, rafiss (Rafi Shamim) wrote…

i feel like we'd want handling for casting to arrays of tuples too. or at least tests for it, to make sure the error message is OK

Great point! This was actually broken 😳

I've fixed the issue and added some new tests.

@jordanlewis jordanlewis force-pushed the implicit-record-types branch from 78227ea to 1815b67 Compare October 8, 2021 23:43
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I have no further complaints.

pkg/sql/pg_catalog.go Outdated Show resolved Hide resolved
@jordanlewis jordanlewis force-pushed the implicit-record-types branch from 1815b67 to 59a1909 Compare October 12, 2021 21:21
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Just some minor questions and nits from me.

// have the capability of returning a mutable type descriptor for a
// table's implicit record type.
return prefix, nil, pgerror.Newf(pgcode.InsufficientPrivilege,
"cannot modify table record type %q", objectName)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you could construct a fully qualified name here rather cheaply and then let the tree type do the quoting as necessary.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason not to do that is that the by-id path will return a different error. I'm sort of okay with that. Take it or leave it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a bunch of %q around here, I think we could change them all in a batch rather than do just this one.

// desc is the TableDescriptor that this implicit record type is created from.
desc catalog.TableDescriptor

typ *types.T
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comment about the invariants on this type, namely that it'll be a tuple type that is itself fully hydrated.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Comment on lines +202 to +209
// HydrateTypeInfoWithName implements the TypeDescriptor interface.
func (v TableImplicitRecordType) HydrateTypeInfoWithName(
ctx context.Context, typ *types.T, name *tree.TypeName, res catalog.TypeDescriptorResolver,
) error {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be doing validation that this hydration call makes sense?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure what you mean here, but happy to add some validation if you set me in the right direction.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised we don't have this on the type descriptor implementation. What I mean is checking that TypeIDToOID for the ID of the descriptor is equal to typ.OID.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

case descpb.TypeDescriptor_TABLE_IMPLICIT_RECORD_TYPE:
return nil, pgerror.Newf(
pgcode.DependentObjectsStillExist,
"cannot drop type %q because table %q requires it",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

above on line 57 we have access to the prefix for this descriptor. With it we could construct a fully qualified name for all of these errors. I'm sure this %q usage around here is a vestige of the days when we didn't have easy access to the prefix. Any interest in constructing the name and using it in these errors?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this is something we could do in a big batch, since there are a ton of these %q's everywhere. I could file an issue.


// It's an entry for the implicit record type created on behalf of each
// table. We have special logic for this case.
if typDesc.GetKind() == descpb.TypeDescriptor_TABLE_IMPLICIT_RECORD_TYPE {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could we get rid of the addPGTypeRowForTable logic now that we have a type descriptor?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be a nice simplification, but the custom logic encodes a bunch of compatibility things that don't seem easy to refactor away judging by test failures. So I'm going to leave as is.

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @ajwerner, @jordanlewis, @postamar, and @rafiss)

This commit adds a resolvable type with the same name as all tables. The
type is treated as a tuple of all of the non-hidden, non-inaccessible
column types of the input table, in creation order.

The table record type is implemented with a virtual type descriptor that
doesn't have an embedded protobuf type descriptor at all.

The new record types are not usable in table definitions for now, but
there's nothing theoretical stopping this from happening. They get
serialized with the `@<int>` format, so it's "just" a matter of teaching
the validators to deal with UDTs sometimes being physical (a real
TypeDescriptor in system.descriptor for the OID - 100000) and sometimes
virtual (a TableDescriptor for that OID).

Release note (sql change): introduce an implicitly-defined type for
every table, which resolves to a tuple type that contains all of the
columns in the table.
Mechanical change to move the json populate record tests to the
json_builtins file from the json file, for organization purposes.

Release note: None
And fix a small missing error message.

Release note: None
@jordanlewis jordanlewis force-pushed the implicit-record-types branch from 59a1909 to 3e38e62 Compare October 14, 2021 23:51
@jordanlewis
Copy link
Member Author

TFTRs!

bors r+

@craig
Copy link
Contributor

craig bot commented Oct 15, 2021

Build succeeded:

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

sql: create named composite type per table
5 participants