Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sqlliveness: embed current region into session ID #91694

Merged

Conversation

jaylim-crl
Copy link
Collaborator

@jaylim-crl jaylim-crl commented Nov 10, 2022

Follow up on #90408 and #91019.

Previously, the sqlliveness subsystem was using enum.One when constructing the
session ID since it doesn't have the regional data yet. This commit implements
that missing part of it by ensuring that the regional representation is
plumbed all the way to the sqlliveness subsystem whenever the SQL pod is
started, and use it to construct the session ID. This will enable our REGIONAL
BY ROW work to put the data in the right region.

The multi-region enum for the system database will be read during startup for
that to work. Since doing this already requires loading the system DB's
descriptor (which indirectly tests whether the system DB has been bootstrapped)
for the tenant, we can remove the call to checkTenantExists.

Epic: None

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from b1c1763 to d82a4e1 Compare November 10, 2022 20:15
Copy link
Collaborator Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would need to write a test for the DescsTxn work in server_sql.go, but the rest should be here.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained

@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from b5c2d7d to ada1cb3 Compare November 11, 2022 06:00
@jaylim-crl jaylim-crl marked this pull request as ready for review November 11, 2022 10:38
@jaylim-crl jaylim-crl requested review from a team as code owners November 11, 2022 10:38
@jaylim-crl jaylim-crl requested review from a team and rhu713 and removed request for a team and rhu713 November 11, 2022 10:38
@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from ada1cb3 to d900b6b Compare November 12, 2022 00:15
Copy link
Contributor

@postamar postamar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just have a couple of high-level comments here, which may be wrong. @ajwerner will no doubt to look at this more closely and have stronger opinions than me :)

pkg/sql/sqlliveness/sqlliveness.go Outdated Show resolved Hide resolved
pkg/server/server_sql.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 5 of 7 files at r1, 6 of 6 files at r4, 3 of 3 files at r5, 3 of 3 files at r6.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jeffswenson and @postamar)


pkg/server/server_sql.go line 1320 at r7 (raw file):

Previously, postamar (Marius Posta) wrote…

We have a SynthesizeRegionConfig function in sql which does most of this already, would it be possible to use it? You'd have to extend multiregion.RegionConfig to include the physical representation, presumably, but that seems reasonable.

My vote is that we share the parts that are used also by SynthesizeRegionConfig and make this a function in regionutil also.


pkg/sql/sqlliveness/sqlliveness.go line 34 at r7 (raw file):

Previously, postamar (Marius Posta) wrote…

This dependency of catalog on eval is a shameful wart which the SQL Schema team has meant to remove for a while now but still hasn't. Sorry about that!

In any case even if that obstacle weren't in the way, the dependency structure of your change bothers me a bit. I may not have the full picture here but it feels like you're pushing too much high-level logic into SetRegionalData when in fact it would be better off being more "dumb" and instead just take the []bytes for the physical representation of the region, either nakedly or wrapped in some lightweight abstraction.

This could in turn originate from a new method you could add on multiregion.RegionConfig and which you call in the server package. This way no need to shadow catalog.TypeDescriptor at all, which is just as well.

FWIW, I think this narrowing of an interface is totally fine. I'm not sure I'm in love with falling back to the *descpb.TypeDescriptor. I think you could narrow it even further if you wanted. My 2c is you could just as well pass a struct that has the primary region, the process's locality, and the mapping of region to prefix bytes?

@postamar
Copy link
Contributor

I'm not sure I'm in love with falling back to the *descpb.TypeDescriptor

I'd go stronger in that we definitely should not do that! If we want to go further on this path of defining a sqlliveness.TypeDescriptor shadow-interface, the thing to do would be: remove the TypeDesc method and add RegionNames, NumEnumMembers, GetMemberPhysicalRepresentation and GetMemberLogicalRepresentation and implement SetRegionalData using those. It's a bit contrived, but the type descriptor protobuf has no place there.

@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from d900b6b to b79d155 Compare November 14, 2022 22:39
Copy link
Collaborator Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs! I cleaned up sqlliveness to only take in a []byte slice so no additional dependencies are introduced. The logic of choosing the current region was pushed to the caller as well.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffswenson, and @postamar)


pkg/server/server_sql.go line 1320 at r7 (raw file):

Previously, ajwerner wrote…

My vote is that we share the parts that are used also by SynthesizeRegionConfig and make this a function in regionutil also.

Done. Moved all the main logic to GetRegionEnumRepresentations. Did not extend multiregion.RegionConfig since we have multiple callers that were constructing regional configs, and don't have the physical representations available.


pkg/sql/region_util.go line 1554 at r8 (raw file):

			CommonLookupFlags: tree.CommonLookupFlags{
				AvoidLeased:    !useCache,
				Required:       true,

Note that one additional change here is to specify Required=true when looking up the multi-region enum descriptor, which wasn't done previously. I'd imagine this should be OK.


pkg/sql/sqlliveness/sqlliveness.go line 50 at r1 (raw file):

Previously, JeffSwenson (Jeff Swenson) wrote…

nit: Instead of introducing the SetRegionalData method, the locality information should be an argument to Start. SetRegionalData must be called before Start and the easiest way to ensure that is to fuse the methods into one.

Done.


pkg/sql/sqlliveness/sqlliveness.go line 34 at r7 (raw file):

Previously, ajwerner wrote…

FWIW, I think this narrowing of an interface is totally fine. I'm not sure I'm in love with falling back to the *descpb.TypeDescriptor. I think you could narrow it even further if you wanted. My 2c is you could just as well pass a struct that has the primary region, the process's locality, and the mapping of region to prefix bytes?

Done. Updated it to take in a []byte instead so no additional dependencies are required.

pkg/server/server_sql.go Outdated Show resolved Hide resolved
pkg/server/server_sql.go Outdated Show resolved Hide resolved
pkg/sql/region_util.go Outdated Show resolved Hide resolved
pkg/sql/sqlliveness/slinstance/slinstance.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is getting close.

Reviewed 1 of 2 files at r7, 11 of 12 files at r8, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jaylim-crl, @jeffswenson, and @postamar)


pkg/sql/region_util.go line 1554 at r8 (raw file):

Previously, jaylim-crl (Jay Lim) wrote…

Note that one additional change here is to specify Required=true when looking up the multi-region enum descriptor, which wasn't done previously. I'd imagine this should be OK.

yeah, for bad reasons, it was implied.

// Required is ignored, and an error is always returned if no descriptor with
// the ID exists.


pkg/sql/sqlliveness/sqlliveness.go line 37 at r8 (raw file):

// consumption.
type Provider interface {
	Start(ctx context.Context, regionPhysicalRep []byte)

Please provide commentary explaining what this is. I even think you could define a type to make it read more cleanly, even if it's a type alias like like type RegionEnumValue = []byte


pkg/sql/sqlliveness/slinstance/slinstance.go line 406 at r8 (raw file):

Previously, jaylim-crl (Jay Lim) wrote…

There was a started bool back then, but has been removed in e1d5835. We should be able to remove the mutex.

// Start runs the hearbeat loop.
func (l *Instance) Start(ctx context.Context) {
	l.mu.Lock()
	defer l.mu.Unlock()
	if l.mu.started {
		return
	}
	log.Infof(ctx, "starting SQL liveness instance")
	_ = l.stopper.RunAsyncTask(ctx, "slinstance", l.heartbeatLoop)
	l.mu.started = true
}

Yeah, I think the mutex is stale

Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @jaylim-crl, @jeffswenson, and @postamar)


pkg/sql/sqlliveness/slinstance/slinstance.go line 406 at r8 (raw file):

Previously, jaylim-crl (Jay Lim) wrote…

Will be removed in next push. Consider this done.

Did you forget to push?

@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from 4073496 to 76c4841 Compare November 18, 2022 16:45
Copy link
Collaborator Author

@jaylim-crl jaylim-crl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTRs!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @jeffswenson, and @postamar)


pkg/sql/region_util.go line 1500 at r8 (raw file):

Previously, JeffSwenson (Jeff Swenson) wrote…

nit: GetRegionEnumRepresentations is a good candidate for unit testing.

Done.


pkg/sql/region_util.go line 1554 at r8 (raw file):

Previously, ajwerner wrote…

yeah, for bad reasons, it was implied.

// Required is ignored, and an error is always returned if no descriptor with
// the ID exists.

Done.


pkg/sql/sqlliveness/sqlliveness.go line 37 at r8 (raw file):

Previously, ajwerner wrote…

Please provide commentary explaining what this is. I even think you could define a type to make it read more cleanly, even if it's a type alias like like type RegionEnumValue = []byte

Done. I left it as []byte for now. Maybe when we start taking in the region physical representation in more places in the future, we can revisit this decision.


pkg/sql/sqlliveness/slinstance/slinstance.go line 406 at r8 (raw file):

Previously, ajwerner wrote…

Did you forget to push?

Done.

jaylim-crl added a commit to jaylim-crl/cockroach that referenced this pull request Nov 21, 2022
Follow up to cockroachdb#92010 and cockroachdb#91694.

This commit addresses a TODO comment by loading all region enum members from
the system database when reclaiming IDs. The list of regions is read each time
the loop runs since there's a possibiilty where regions are added or removed
from the system DB.

Epic: None

Release note: None
@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from 76c4841 to f7fb96e Compare November 22, 2022 12:28
Copy link
Collaborator

@jeffswenson jeffswenson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

pkg/ccl/multiregionccl/multiregion_system_table_test.go Outdated Show resolved Hide resolved
pkg/ccl/multiregionccl/multiregion_system_table_test.go Outdated Show resolved Hide resolved
Copy link
Contributor

@ajwerner ajwerner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm: 🚢🇮🇹

Reviewed 6 of 9 files at r9, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @jaylim-crl, @jeffswenson, and @postamar)

Follow up on cockroachdb#90408 and cockroachdb#91019.

Previously, the sqlliveness subsystem was using enum.One when constructing the
session ID since it doesn't have the regional data yet. This commit implements
that missing part of it by ensuring that the regional representation is
plumbed all the way to the sqlliveness subsystem whenever the SQL pod is
started, and use it to construct the session ID. This will enable our REGIONAL
BY ROW work to put the data in the right region.

The multi-region enum for the system database will be read during startup for
that to work. Since doing this already requires loading the system DB's
descriptor (which indirectly tests whether the system DB has been bootstrapped)
for the tenant, we can remove the call to checkTenantExists.

Epic: None

Release note: None
@jaylim-crl jaylim-crl force-pushed the jay/221109-sql-multiregion-enum branch from f7fb96e to 3685865 Compare November 23, 2022 00:40
@jaylim-crl
Copy link
Collaborator Author

TFTRs!

bors r=JeffSwenson,ajwerner

@craig
Copy link
Contributor

craig bot commented Nov 23, 2022

Timed out.

@jaylim-crl
Copy link
Collaborator Author

bors r=JeffSwenson,ajwerner

jaylim-crl added a commit to jaylim-crl/cockroach that referenced this pull request Nov 23, 2022
Follow up to cockroachdb#92010 and cockroachdb#91694.

This commit addresses a TODO comment by loading all region enum members from
the system database when reclaiming IDs. The list of regions is read each time
the loop runs since there's a possibiilty where regions are added or removed
from the system DB.

Epic: None

Release note: None
@craig
Copy link
Contributor

craig bot commented Nov 23, 2022

Build succeeded:

@craig craig bot merged commit 5ebb4ca into cockroachdb:master Nov 23, 2022
jaylim-crl added a commit to jaylim-crl/cockroach that referenced this pull request Nov 23, 2022
Follow up to cockroachdb#92010 and cockroachdb#91694.

This commit addresses a TODO comment by loading all region enum members from
the system database when reclaiming IDs. The list of regions is read each time
the loop runs since there's a possibiilty where regions are added or removed
from the system DB.

Epic: None

Release note: None
jeffswenson pushed a commit to jeffswenson/cockroach that referenced this pull request Nov 30, 2022
Follow up to cockroachdb#92010 and cockroachdb#91694.

This commit addresses a TODO comment by loading all region enum members from
the system database when reclaiming IDs. The list of regions is read each time
the loop runs since there's a possibiilty where regions are added or removed
from the system DB.

Epic: None

Release note: None
@jaylim-crl jaylim-crl deleted the jay/221109-sql-multiregion-enum branch November 30, 2022 16:54
craig bot pushed a commit that referenced this pull request Dec 5, 2022
92035: sql: use collection in zone config hydration r=chengxiong-ruan a=chengxiong-ruan

part of #88571

Previously, we use a `getKey` function (which is basically a
kv lookup) to fetch zone configs and table descriptors required
for zone config hydration. It has redundant work of deserializing
protobufs done by descriptor collection. This commit change it
to use a helper interface to lookup zoneconfig and descriptor
directly. This also make sure that we ustilize in cache zone configs
when possible.

Release note: None

92248: instancestorage: load all regions from the system DB when reclaiming IDs r=JeffSwenson a=jaylim-crl

Follow up to #92010 and #91694.

This commit addresses a TODO comment by loading all region enum members from
the system database when reclaiming IDs. The list of regions is read each time
the loop runs since there's a possibiilty where regions are added or removed
from the system DB.

Epic: None

Release note: None

92709: multitenant: replace EXPERIMENTAL_RELOCATE StoreID to NodeID lookup r=arulajmani a=ecwall

Fixes #91628

Replace EXPERIMENTAL_RELOCATE's StoreID -> NodeID lookup's gossip impl with an in-memory cache.

Release note: None

92970: kvserver: track allocator errors via replicate queue action metrics r=AlexTalks a=AlexTalks

While allocator errors were intended to be reported via the "Replicate Queue Failures by Allocator Action" metric, i.e.
`queue.replicate.<action>.error`, these were not getting reported. This change makes sure to report these errors whenever we are not in a dry run.

Epic: None

Release note: None

Co-authored-by: Chengxiong Ruan <[email protected]>
Co-authored-by: Jay <[email protected]>
Co-authored-by: Evan Wall <[email protected]>
Co-authored-by: Alex Sarkesian <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants