Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: introduce num_voters and voter_constraints #57184

Merged
merged 1 commit into from
Jan 28, 2021

Conversation

aayushshah15
Copy link
Contributor

@aayushshah15 aayushshah15 commented Nov 26, 2020

This commit introduces two new attributes to zone configs: num_voters
and voter_constraints. These attributes semantically act as subsets of
the existing num_replicas and constraints attributes, in that they
apply only to voting replicas.

After this change, when num_voters is explicitly specified,
num_replicas indicates the sum of voting and non-voting replicas.
Likewise, the existing constraints field governs the placement
constraints for all replicas (similarly defaulting to just voting
replicas if non-voting replicas aren't used), whereas the new
voter_constraints field governs placement constraints for just the
voting replicas. This retains backwards compatibility for existing
zone configs that do not use these newly added attributes.

Note that the set of nodes satisfying voter_constraints may not
necessarily be a subset of the set of nodes satisfying constraints.
voter_constraints are simply required to be compatible with
constraints. They can be totally disjoint from the constraints (i.e.
talk about completely different set of attributes).

Fixes #57505

Release note(sql change): Zone configs now support new attributes
num_voters and voter_constraints. num_voters will specify the
number of voting replicas. When num_voters is explicitly specified,
num_replicas will be the sum of voting and non-voting replicas.
voter_constraints will specify the constraints that govern the
placement of just the voting replicas, whereas the existing
constraints attributes will govern the placement of all replicas
(voting as well as non-voting).

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@aayushshah15 aayushshah15 force-pushed the non-voter-zone-configs-1 branch 3 times, most recently from 6c5eb00 to 67eb8ee Compare November 30, 2020 14:13
@aayushshah15 aayushshah15 marked this pull request as ready for review November 30, 2020 14:24
@aayushshah15
Copy link
Contributor Author

This needs a little bit more testing but I wanted to actually implement the allocator-level plumbing/logic (which I'm doing now) for these changes before spending too much time coming up with tests.

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good!

Reviewed 9 of 9 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)


pkg/config/zonepb/zone.go, line 372 at r1 (raw file):

			}
			// TODO(aayush): Allowing these makes validating `voter_constraints`
			// against `constraints` harder. Revisit this decision if need be.

👍


pkg/config/zonepb/zone.go, line 417 at r1 (raw file):

	// voter_constraints = {"+ssd": 3}
	// In the current state of our zone config validation logic, allowing
	// examples like the one shown above also allows the user walk themselves into

"user to walk"


pkg/config/zonepb/zone.go, line 424 at r1 (raw file):

	// voter_constraints = {"region=C": 2, "+region=D": 1}
	//
	// TODO(aayush): We could disallow examples like the one shown above by

Let's keep the discussion on this down on the TODO DURING REVIEW below.


pkg/config/zonepb/zone.go, line 445 at r1 (raw file):

			}
		} else {
			if z.NumReplicas != nil && numConstrainedRepls > int64(*z.NumReplicas) {

Is this branch possible to hit? Won't we have already rejected this with a "voter_constraints can not be set without explicitly setting num_voters as well" error?


pkg/config/zonepb/zone.proto, line 104 at r1 (raw file):

  optional int32 num_replicas = 5 [(gogoproto.moretags) = "yaml:\"num_replicas\""];

  // NumVotersConfiguredSeparately indicates whether voting replicas are

I'm surprised that we need this. Isn't NumVoters != 0 enough to signify this? Is the complexity due to zone config inheritance?

Will this actually show up in a marshaled zone config object? It seems so.


pkg/config/zonepb/zone.proto, line 113 at r1 (raw file):

  // NumVoters specifies the desired number of voter replicas.
  optional int32 num_voters = 12 [(gogoproto.moretags) = "yaml:\"num_voters\""];

Was the plan to mark these as experimental for now? Or did you decide against that?


pkg/config/zonepb/zone.proto, line 128 at r1 (raw file):

compatible

Is this term well defined around here? I'm not familiar enough with the code to know whether this is unambiguous.


pkg/config/zonepb/zone.proto, line 137 at r1 (raw file):

  // InheritedVoterConstraints specifies if the value in the VoterConstraints
  // field was inherited from the zone's parent or specifiedly explicitly by the

s/specifiedly/specified/


pkg/sql/set_zone_config.go, line 796 at r1 (raw file):

	ctx context.Context, getNodes nodeGetter, zone *zonepb.ZoneConfig,
) error {
	if len(zone.VoterConstraints) == 0 && len(zone.Constraints) == 0 && len(zone.LeasePreferences) == 0 {

nit: switch the order here, to keep Constraints before VoterConstraints like you have elsewhere.


pkg/sql/set_zone_config_test.go, line 238 at r1 (raw file):

		{`["+foo=bar"]`, `["+foo=bar"]`, false, ""},
		{`{"+foo=bar": 1, "+foo=duck": 2}`, `{"+foo=bar": 3}`, false, ""},
		// TODO DURING REVIEW: Allowing for configurations like this one seems very

To me, the answer to this question depends on what the behavior is when zone configs are "unsatisfiable". Would we be able to make any claim about where replicas get placed? Would we first try to satisfy the general constraints and then try to satisfy the voter constraints? Or would we ever get in cases where the constraints are satisfiable, but we won't make changes because the voter constraints are preventing us from doing so?

I'm actually not even sure that we can avoid allowing this config. Isn't the plan for zone survivability to have a config that looks something like:

num_replicas = 5
constraints: {"+region=r1": 1, "+region=r2": 1, "+region=r3": 1}
voter_constraints: {"+region=r2": 3}

So in that case, voter_constraints aren't a subset of constraints.


pkg/sql/logictest/testdata/logic_test/zone_config, line 201 at r1 (raw file):


# Check that configuring num_voters separately from num_replicas behaves as
# expected, across setting them directly and through inheritance,

nit: period.

Also, were there supposed to be test cases here? Maybe those are part of what you're holding off on.

@nvanbenschoten nvanbenschoten added the A-multiregion Related to multi-region label Dec 2, 2020
Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15 and @nvanbenschoten)


pkg/config/zonepb/zone.go, line 372 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

👍

What validation do we currently do?
In the context of failovers, it doesn't seem crazy to want to avoid voters, but allow non-voters, in some region.


pkg/config/zonepb/zone.go, line 421 at r1 (raw file):

	// num_replicas = 3
	// num_voters = 3
	// constraints = {"+region=A": 2. "+region=B": 1}

nit: s/./,


pkg/config/zonepb/zone.go, line 422 at r1 (raw file):

	// num_voters = 3
	// constraints = {"+region=A": 2. "+region=B": 1}
	// voter_constraints = {"region=C": 2, "+region=D": 1}

nit: missing "+" ?


pkg/config/zonepb/zone.proto, line 100 at r1 (raw file):

  optional GCPolicy gc = 4 [(gogoproto.customname) = "GC"];

  // NumReplicas specifies the desired number of replicas. It is the sum of the

Nit: saying that this is a "sum" is a weird phrasing, because one of the quantities (the non-voters) is not explicitly represented in this config, so it can't be added. Just say that "this includes voting and non-voting replicas".


pkg/config/zonepb/zone.proto, line 104 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'm surprised that we need this. Isn't NumVoters != 0 enough to signify this? Is the complexity due to zone config inheritance?

Will this actually show up in a marshaled zone config object? It seems so.

+1


pkg/config/zonepb/zone.proto, line 109 at r1 (raw file):

  // indicates the number of voting replicas. Otherwise, `num_replicas`
  // indicates the sum of voting and non-voting replicas.
  // TODO(aayush): Maybe figure out a better name for this.

nit: definitely don't merge a TODO like this :)
What I do is prefix stuff that I plan on dealing with a !!! to make it clear that it's not intended to be committed.


pkg/config/zonepb/zone.proto, line 112 at r1 (raw file):

  optional bool num_voters_configured_separately = 15 [(gogoproto.nullable) = false];

  // NumVoters specifies the desired number of voter replicas.

comment what happens if this isn't specified: NumReplicas represents the voters (right?)


pkg/config/zonepb/zone.proto, line 122 at r1 (raw file):

  // less than ZoneConfig.num_replicas, or there must be no more than a single
  // Constraints field with num_replicas set to 0.
  // TODO(aayush): Check if the per replica constraints even accept a zone

Let's address this todo.


pkg/config/zonepb/zone.proto, line 128 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
compatible

Is this term well defined around here? I'm not familiar enough with the code to know whether this is unambiguous.

Yeah I also was going to ask for more words.


pkg/config/zonepb/zone.proto, line 139 at r1 (raw file):

  // field was inherited from the zone's parent or specifiedly explicitly by the
  // user.
  optional bool inherited_voter_constraints = 14 [(gogoproto.nullable) = false];

here and for the other similar fields, do you mind explaining that this is for disambiguating an overriding falsefrom an empty field.


pkg/sql/set_zone_config.go, line 802 at r1 (raw file):

	// Given that we have something to validate, do the work to retrieve the
	// set of attributes and localities present on at least one node.
	nodes, err := getNodes(ctx, &serverpb.NodesRequest{})

I think we want to exclude the decommissioned nodes here, and I don't think we do so currently. What do you think?
#56529 should help I think


pkg/sql/set_zone_config_test.go, line 231 at r1 (raw file):

	}{
		{``, ``, false, ""},
		{`["-foo=bar"]`, `{"+foo=bar": 1}`, true, "conflicts with voter_constraint"},

these structs are not readable. Please spell out the fields.


pkg/sql/set_zone_config_test.go, line 238 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

To me, the answer to this question depends on what the behavior is when zone configs are "unsatisfiable". Would we be able to make any claim about where replicas get placed? Would we first try to satisfy the general constraints and then try to satisfy the voter constraints? Or would we ever get in cases where the constraints are satisfiable, but we won't make changes because the voter constraints are preventing us from doing so?

I'm actually not even sure that we can avoid allowing this config. Isn't the plan for zone survivability to have a config that looks something like:

num_replicas = 5
constraints: {"+region=r1": 1, "+region=r2": 1, "+region=r3": 1}
voter_constraints: {"+region=r2": 3}

So in that case, voter_constraints aren't a subset of constraints.

Can we not validate the per-replica locality constraints to see if they're satisfiable? I don't know exactly how locality tiers work, but it seems to me like the example you give elsewhere is easy to reject:

// num_replicas = 3
// num_voters = 3
// constraints = {"+region=A": 2. "+region=B": 1}
// voter_constraints = {"region=C": 2, "+region=D": 1}

You can start by assigning the voters according to voter_constraints and then see if there's enough replicas remaining to satisfy the constraints.


pkg/sql/set_zone_config_test.go, line 254 at r1 (raw file):

		zone.NumReplicas = proto.Int32(3)

		if err := yaml.UnmarshalStrict([]byte(`constraints: `+tc.constraints), &zone); err != nil {

you don't like require.NoError()?


pkg/sql/set_zone_config_test.go, line 264 at r1 (raw file):

			require.Regexp(t, tc.errRegex, err)
		} else if tc.shouldFail {
			t.Fatalf("validation unexpectedly succeeded for constraints: %s and voter_constraints: %s", tc.constraints, tc.voterConstraints)

long line

Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is RFAL.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @andreimatei and @nvanbenschoten)


pkg/config/zonepb/zone.go, line 372 at r1 (raw file):

it doesn't seem crazy to want to avoid voters, but allow non-voters, in some region.

Would you be okay with reserving this discussion for a future allocator PR?


pkg/config/zonepb/zone.go, line 417 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

"user to walk"

Fixed.


pkg/config/zonepb/zone.go, line 421 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: s/./,

Fixed.


pkg/config/zonepb/zone.go, line 422 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: missing "+" ?

Fixed.


pkg/config/zonepb/zone.go, line 424 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Let's keep the discussion on this down on the TODO DURING REVIEW below.

👍


pkg/config/zonepb/zone.go, line 445 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Is this branch possible to hit? Won't we have already rejected this with a "voter_constraints can not be set without explicitly setting num_voters as well" error?

It was possible to hit, because we call Validate() before we call ValidateTandemFields(). But this branch here is indeed incorrect because this error doesn't make sense. Should be fixed now.


pkg/config/zonepb/zone.proto, line 100 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Nit: saying that this is a "sum" is a weird phrasing, because one of the quantities (the non-voters) is not explicitly represented in this config, so it can't be added. Just say that "this includes voting and non-voting replicas".

Done.


pkg/config/zonepb/zone.proto, line 104 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

+1

You're right it's redundant. Removed.


pkg/config/zonepb/zone.proto, line 109 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

nit: definitely don't merge a TODO like this :)
What I do is prefix stuff that I plan on dealing with a !!! to make it clear that it's not intended to be committed.

Noted.


pkg/config/zonepb/zone.proto, line 112 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

comment what happens if this isn't specified: NumReplicas represents the voters (right?)

Done.


pkg/config/zonepb/zone.proto, line 113 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Was the plan to mark these as experimental for now? Or did you decide against that?

I know we'd briefly talked about this, but I was hoping to shelve that change until the stability period and see how things shape up once we subject these changes to some randomized testing. How do you feel about that? I'm mostly concerned that the experimental prefix would make it seem like all of non-voting replicas are experimental, which IMO would be too pessimistic.

OTOH I think if we decide to expose an option to dynamically compute num_replicas off of constraints, then that should definitely be experimental.


pkg/config/zonepb/zone.proto, line 122 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Let's address this todo.

Done. They do accept num_replicas=0 and it works as advertised.


pkg/config/zonepb/zone.proto, line 128 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Yeah I also was going to ask for more words.

I added a definition for it here. Lmk if its lacking.


pkg/config/zonepb/zone.proto, line 137 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

s/specifiedly/specified/

Fixed.


pkg/config/zonepb/zone.proto, line 139 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

here and for the other similar fields, do you mind explaining that this is for disambiguating an overriding falsefrom an empty field.

Done.


pkg/sql/set_zone_config.go, line 796 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: switch the order here, to keep Constraints before VoterConstraints like you have elsewhere.

Done.


pkg/sql/set_zone_config.go, line 802 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think we want to exclude the decommissioned nodes here, and I don't think we do so currently. What do you think?
#56529 should help I think

It seems that we do exclude decommissioned nodes after that PR, I just tested it out.


pkg/sql/set_zone_config_test.go, line 231 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

these structs are not readable. Please spell out the fields.

Done.


pkg/sql/set_zone_config_test.go, line 238 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

Can we not validate the per-replica locality constraints to see if they're satisfiable? I don't know exactly how locality tiers work, but it seems to me like the example you give elsewhere is easy to reject:

// num_replicas = 3
// num_voters = 3
// constraints = {"+region=A": 2. "+region=B": 1}
// voter_constraints = {"region=C": 2, "+region=D": 1}

You can start by assigning the voters according to voter_constraints and then see if there's enough replicas remaining to satisfy the constraints.

That example might be easy to reject but note that we have no guarantees about these constraints being non-overlapping and that's where the pernicious cases are. Consider something like:

num_replicas = 5
num_voters = 3
constraints = {"+region=A": 3, "+region=B": 1}
voter_constraints = {"+ssd": 2}

I think in order to do properly accept or reject these sorts of cases, we'd need to lift/duplicate the AnalyzeConstraints() and allocateTargetsFromList() machinery up to this level. Since this is SQL-level validation logic, I'm not sure that's the right thing to do.

I'd prefer to reserve the discussions about placement guarantees in the face of constraint violations for the PR(s) dealing with the allocator. With the current approach of simply ensuring voter_constraints are not directly contradictory to constraints, I think I convinced myself that we shouldn't run into any real issues providing sane semantics for situations where we can place voting replicas without violating constraints but can't place non-voting replicas after doing that.


pkg/sql/set_zone_config_test.go, line 254 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

you don't like require.NoError()?

Done.


pkg/sql/set_zone_config_test.go, line 264 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

long line

Fixed.


pkg/sql/logictest/testdata/logic_test/zone_config, line 201 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

nit: period.

Also, were there supposed to be test cases here? Maybe those are part of what you're holding off on.

I cleaned this part up a bit.


pkg/sql/logictest/testdata/logic_test/zone_config, line 251 at r2 (raw file):

# instead.
statement ok
ALTER TABLE a CONFIGURE ZONE USING num_voters = COPY FROM PARENT

@nvanbenschoten I had this mentioned this^ bit to Andrei but flagging it here so you dont miss it.

On a related note, I'm not sure how to handle the same scenario for voter_constraints = COPY FROM PARENT. To me, copying the parent's constraints value would feel a little counter-intuitive but I don't have a principled argument for it.

Copy link
Contributor

@andreimatei andreimatei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aayushshah15, @andreimatei, and @nvanbenschoten)


pkg/config/zonepb/zone.go, line 372 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

it doesn't seem crazy to want to avoid voters, but allow non-voters, in some region.

Would you be okay with reserving this discussion for a future allocator PR?

yes


pkg/config/zonepb/zone.go, line 324 at r2 (raw file):

			return fmt.Errorf("at least 3 voting replicas are required for multi-replica configurations")
		}
		if z.NumReplicas != nil && *z.NumVoters > *z.NumReplicas {

can NumReplicas be nil whenNumVoters is set? Don't we (want to) require both? Or are you thinking that NumReplicas can be inherited? If it is, is there a similar validation elsewhere that takes into account the inherited value?
I'm thinking maybe it's best to require a NumReplicas when NumVoters is set. Like, you either inherit both or you don't inherit either.


pkg/config/zonepb/zone.proto, line 100 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

Done.

consider spelling out either here or below that there are no non-voters if NumVoters is not set.


pkg/config/zonepb/zone.proto, line 113 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

I know we'd briefly talked about this, but I was hoping to shelve that change until the stability period and see how things shape up once we subject these changes to some randomized testing. How do you feel about that? I'm mostly concerned that the experimental prefix would make it seem like all of non-voting replicas are experimental, which IMO would be too pessimistic.

OTOH I think if we decide to expose an option to dynamically compute num_replicas off of constraints, then that should definitely be experimental.

I personally don't think that an experimental keyword would help (in fact, I've come to believe that it rarely does).


pkg/config/zonepb/zone.proto, line 122 at r1 (raw file):

  // less than ZoneConfig.num_replicas, or there must be no more than a single
  // Constraints field with num_replicas set to 0.
  // TODO(aayush): Check if the per replica constraints even accept a zone

perhaps address this


pkg/config/zonepb/zone.proto, line 113 at r2 (raw file):

  //
  // NOTE: The sum of the num_replicas fields of the Constraints must add up to
  // less than ZoneConfig.num_replicas, or there must be no more than a single

less or equal (iow "at most"), right?


pkg/sql/logictest/testdata/logic_test/zone_config, line 251 at r2 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

@nvanbenschoten I had this mentioned this^ bit to Andrei but flagging it here so you dont miss it.

On a related note, I'm not sure how to handle the same scenario for voter_constraints = COPY FROM PARENT. To me, copying the parent's constraints value would feel a little counter-intuitive but I don't have a principled argument for it.

So what's the alternative again?
If I set num_replicas to 100, and then I say num_voters = COPY FROM PARENT, I think I wouldn't expect to end up with 100 voters.

In fact I'm wondering if we should go even further: if I say num_replicas = COPY FROM PARENT, and I don't have num_voters set and I'm also not setting it in the same statement, should I be getting the parent's num_voters (if there is one) perhaps?


pkg/sql/opt/cat/zone.go, line 27 at r2 (raw file):

	// are part of this zone.
	//
	// TODO(aayush): Go through the callers of the methods here and decide the

I think the time for this is now :)

Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 10 of 10 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @aayushshah15 and @andreimatei)


pkg/config/zonepb/zone.proto, line 113 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I personally don't think that an experimental keyword would help (in fact, I've come to believe that it rarely does).

I think I now agree that we should omit the experimental.


pkg/sql/set_zone_config.go, line 802 at r1 (raw file):

Previously, aayushshah15 (Aayush Shah) wrote…

It seems that we do exclude decommissioned nodes after that PR, I just tested it out.

Yep, that PR addresses this.


pkg/sql/set_zone_config_test.go, line 238 at r1 (raw file):

I'd prefer to reserve the discussions about placement guarantees in the face of constraint violations for the PR(s) dealing with the allocator. With the current approach of simply ensuring voter_constraints are not directly contradictory to constraints, I think I convinced myself that we shouldn't run into any real issues providing sane semantics for situations where we can place voting replicas without violating constraints but can't place non-voting replicas after doing that.

This all SGTM. I agree that this should be a concern of the allocator and not of the zone config validation logic.


pkg/sql/logictest/testdata/logic_test/zone_config, line 251 at r2 (raw file):

If I set num_replicas to 100, and then I say num_voters = COPY FROM PARENT, I think I wouldn't expect to end up with 100 voters.

Agreed.

This discussion seems more complicated than strictly necessary. What's the argument for doing anything more than copying where indicated and then letting the normal validation logic decide whether to reject or not? Is there precedent for a COPY FROM PARENT for one field having an impact on another field?


pkg/sql/opt/cat/zone.go, line 27 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

I think the time for this is now :)

+1

Copy link
Contributor Author

@aayushshah15 aayushshah15 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @andreimatei and @nvanbenschoten)


pkg/config/zonepb/zone.go, line 324 at r2 (raw file):

can NumReplicas be nil whenNumVoters is set?

Nope, NumReplicas can actually never be nil, regardless of whether NumVoters is set or not but I think there might be a unit test or two that triggers some code paths with an empty zone config. I don't really like the fact that we're doing this z.NumReplicas != nil (and similar for other fields) dance everywhere but it's just a consequence of the fact that these fields were (perhaps mistakenly) defined as nullable.

If it is, is there a similar validation elsewhere that takes into account the inherited value?

This is done inside ValidateTandemFields which is called after.


pkg/config/zonepb/zone.proto, line 100 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

consider spelling out either here or below that there are no non-voters if NumVoters is not set.

Added a phrase below.


pkg/config/zonepb/zone.proto, line 113 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I think I now agree that we should omit the experimental.

👍


pkg/config/zonepb/zone.proto, line 122 at r1 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

perhaps address this

Do you mean we should remove support for that kind of zone config constraint? For context, here is the current behavior:

[email protected]:26257/movr> alter table a configure zone using constraints='{"+region=europe-west1": 0, "+region=us-east1": 1}';
ERROR: could not validate zone config: constraints must apply to at least one replica
SQLSTATE: 23514

[email protected]:26257/movr> alter table a configure zone using constraints='{"+region=europe-west1": 0}';
CONFIGURE ZONE 1

Time: 18ms total (execution 18ms / network 0ms)

"+region=europe-west1": 0 is treated the same as "+region=europe-west1" in one context but not another. I quite dislike it and think we should change it, but not in this PR. What do you think?


pkg/config/zonepb/zone.proto, line 113 at r2 (raw file):

Previously, andreimatei (Andrei Matei) wrote…

less or equal (iow "at most"), right?

Corrected.


pkg/sql/set_zone_config_test.go, line 238 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

I'd prefer to reserve the discussions about placement guarantees in the face of constraint violations for the PR(s) dealing with the allocator. With the current approach of simply ensuring voter_constraints are not directly contradictory to constraints, I think I convinced myself that we shouldn't run into any real issues providing sane semantics for situations where we can place voting replicas without violating constraints but can't place non-voting replicas after doing that.

This all SGTM. I agree that this should be a concern of the allocator and not of the zone config validation logic.

👍


pkg/sql/logictest/testdata/logic_test/zone_config, line 251 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

If I set num_replicas to 100, and then I say num_voters = COPY FROM PARENT, I think I wouldn't expect to end up with 100 voters.

Agreed.

This discussion seems more complicated than strictly necessary. What's the argument for doing anything more than copying where indicated and then letting the normal validation logic decide whether to reject or not? Is there precedent for a COPY FROM PARENT for one field having an impact on another field?

It's mostly a question of, when the user runs num_voters = COPY FROM PARENT are they expecting to get the "effective number of voting replicas on the parent" or strictly the value of the field. However, for now I'm going change it such that we're strictly copying the value of the field from the parent, without any magic.


pkg/sql/opt/cat/zone.go, line 27 at r2 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

+1

Created #59482 to track this.

@aayushshah15 aayushshah15 force-pushed the non-voter-zone-configs-1 branch 2 times, most recently from 0c2c6e8 to d44fd27 Compare January 27, 2021 20:25
@aayushshah15
Copy link
Contributor Author

I'm going to optimistically merge this when CI is green. There's a discussion to be had with regards to the behavior of "+<constraint>": 0, but that's not pertinent to this PR and we should deal with it separately.

@aayushshah15 aayushshah15 force-pushed the non-voter-zone-configs-1 branch from d44fd27 to 409a6e5 Compare January 27, 2021 21:04
This commit introduces two new attributes to zone configs: `num_voters`
and `voter_constraints`. These attributes semantically act as subsets of
the existing `num_replicas` and `constraints` attributes, in that they
apply _only_ to voting replicas.

After this change, when `num_voters` is explicitly specified,
`num_replicas` indicates the sum of voting and non-voting replicas.
Likewise, the existing `constraints` field governs the placement
constraints for all replicas (similarly defaulting to just voting
replicas if non-voting replicas aren't used), whereas the new
`voter_constraints` field governs placement constraints for just the
voting replicas. *This retains backwards compatibility* for existing
zone configs that do not use these newly added attributes.

Note that the set of nodes satisfying `voter_constraints` may not
necessarily be a subset of the set of nodes satisfying `constraints`.
`voter_constraints` are simply required to be compatible with
`constraints`. They can be totally disjoint from the `constraints` (i.e.
talk about completely different set of attributes).

Release note (sql change): Zone configs now support new attributes
`num_voters` and `voter_constraints`. `num_voters` will specify the
number of voting replicas. When `num_voters` is explicitly specified,
`num_replicas` will be the sum of voting and non-voting replicas.
`voter_constraints` will specify the constraints that govern the
placement of just the voting replicas, whereas the existing
`constraints` attributes will govern the placement of all replicas
(voting as well as non-voting).
@aayushshah15 aayushshah15 force-pushed the non-voter-zone-configs-1 branch from 06a9b3d to 836f39c Compare January 28, 2021 03:57
@aayushshah15
Copy link
Contributor Author

TFTRs!

bors r=andreimatei,nvanbenschoten

@craig
Copy link
Contributor

craig bot commented Jan 28, 2021

Build succeeded:

@craig craig bot merged commit e753849 into cockroachdb:master Jan 28, 2021
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Jan 29, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Jan 31, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Jan 31, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Jan 31, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Feb 1, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
aayushshah15 added a commit to aayushshah15/cockroach that referenced this pull request Feb 1, 2021
After the introduction of cockroachdb#57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes cockroachdb#57663

Release note: None
craig bot pushed a commit that referenced this pull request Feb 1, 2021
58436: server: add /api/v2/ tree with auth/pagination, port listSessions endpoint r=itsbilal a=itsbilal

This change adds the skeleton for a new API tree that lives in
`/api/v2/` in the http listener, and currently reimplements the `/sessions/`
endpoint that is also implemented in `/_status/`. The new v2 API tree avoids
the need to use GRPC Gateway, as well as cookie-based authentication which is
less intuitive and idiomatic for REST APIs. Instead, for authentication,
it uses a new session header that needs to be set on every request.

As many RPC fan-out APIs use statusServer.iterateNodes, this change
implements a pagination-aware method, paginatedIterateNodes, that
works on a sorted set of node IDs and arranges results in such a way
to be able to return the next `limit` results of an arbitary slice
after the `next` cursor passed in. An example of how this works in practice
is the new `/api/v2/sessions/` endpoint.

A dependency on gorilla/mux is added to be able to pattern-match
arguments in the URL. This was already an indirect dependency; now it's
a direct dependency of cockroach.

TODO that are likely to fall over into future PRs:
 - API Documentation, need to explore using swagger here.
 - Porting over remaining /_admin/ and /_status/ APIs, incl. SQL based ones

Part of #55947.

Release note (api change): Adds a new API tree, in /api/v2/*, currently
undocumented, that avoids the use of and cookie-based
authentication in favour of sessions in headers, and support for pagination.

58906: sql: improve interface for injecting descriptors into internal executor r=lucy-zhang a=lucy-zhang

As part of validating new schema elements (unique indexes, constraints)
in the schema changer, we inject in-memory descriptors to be used by the
internal executor that are never written to disk.

This is currently done by creating an entirely new dummy
`descs.Collection` and adding the injected descriptors as uncommitted
descriptors, and then setting this collection as the `tcModifier` on the
internal executor.  Then all subsequent queries using the internal
executor, which each get their own child `connExecutor` and
`descs.Collection`, will have their collection's uncommitted descriptors
replaced by the ones in the dummy collection.

This commit introduces the concept of a "synthetic descriptor" to refer
to these injected descriptors, and slightly improves the interface in
two ways:

1. Instead of creating a new `descs.Collection` to hold synthetic
   descriptors to copy, we now just use a slice of
   `catalog.Descriptor`s.
2. Synthetic descriptors are now made explicit in the `descs.Collection`
   and precede uncommitted descriptors when resolving immutable
   descriptors. Resolving these descriptors mutably is now illegal and
   will return an assertion error.

This commit doesn't change the fact that the synthetic descriptors to be
injected into each query/statement's child `descs.Collection` are set
on the internal executor itself. This is still not threadsafe. It would
make more sense for these descriptors to be scoped at the level of
individual queries.

Related to #34304.

Release note: None

59258: changefeedccl: Freeze table name to the (optionally fully qualified) statement time name r=[miretskiy] a=HonoreDB

Previously, Kafka topics and top-level keys were always derived from the
table name in changefeeds. If the table name changed, the feed
eventually failed, and if the table name was non-unique across
databases, collisions were unavoidable. This PR adds a WITH
full_table_name option to changefeeds, and honors it by serializing
movr.public.drivers as the statement time name and relying on that.

There are probably more things that need to change downstream.

Release note (sql change): Added "WITH full_table_name" option to create
a changefeed on "movr.public.drivers" instead of
"drivers".

59281: sql: prevent DROP SCHEMA CASCADE from droping types with references r=ajwerner a=ajwerner

Before this patch, a DROP SCHEMA CASCADE could cause database corruption
by dropping types which were referenced by other tables. This would lead to
bad errors like:

```
ERROR: object already exists: desc 687: type ID 685 in descriptor not found: descriptor not found
SQLSTATE: 42710
```

And doctor errors like:
```
   Table 687: ParentID  50, ParentSchemaID 29, Name 't': type ID 685 in descriptor not found: descriptor not found
```

Fixes #59021.

Release note (bug fix): Fixed a bug where `DROP SCHEMA ... CASCADE` could
result in types which are referenced being dropped.

59591: sql: hook up multi-region DDL to new zone config attributes r=aayushshah15 a=aayushshah15

After the introduction of #57184, we can constrain voting replicas
separately from non-voting replicas using the new attributes
`num_voters` and `voter_constraints`. This commit connects our
multi-region DDL statements to leverage these new attributes.

Broadly speaking,
- The existing `constraints` and `num_replicas` fields are set at the
database level, to be inherited by table/partition level zone configs.
- The new attributes: `num_voters` and `voter_constraints` (along with
accompanying `lease_preferences` for these voters) are set at the
table/partition level.

This brings the implementation of these DDL statements inline with the
functional specfication.

Fixes #57663

Release note: None


59659: sql: fix bug preventing adding FKs referencing hidden columns r=lucy-zhang a=lucy-zhang

The validation query for adding foreign keys had a pointless `SELECT *`
on the referenced table that caused hidden columns to be omitted,
so attempting to add foreign key constraints referencing hidden columns
would fail. This PR fixes the query.

Fixes #59582.

Release note (bug fix): Fixed a bug preventing foreign key constraints
referencing hidden columns (e.g., `rowid`) from being added.

Co-authored-by: Bilal Akhtar <[email protected]>
Co-authored-by: Lucy Zhang <[email protected]>
Co-authored-by: Aaron Zinger <[email protected]>
Co-authored-by: Andrew Werner <[email protected]>
Co-authored-by: Aayush Shah <[email protected]>
@aayushshah15
Copy link
Contributor Author

I'm surprised that we need this. Isn't NumVoters != 0 enough to signify this? Is the complexity due to zone config inheritance?

You're right it's redundant. Removed.

I think I messed up there^

Today's slack discussion with @ajstorm (https://cockroachlabs.slack.com/archives/C4ERHM60Z/p1614652525068100?thread_ts=1614646058.057900&cid=C4ERHM60Z) and my conversation with @otan made me realize that we do indeed need something like the NumVotersConfiguredSeparately field I ended up removing from this PR.

To see why we need something like it, consider a database level zone config and a table-level zone config. At the table level, if we see a nil NumVoters, we need to be able to disambiguate whether

  • NumVoters was never separately configured (so we need to treat NumReplicas as the number of voters)
    or
  • it is to be inherited from the database level zone config.

Currently, on master, we will always assume that it is to be inherited from the database level zone config, which is wrong.

Additionally, as @otan pointed out, pre-21.1 zone configs will be unmarshalled onto a 21.1 ZoneConfig proto which will have the InheritedVoterConstraints field set to false. This means that we're always assuming that a pre 21.1 zone config has explicitly configured zone config constraints. The aforementioned field is the additional bit of info that we need to resolve these cases.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multiregion Related to multi-region
Projects
None yet
Development

Successfully merging this pull request may close these issues.

config: new config syntax to enable creation of non voting replicas
4 participants