Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql: gcjob and opt need ZoneConfig which is not available on tenants #49445

Closed
tbg opened this issue May 22, 2020 · 5 comments · Fixed by #49784 or #52034
Closed

sql: gcjob and opt need ZoneConfig which is not available on tenants #49445

tbg opened this issue May 22, 2020 · 5 comments · Fixed by #49784 or #52034
Assignees
Labels
A-multitenancy Related to multi-tenancy

Comments

@tbg
Copy link
Member

tbg commented May 22, 2020

See

zoneCfg, placeholder, _, err := sql.ZoneConfigHook(cfg, uint32(tableID))

The way we've designed clearing up after a DROP TABLE is fundamentally incompatible with a hard split between KV and SQL, unless we add a way for the SQL layer to figure out the zone config for a given table from the KV backend. The other way to do this is to push the responsibility for cleaning up into KV, which is a bigger change but is "better" in the sense that it doesn't rely on the SQL tenant process running (which often won't be the case after the TTL).

We should be able to hard-code a value here instead for the time being.

Similar in opt:

adjustment := 1.0 - localityMatchScore(idx.Zone(), c.locality)

We have a pretty hard dependency on reasoning about the physical location of data. Here there seems no way around it, we need to be able to resolve table ID to localities, and this becomes a blocker the moment we go geo-distributed.

cc @nvanbenschoten

@tbg tbg added the A-multitenancy Related to multi-tenancy label May 22, 2020
@blathers-crl
Copy link

blathers-crl bot commented May 22, 2020

Hi @tbg, please add a C-ategory label to your issue. Check out the label system docs.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is otan.

@tbg tbg changed the title sql: gcjob needs ZoneConfig which is not available on tenants sql: gcjob and opt need ZoneConfig which is not available on tenants May 22, 2020
@tbg
Copy link
Member Author

tbg commented May 22, 2020

The way the dependency goes here is weird enough to mention. The ZoneConfigHook lives in sql:

// ZoneConfigHook returns the zone config for the object with id using the
// cached system config. If keySuffix is within a subzone, the subzone's config
// is returned instead. The bool is set to true when the value returned is
// cached.
func ZoneConfigHook(
cfg *config.SystemConfig, id uint32,
) (*zonepb.ZoneConfig, *zonepb.ZoneConfig, bool, error) {
getKey := func(key roachpb.Key) (*roachpb.Value, error) {
return cfg.GetValue(key), nil
}
zoneID, zone, _, placeholder, err := getZoneConfig(
id, getKey, false /* getInheritedDefault */)
if errors.Is(err, errNoZoneConfigApplies) {
return nil, nil, true, nil
} else if err != nil {
return nil, nil, false, err
}
if err = completeZoneConfig(zone, zoneID, getKey); err != nil {
return nil, nil, false, err
}
return zone, placeholder, true, nil
}

The zone configs are stored in the system tenant:

// MakeZoneKey returns the key for id's entry in the system.zones table.
func MakeZoneKey(id uint32) roachpb.Key {
return keys.SystemSQLCodec.ZoneKey(id)
}

So roughly what we want is some way for non-system tenants to get (transactional?) read-only access to the zones covering their keyspace.

This all seems doable. I wouldn't guess that we need transactional access here, at least not for gcjob or opt.

@tbg
Copy link
Member Author

tbg commented May 27, 2020

cc @nvanbenschoten since this came up tangentially in #49459

@tbg
Copy link
Member Author

tbg commented May 28, 2020

Just discussed: we make it optional now (bypassing any uses of it), but down the road we should allow the tenant to read the system configs that pertain to it.

@tbg
Copy link
Member Author

tbg commented May 28, 2020

We need to figure something out though. system.zones is keyed on the system tenant's table descriptors. Exceptions (liveness ranges, etc) get hard-coded descriptor IDs. This doesn't extend to tenants. Need to discuss more and come up with a way in which this will work once we need it.

@tbg tbg added A-multitenancy-blocker and removed A-multitenancy Related to multi-tenancy labels May 28, 2020
@nvanbenschoten nvanbenschoten self-assigned this Jun 2, 2020
@nvanbenschoten nvanbenschoten added the A-multitenancy Related to multi-tenancy label Jun 2, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 2, 2020
Fixes cockroachdb#49318.
Fixes cockroachdb#49445.
Progress towards cockroachdb#48123.
Informs cockroachdb#48774.

This commit introduces a new pseudo object ID in the system tenant's
namespace called "tenants". Like "liveness" and "timeseries" before it,
the pseudo object allows zone configurations to be applied to
pseudo-objects that do not live directly in the system tenant's SQL
keyspace. In this case, the "tenants" zone allows zone configurations to
be set by the system tenant and applied to all other tenants in the
system. There may come a time when we want secondary tenants to have
more control over their zone configurations, but that will require a
much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration
handling and how it relates to multi-tenancy. Now that secondary tenant
ranges have a zone config to call their own, we can make sense of calls
from SQL into the zone configuration infrastructure. We gate off calls
that don't make sense for secondary tenants and clean up hooks in SQL
that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses
of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing
us a big step closer towards being able to remove the placeholder and
close cockroachdb#48123.

This work also reveals that in order to address cockroachdb#48774, we need to be
able to determine split points from the SystemConfig. This makes it very
difficult to split on secondary tenant object (e.g. table) boundaries.
However, it makes it straightforward to split on secondary tenant
keysapce boundaries. This is already what we were thinking (see cockroachdb#47907),
so out both convenience and desire, I expect that we'll follow this up
with a PR that splits Ranges only at secondary tenant boundaries -
placing the overhead of an otherwise empty tenant at only a single Range
and a few KBs of data.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 3, 2020
Fixes cockroachdb#49318.
Fixes cockroachdb#49445.
Progress towards cockroachdb#48123.
Informs cockroachdb#48774.

This commit introduces a new pseudo object ID in the system tenant's
namespace called "tenants". Like "liveness" and "timeseries" before it,
the pseudo object allows zone configurations to be applied to
pseudo-objects that do not live directly in the system tenant's SQL
keyspace. In this case, the "tenants" zone allows zone configurations to
be set by the system tenant and applied to all other tenants in the
system. There may come a time when we want secondary tenants to have
more control over their zone configurations, but that will require a
much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration
handling and how it relates to multi-tenancy. Now that secondary tenant
ranges have a zone config to call their own, we can make sense of calls
from KV into the zone configuration infrastructure. We gate off calls
that don't make sense for secondary tenants and clean up hooks in SQL
that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses
of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing
us a big step closer towards being able to remove the placeholder and
close cockroachdb#48123.

This work also reveals that in order to address cockroachdb#48774, we need to be
able to determine split points from the SystemConfig. This makes it very
difficult to split on secondary tenant object (e.g. table) boundaries.
However, it makes it straightforward to split on secondary tenant
keysapce boundaries. This is already what we were thinking (see cockroachdb#47907),
so out both convenience and desire, I expect that we'll follow this up
with a PR that splits Ranges only at secondary tenant boundaries -
placing the overhead of an otherwise empty tenant at only a single Range
and a few KBs of data.
craig bot pushed a commit that referenced this issue Jun 4, 2020
49784: config: introduce pseudo "tenants" zone r=nvanbenschoten a=nvanbenschoten

Fixes #49318.
Fixes #49445.
Progress towards #48123.
Informs #48774.

This commit introduces a new pseudo object ID in the system tenant's namespace called "tenants". Like "liveness" and "timeseries" before it, the pseudo object allows zone configurations to be applied to pseudo-objects that do not live directly in the system tenant's SQL keyspace. In this case, the "tenants" zone allows zone configurations to be set by the system tenant and applied to all other tenants in the system. There may come a time when we want secondary tenants to have more control over their zone configurations, but that will require a much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration handling and how it relates to multi-tenancy. Now that secondary tenant ranges have a zone config to call their own, we can make sense of calls from SQL into the zone configuration infrastructure. We gate off calls that don't make sense for secondary tenants and clean up hooks in SQL that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing us a big step closer towards being able to remove the placeholder and close #48123.

This work also reveals that in order to address #48774, we need to be able to determine split points from the SystemConfig. This makes it very difficult to split on secondary tenant object (e.g. table) boundaries. However, it makes it straightforward to split on secondary tenant keysapce boundaries. This is already what we were thinking (see #47907), so out both convenience and desire, I expect that we'll follow this up with a PR that splits Ranges only at secondary tenant boundaries - placing the overhead of an otherwise empty tenant at only a single Range and a few KBs of data.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 6953182 Jun 4, 2020
@tbg tbg reopened this Jun 5, 2020
craig bot pushed a commit that referenced this issue Jun 11, 2020
50041: logictest: remove 3node-tenant blocklist from some tests r=tbg a=asubiotto

The big change here is returning nil in `crdb_internal.zones.populate`. It makes sense to me to do so in the spirit of making zone configs "optional" for phase 2 (#49445) although I'm not 100% sure this won't break anything.

Co-authored-by: Alfonso Subiotto Marques <[email protected]>
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 29, 2020
… server

Fixes cockroachdb#49445.

This commit introduces a new SystemConfigProvider abstraction, which is
capable of providing the SystemConfig, as well as notifying clients of
updates to the SystemConfig. Gossip already implements this interface.
The commit then updates SQL to use this new dependency in place of
Gossip whenever it needs to access the SystemConfig.

After making this change, it then updates the kvtenant.Proxy to
implement the new SystemConfigProvider interface. This is powered by the
GossipSubscription RPC that was added in cockroachdb#50520. The commit updates the
subscription to also match on the "system-db" gossip key, and just like
that, it can provide the SystemConfig to SQL [*].

Finally, with the kvtenant.Proxy serving the role of a
SystemConfigProvider to SQL when applicable, we're able to remove gossip
entirely from StartTenant. SQL-only servers will no longer join the
gossip network, which is a nice milestone for all of this work.

[*] there are a few remaining questions about how exactly we want to
enforce an access control policy on the system config gossip pattern.
See the updated comment in `Node.GossipSubscription`. For now, we're
just returning the entire SystemConfig to the subscription.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jul 29, 2020
… server

Fixes cockroachdb#49445.

This commit introduces a new SystemConfigProvider abstraction, which is
capable of providing the SystemConfig, as well as notifying clients of
updates to the SystemConfig. Gossip already implements this interface.
The commit then updates SQL to use this new dependency in place of
Gossip whenever it needs to access the SystemConfig.

After making this change, it then updates the kvtenant.Proxy to
implement the new SystemConfigProvider interface. This is powered by the
GossipSubscription RPC that was added in cockroachdb#50520. The commit updates the
subscription to also match on the "system-db" gossip key, and just like
that, it can provide the SystemConfig to SQL [*].

Finally, with the kvtenant.Proxy serving the role of a
SystemConfigProvider to SQL when applicable, we're able to remove gossip
entirely from StartTenant. SQL-only servers will no longer join the
gossip network, which is a nice milestone for all of this work.

[*] there are a few remaining questions about how exactly we want to
enforce an access control policy on the system config gossip pattern.
See the updated comment in `Node.GossipSubscription`. For now, we're
just returning the entire SystemConfig to the subscription.
tbg pushed a commit to nvanbenschoten/cockroach that referenced this issue Jul 31, 2020
… server

Fixes cockroachdb#49445.

This commit introduces a new SystemConfigProvider abstraction, which is
capable of providing the SystemConfig, as well as notifying clients of
updates to the SystemConfig. Gossip already implements this interface.
The commit then updates SQL to use this new dependency in place of
Gossip whenever it needs to access the SystemConfig.

After making this change, it then updates the kvtenant.Proxy to
implement the new SystemConfigProvider interface. This is powered by the
GossipSubscription RPC that was added in cockroachdb#50520. The commit updates the
subscription to also match on the "system-db" gossip key, and just like
that, it can provide the SystemConfig to SQL [*].

Finally, with the kvtenant.Proxy serving the role of a
SystemConfigProvider to SQL when applicable, we're able to remove gossip
entirely from StartTenant. SQL-only servers will no longer join the
gossip network, which is a nice milestone for all of this work.

[*] there are a few remaining questions about how exactly we want to
enforce an access control policy on the system config gossip pattern.
See the updated comment in `Node.GossipSubscription`. For now, we're
just returning the entire SystemConfig to the subscription.
craig bot pushed a commit that referenced this issue Aug 3, 2020
52034: kvtenant: implement SystemConfigProvider, remove Gossip from SQL-only server r=nvanbenschoten a=nvanbenschoten

Fixes #49445.

This commit introduces a new SystemConfigProvider abstraction, which is capable of providing the SystemConfig, as well as notifying clients of updates to the SystemConfig. Gossip already implements this interface. The commit then updates SQL to use this new dependency in place of Gossip whenever it needs to access the SystemConfig.

After making this change, it then updates the kvtenant.Proxy to implement the new SystemConfigProvider interface. This is powered by the GossipSubscription RPC that was added in #50520. The commit updates the subscription to also match on the "system-db" gossip key, and just like that, it can provide the SystemConfig to SQL [*].

Finally, with the kvtenant.Proxy serving the role of a SystemConfigProvider to SQL when applicable, we're able to remove gossip entirely from StartTenant. SQL-only servers will no longer join the gossip network, which is a nice milestone for all of this work.

[*] there are a few remaining questions about how exactly we want to enforce an access control policy on the system config gossip pattern. See the updated comment in `Node.GossipSubscription`. For now, we're just returning the entire SystemConfig to the subscription.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@craig craig bot closed this as completed in 8759499 Aug 3, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
2 participants