Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: consider lower number of splits inside of a tenant's prefix #47907

Closed
tbg opened this issue Apr 22, 2020 · 2 comments
Closed

kv: consider lower number of splits inside of a tenant's prefix #47907

tbg opened this issue Apr 22, 2020 · 2 comments
Assignees
Labels
A-multitenancy Related to multi-tenancy

Comments

@tbg
Copy link
Member

tbg commented Apr 22, 2020

See this doc.

At the time of writing, a tenant is initialized with dozens of splits which are mostly useless since the adjacent ranges are mostly inactive/empty and share a zone config.

If the number of replicas per node ever becomes an issue (i.e. if thousands of tenants share a small cluster) we should consider omitting these unused splits. Instead, a split should only happen by load, size, or zone config (the latter of which the tenant is unlikely to ever be handed free reigns over).

@tbg tbg added A-multitenancy-blocker A-multitenancy Related to multi-tenancy labels Apr 22, 2020
@nvanbenschoten nvanbenschoten self-assigned this May 14, 2020
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 2, 2020
Fixes cockroachdb#49318.
Fixes cockroachdb#49445.
Progress towards cockroachdb#48123.
Informs cockroachdb#48774.

This commit introduces a new pseudo object ID in the system tenant's
namespace called "tenants". Like "liveness" and "timeseries" before it,
the pseudo object allows zone configurations to be applied to
pseudo-objects that do not live directly in the system tenant's SQL
keyspace. In this case, the "tenants" zone allows zone configurations to
be set by the system tenant and applied to all other tenants in the
system. There may come a time when we want secondary tenants to have
more control over their zone configurations, but that will require a
much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration
handling and how it relates to multi-tenancy. Now that secondary tenant
ranges have a zone config to call their own, we can make sense of calls
from SQL into the zone configuration infrastructure. We gate off calls
that don't make sense for secondary tenants and clean up hooks in SQL
that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses
of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing
us a big step closer towards being able to remove the placeholder and
close cockroachdb#48123.

This work also reveals that in order to address cockroachdb#48774, we need to be
able to determine split points from the SystemConfig. This makes it very
difficult to split on secondary tenant object (e.g. table) boundaries.
However, it makes it straightforward to split on secondary tenant
keysapce boundaries. This is already what we were thinking (see cockroachdb#47907),
so out both convenience and desire, I expect that we'll follow this up
with a PR that splits Ranges only at secondary tenant boundaries -
placing the overhead of an otherwise empty tenant at only a single Range
and a few KBs of data.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 3, 2020
Fixes cockroachdb#49318.
Fixes cockroachdb#49445.
Progress towards cockroachdb#48123.
Informs cockroachdb#48774.

This commit introduces a new pseudo object ID in the system tenant's
namespace called "tenants". Like "liveness" and "timeseries" before it,
the pseudo object allows zone configurations to be applied to
pseudo-objects that do not live directly in the system tenant's SQL
keyspace. In this case, the "tenants" zone allows zone configurations to
be set by the system tenant and applied to all other tenants in the
system. There may come a time when we want secondary tenants to have
more control over their zone configurations, but that will require a
much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration
handling and how it relates to multi-tenancy. Now that secondary tenant
ranges have a zone config to call their own, we can make sense of calls
from KV into the zone configuration infrastructure. We gate off calls
that don't make sense for secondary tenants and clean up hooks in SQL
that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses
of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing
us a big step closer towards being able to remove the placeholder and
close cockroachdb#48123.

This work also reveals that in order to address cockroachdb#48774, we need to be
able to determine split points from the SystemConfig. This makes it very
difficult to split on secondary tenant object (e.g. table) boundaries.
However, it makes it straightforward to split on secondary tenant
keysapce boundaries. This is already what we were thinking (see cockroachdb#47907),
so out both convenience and desire, I expect that we'll follow this up
with a PR that splits Ranges only at secondary tenant boundaries -
placing the overhead of an otherwise empty tenant at only a single Range
and a few KBs of data.
nvanbenschoten added a commit to nvanbenschoten/cockroach that referenced this issue Jun 4, 2020
Fixes cockroachdb#49318.
Fixes cockroachdb#49445.
Progress towards cockroachdb#48123.
Informs cockroachdb#48774.

This commit introduces a new pseudo object ID in the system tenant's
namespace called "tenants". Like "liveness" and "timeseries" before it,
the pseudo object allows zone configurations to be applied to
pseudo-objects that do not live directly in the system tenant's SQL
keyspace. In this case, the "tenants" zone allows zone configurations to
be set by the system tenant and applied to all other tenants in the
system. There may come a time when we want secondary tenants to have
more control over their zone configurations, but that will require a
much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration
handling and how it relates to multi-tenancy. Now that secondary tenant
ranges have a zone config to call their own, we can make sense of calls
from KV into the zone configuration infrastructure. We gate off calls
that don't make sense for secondary tenants and clean up hooks in SQL
that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses
of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing
us a big step closer towards being able to remove the placeholder and
close cockroachdb#48123.

This work also reveals that in order to address cockroachdb#48774, we need to be
able to determine split points from the SystemConfig. This makes it very
difficult to split on secondary tenant object (e.g. table) boundaries.
However, it makes it straightforward to split on secondary tenant
keysapce boundaries. This is already what we were thinking (see cockroachdb#47907),
so out both convenience and desire, I expect that we'll follow this up
with a PR that splits Ranges only at secondary tenant boundaries -
placing the overhead of an otherwise empty tenant at only a single Range
and a few KBs of data.
craig bot pushed a commit that referenced this issue Jun 4, 2020
49784: config: introduce pseudo "tenants" zone r=nvanbenschoten a=nvanbenschoten

Fixes #49318.
Fixes #49445.
Progress towards #48123.
Informs #48774.

This commit introduces a new pseudo object ID in the system tenant's namespace called "tenants". Like "liveness" and "timeseries" before it, the pseudo object allows zone configurations to be applied to pseudo-objects that do not live directly in the system tenant's SQL keyspace. In this case, the "tenants" zone allows zone configurations to be set by the system tenant and applied to all other tenants in the system. There may come a time when we want secondary tenants to have more control over their zone configurations, but that will require a much larger change to the zone config structure and UX as a whole.

While making this change, we rationalize the rest of zone configuration handling and how it relates to multi-tenancy. Now that secondary tenant ranges have a zone config to call their own, we can make sense of calls from SQL into the zone configuration infrastructure. We gate off calls that don't make sense for secondary tenants and clean up hooks in SQL that handle zone config manipulation.

All of this works towards a good cause - we eliminate the remaining uses of `keys.TODOSQLCodec` from `pkg/sql/...` and `pkg/config/...`, bringing us a big step closer towards being able to remove the placeholder and close #48123.

This work also reveals that in order to address #48774, we need to be able to determine split points from the SystemConfig. This makes it very difficult to split on secondary tenant object (e.g. table) boundaries. However, it makes it straightforward to split on secondary tenant keysapce boundaries. This is already what we were thinking (see #47907), so out both convenience and desire, I expect that we'll follow this up with a PR that splits Ranges only at secondary tenant boundaries - placing the overhead of an otherwise empty tenant at only a single Range and a few KBs of data.

Co-authored-by: Nathan VanBenschoten <[email protected]>
@tbg
Copy link
Member Author

tbg commented Jul 8, 2020

@nvanbenschoten this can be closed, right?

@tbg tbg closed this as completed Jul 8, 2020
@nvanbenschoten
Copy link
Member

Yes, we currently don't create any mandatory splits inside of a tenant prefix.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

No branches or pull requests

2 participants