Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kv: observed timestamps and multi-tenancy #48008

Closed
tbg opened this issue Apr 24, 2020 · 8 comments
Closed

kv: observed timestamps and multi-tenancy #48008

tbg opened this issue Apr 24, 2020 · 8 comments
Assignees
Labels
A-multitenancy Related to multi-tenancy

Comments

@tbg
Copy link
Member

tbg commented Apr 24, 2020

Background

Observed timestamps have an important optimization: the gateway node starting a txn marks itself as certain using a timestamp off its local clock (which is shared with the local KV server).
In a tenant process, this isn’t possible (there’s no local KV server and hence no NodeID nor a shared clock).

In phase 2, there won't be more than one gateway running at any given time (per tenant), so all of the tenant's transactions will pass through that one gateway's clock anyway, meaning there shouldn't be too many new uncertainty restarts (in fact there shouldn't be any uncertainty, assuming we wait out clock uncertainty at process start).

The blocker for phase 2 is just to make sure we set the uncertainty interval that is applied to new transactions to zero.

Once we are running more than one SQL gateway per tenant the loss of observed timestamps may hurt and a solution will be needed, see #36431. This is not a phase 2 blocker since it doesn't apply to phase 2.

cc @nvanbenschoten

@nvanbenschoten
Copy link
Member

I'm starting to have second thoughts about marking this as a blocker. In fact, I'm starting to reconsider doing anything here at all.

Disabling uncertainty intervals because we know that there will only be one gateway per tenant seems like a performance optimization, not a fix for a regression. In a large enough cluster today, most KV RPCs will be remote, so the utility of the observed timestamp optimization in question already quickly falls off. So with regard to observed timestamps, a free-tier cluster is really no worse off than any other large CockroachDB cluster. Maybe that's all understood and we really do want to capitalize on the opportunity to avoid uncertainty restarts, but then we should consider this an optimization, not a blocker.

Meanwhile, if we do go forward with this change then we'll have to then back it out when we start scaling free-tier clusters. At that point, we actually will be causing a regression. So while I don't think this is a blocker for free-tier, it will be a blocker towards multi-gateway shared clusters if we make a change like this now. So do we really want to do this?

@tbg
Copy link
Member Author

tbg commented Aug 18, 2020

These are reasonable points, I'd be happy holding off here.

@andy-kimball
Copy link
Contributor

@nvanbenschoten, I know you've been recently thinking more about observed timestamps. What impact will your conclusions on the subject have on multi-tenancy?

@nvanbenschoten
Copy link
Member

I don't think anything changes here as it doesn't look like we're going to be able to remove observed timestamps. I still don't think we should do anything special for single-pod tenants for the reasons discussed above and because we have no hard guarantees around mutual exclusion.

@andy-kimball
Copy link
Contributor

Are you saying that the existing behavior is good enough, even for multi-SQL-pod clusters? Are there any code changes we'll need to do?

@nvanbenschoten
Copy link
Member

I'm saying that the existing behavior is the best we'll be able to do for multi-SQL-pod clusters, so we shouldn't do something special for single-SQL-pod clusters.

The premise of this issue (as I understand it) is that we could add in an optimization if we know that a tenant only has a single SQL pod, but to this point, I've been skeptical that doing so is a good idea.

@tbg
Copy link
Member Author

tbg commented Nov 12, 2020

The premise of this issue (as I understand it) is that we could add in an optimization if we know that a tenant only has a single SQL pod, but to this point, I've been skeptical that doing so is a good idea.

I also no longer think this is a good idea, especially since this isn't particularly true once there is vertical scaling.

@andy-kimball observed timestamps will "just work" for tenants. They will be at a slight disadvantage compared to a non-tenant as a non-tenant can mark its own node as certain automatically. To recoup that advantage, the tenant would have to use a local node as a timestamp oracle (and could then mark that node certain). It won't be worth doing so.

@nvanbenschoten
Copy link
Member

Closing for now, as there doesn't seem to be anything pressing here. If we want to revisit this, we can do so with the understanding that it is a potential new optimization for single SQL gateway deployments.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

No branches or pull requests

3 participants