-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
kv: observed timestamps and multi-tenancy #48008
Comments
I'm starting to have second thoughts about marking this as a blocker. In fact, I'm starting to reconsider doing anything here at all. Disabling uncertainty intervals because we know that there will only be one gateway per tenant seems like a performance optimization, not a fix for a regression. In a large enough cluster today, most KV RPCs will be remote, so the utility of the observed timestamp optimization in question already quickly falls off. So with regard to observed timestamps, a free-tier cluster is really no worse off than any other large CockroachDB cluster. Maybe that's all understood and we really do want to capitalize on the opportunity to avoid uncertainty restarts, but then we should consider this an optimization, not a blocker. Meanwhile, if we do go forward with this change then we'll have to then back it out when we start scaling free-tier clusters. At that point, we actually will be causing a regression. So while I don't think this is a blocker for free-tier, it will be a blocker towards multi-gateway shared clusters if we make a change like this now. So do we really want to do this? |
These are reasonable points, I'd be happy holding off here. |
@nvanbenschoten, I know you've been recently thinking more about observed timestamps. What impact will your conclusions on the subject have on multi-tenancy? |
I don't think anything changes here as it doesn't look like we're going to be able to remove observed timestamps. I still don't think we should do anything special for single-pod tenants for the reasons discussed above and because we have no hard guarantees around mutual exclusion. |
Are you saying that the existing behavior is good enough, even for multi-SQL-pod clusters? Are there any code changes we'll need to do? |
I'm saying that the existing behavior is the best we'll be able to do for multi-SQL-pod clusters, so we shouldn't do something special for single-SQL-pod clusters. The premise of this issue (as I understand it) is that we could add in an optimization if we know that a tenant only has a single SQL pod, but to this point, I've been skeptical that doing so is a good idea. |
I also no longer think this is a good idea, especially since this isn't particularly true once there is vertical scaling. @andy-kimball observed timestamps will "just work" for tenants. They will be at a slight disadvantage compared to a non-tenant as a non-tenant can mark its own node as certain automatically. To recoup that advantage, the tenant would have to use a local node as a timestamp oracle (and could then mark that node certain). It won't be worth doing so. |
Closing for now, as there doesn't seem to be anything pressing here. If we want to revisit this, we can do so with the understanding that it is a potential new optimization for single SQL gateway deployments. |
Background
Observed timestamps have an important optimization: the gateway node starting a txn marks itself as certain using a timestamp off its local clock (which is shared with the local KV server).
In a tenant process, this isn’t possible (there’s no local KV server and hence no NodeID nor a shared clock).
In phase 2, there won't be more than one gateway running at any given time (per tenant), so all of the tenant's transactions will pass through that one gateway's clock anyway, meaning there shouldn't be too many new uncertainty restarts (in fact there shouldn't be any uncertainty, assuming we wait out clock uncertainty at process start).
The blocker for phase 2 is just to make sure we set the uncertainty interval that is applied to new transactions to zero.
Once we are running more than one SQL gateway per tenant the loss of observed timestamps may hurt and a solution will be needed, see #36431. This is not a phase 2 blocker since it doesn't apply to phase 2.
cc @nvanbenschoten
The text was updated successfully, but these errors were encountered: