Can we declare the version at which a transaction should read? #2160
-
I'm not saying we should do that, but it seems to me that this can entirely be side-stepped by reading the last commit index from the signatures table. That dependency will get recorded and will be deterministic across nodes. It's of course not necessarily as up to date as the in-memory consensus view of what the commit level has reached.
I think that if we use a read on signatures, we can do this. What we're saying is "this is committed and the commit has been recorded in the ledger". A very fresh, up-to-date variant may be "use the in-memory consensus state, but route to the primary to get a deterministic answer".
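To make the difference concrete, here is a toy Python sketch, not CCF code. `ToyNode`, its fields, and the example numbers are all hypothetical; the only idea taken from the comment above is that a commit index read from replicated state (the signatures table) is a function of the execution version, while the in-memory consensus view is node-local.

```python
from dataclasses import dataclass


@dataclass
class ToyNode:
    # Replicated state (hypothetical layout): version -> commit index recorded
    # by the signature entry written at that version. Identical on every node.
    signatures: dict
    # Unreplicated, node-local view of how far commit has progressed.
    in_memory_commit: int

    def commit_from_signatures(self, at_version: int) -> int:
        """Commit index visible to a transaction executing at at_version."""
        visible = [v for v in self.signatures if v <= at_version]
        return self.signatures[max(visible)] if visible else 0

    def commit_from_memory(self) -> int:
        """Fresher, but depends on which node answers."""
        return self.in_memory_commit


# Both nodes hold the same ledger but have heard different commit progress.
ledger = {20: 10, 40: 30, 60: 50}
node_a = ToyNode(signatures=dict(ledger), in_memory_commit=58)
node_b = ToyNode(signatures=dict(ledger), in_memory_commit=44)

# Executing at version 45, both nodes derive the same commit index from the
# signatures they can see, even though their in-memory views disagree.
same_on_both = node_a.commit_from_signatures(45) == node_b.commit_from_signatures(45)
```

In this model the signatures-based answer lags the in-memory one, which matches the trade-off described above: deterministic and recorded, but not as up to date.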
-
We now actually have another solution for precisely indicating a read dependency, thanks to * well the handler could also tell you if this read is already committed, via |
-
@eddyashton Thank you for reaching out regarding the topic (sorry for the delay in responding). I see your comment: "if I ask them to read the last thing they think is globally committed, even if they are both currently at version V, they may have different knowledge of global commit progress so will return different values." I tend to agree with this. What you are saying is that get_globally_committed is not really a "global property"; it is still node-local.

I am going to use the following example to make sure I understand the quoted statement. Say writes W1 and W2 and reads R1 and R2, from four clients, are concurrent. Consider the case where the writes succeed. As you say, it can indeed happen that R1 contacts node 1 (and asks for get_globally_committed) and node 1 returns W1, while R2 contacts node 2 and node 2 returns W2. Eventually there will be a point in the execution when W1 and W2 are both globally committed, and every node will have the same order (say W1->W2, so in this example node 1 had not yet realized that W2 was also globally committed, which I guess is what you are also saying). Beyond this point, all reads to any node will return W2 and not W1 anymore. This seems to be perfectly acceptable behavior for the get_globally_committed API (and I hope it is not a reason to deprecate get_globally_committed).

I was using get_globally_committed with the sole intention of ensuring that it returns values that cannot be rolled back. For instance, in the above example, it is a certainty that either W1->W2 or W2->W1 will eventually appear on the global chain. My first requirement in #1586 was to avoid a scenario whereby R1 reads W1, but the global commit of W1 fails.
I guess @achamayou had said that the "intention is to provide an alternate mechanism where ReadOnly endpoints can be called with a special parameter/header indicated they should be run as of the last globally committed version known by the target node", which I think will solve the read-read scenario, even without the get_globally_committed RPC. (Is this part of the plan, if get_globally_committed is deprecated?)

My outstanding question here and in #1586 is the read-write case. Going back to the above example, suppose both W1 and W2 are executed on the Raft primary by two clients, and suppose W2 uses the value from W1, say W2 = W1 + 1. Can it ever happen that the local commits of W1 and W2 both succeed, with W2 using W1 as noted above, but the global commit of W1 fails while the global commit of W2 succeeds? I ask because session consistency does not apply here: the RPCs W1 and W2 come from two different clients.

If this situation cannot happen, then I guess CCF out of the box offers stronger consistency than session consistency (something like per-node consistency?), which might be worth highlighting for app developers. If CCF does offer per-node consistency in this sense, the only thing I do not fully understand is the scenario where there is a primary change between writes W1 and W2. Is the argument that if W2 is executed on a different primary than W1, and W2 = W1 + 1, then W1 must have been globally committed before the change of primary?
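For the read-write question, a toy model may help frame it. The sketch below assumes commit behaves as in a standard Raft log, where the commit point is a single index and everything at or below it is committed; whether CCF's consensus matches this in every detail is an assumption here, not something this thread confirms. `ToyLog` and its methods are hypothetical names.

```python
class ToyLog:
    """Single replicated log where commit is a prefix property."""

    def __init__(self):
        self.entries = []
        self.commit_index = 0  # length of the committed prefix

    def append(self, entry):
        """Locally commit an entry and return its 1-based index."""
        self.entries.append(entry)
        return len(self.entries)

    def advance_commit(self, index):
        """Globally commit everything up to and including index."""
        if index > len(self.entries):
            raise ValueError("cannot commit past the end of the log")
        self.commit_index = max(self.commit_index, index)

    def is_committed(self, index):
        return index <= self.commit_index

    def roll_back_uncommitted(self):
        """Drop the uncommitted suffix, as may happen after a primary change."""
        dropped = self.entries[self.commit_index:]
        del self.entries[self.commit_index:]
        return dropped


log = ToyLog()
w1 = log.append("W1")
w2 = log.append("W2 = W1 + 1")  # executed after W1, so it sits later in the log

# Committing W2 moves the commit point past W1 as well.
log.advance_commit(w2)
w1_committed_with_w2 = log.is_committed(w1)
```

Under this assumption, a rollback can drop W2 alone or drop both W1 and W2, but it cannot drop W1 while keeping W2, because W2 only ever appears after W1 in the same log.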
-
In #1586 I mentioned we would like to deprecate `get_globally_committed`. This is still the case - I do not believe the semantics of it are well-defined or helpful in our transaction model, and we should remove it.

A core problem with `get_globally_committed` is that "what is the globally committed version" is a node-local property, and may lag arbitrarily. If I ask 2 different nodes to read `K`, and they both execute at version `V`, I will get the same result. But if I ask them to read the last thing they think is globally committed, even if they are both currently at version `V`, they may have different knowledge of global commit progress so will return different values. Further, they don't record what their global commit opinion was so this response doesn't correspond to anything in the ledger, and won't match when re-executed across nodes in Byzantine consensus. This is even worse if those reads are allowed to affect subsequent writes in the same transaction - we have essentially introduced non-determinism to the execution. This is almost certainly not what an app developer intends.

However, it is still desirable to perform reads which cannot be rolled back, so they can safely be reported. If the endpoint always reads current state but the client wants to avoid reporting anything which could be rolled back, then they need to poll for commit for each read before reporting it - this is a complication in every client SDK and increases practical request latency.

A proposed way to achieve this is to declare the intended read version in read-only requests, presumably in a header value. So the default, with no header value, would be to read the current state. But if the request contains `x-ccf-read-version: 2.25`, we will try to perform the read at version `2.25`. This means the client is still responsible for advancing this read version to the current committed version, but at least the response is purely a function of the request. I don't think we can support a special version `x-ccf-read-version: committed`, since it runs into the non-determinism described above.

Another route is to say that our reported read versions are too fresh, and often pessimistic if you're looking for committed state. Consider a key `A` that was last written to at `10`, and a key `B` which is written to constantly by other transactions. If we read only `A`, at version `100`, the response will report a read version of `100`, because this is the opaque version at which we 'could' query any map. But in practice we only read the state from `10`. We could find a way to report this property as well, so that this kind of 'static' read can be safely seen as globally committed.
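As a rough illustration of the last two ideas, here is a toy Python model, not CCF code. `ToyVersionedStore`, `read_only_request`, and the `effective_read_version` field are hypothetical names, and versions are plain integers rather than the `2.25` style used above. It shows a read executed at a declared version, plus a response that reports the latest write actually observed in addition to the version the read ran at.

```python
import bisect


class ToyVersionedStore:
    """Keeps every write so reads can be served as of an older version."""

    def __init__(self):
        self._history = {}  # key -> list of (version, value), ascending

    def write(self, key, version, value):
        self._history.setdefault(key, []).append((version, value))

    def read_at(self, key, version):
        """Return (value, last_write_version) for key as of version."""
        entries = self._history.get(key, [])
        versions = [v for v, _ in entries]
        i = bisect.bisect_right(versions, version)
        if i == 0:
            return None, 0  # key did not exist yet at this version
        last_write_version, value = entries[i - 1]
        return value, last_write_version


def read_only_request(store, keys, read_version):
    """Serve a read at the declared version; the result depends only on the inputs."""
    values = {}
    last_writes = []
    for key in keys:
        value, last_write = store.read_at(key, read_version)
        values[key] = value
        last_writes.append(last_write)
    return {
        "values": values,
        "read_version": read_version,
        # Hypothetical extra field: newest write this request actually observed.
        "effective_read_version": max(last_writes, default=0),
    }


store = ToyVersionedStore()
store.write("A", 10, "a-value")
for version in range(11, 101):
    store.write("B", version, version)  # B is rewritten constantly

response = read_only_request(store, ["A"], read_version=100)
```

Here `response["read_version"]` is `100` while `response["effective_read_version"]` is `10`, so a client that already knows version `10` is globally committed could report the value of `A` without polling for commit of `100`.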