Can we declare the version at which a transaction should read? #2160

eddyashton · 2021-02-03T11:31:45Z

eddyashton
Feb 3, 2021
Maintainer

In #1586 I mentioned we would like to deprecate get_globally_committed. This is still the case - I do not believe the semantics of it are well-defined or helpful in our transaction model, and we should remove it.

A core problem with get_globally_committed is that "what is the globally committed version" is a node-local property, and may lag arbitrarily. If I ask 2 different nodes to read K, and they both execute at version V, I will get the same result. But if I ask them to read the last thing they think is globally committed, even if they are both currently at version V, they may have different knowledge of global commit progress so will return different values. Further, they don't record what their global commit opinion was so this response doesn't correspond to anything in the ledger, and won't match when re-executed across nodes in Byzantine consensus. This is even worse if those reads are allowed to affect subsequent writes in the same transaction - we have essentially introduced non-determinism to the execution. This is almost certainly not what an app developer intends.
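
To make the divergence concrete, here is a toy Python sketch. It is not CCF code: `Node`, `read_at`, `read_globally_committed` and `known_commit` are names invented for this illustration. Two nodes hold the same history, so reads at an explicit version agree, but reads at "whatever I think is committed" do not.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """Toy node: a write history for key K plus a node-local commit opinion."""
    history: dict      # seqno -> value written to K at that seqno
    known_commit: int  # highest seqno this node believes is globally committed

    def read_at(self, seqno):
        # Value of K as of `seqno`: the latest write at or before it.
        candidates = [s for s in self.history if s <= seqno]
        return self.history[max(candidates)] if candidates else None

    def read_globally_committed(self):
        # Depends on node-local knowledge that is never recorded in the ledger.
        return self.read_at(self.known_commit)


history = {5: "old", 8: "new"}
node_a = Node(history, known_commit=9)
node_b = Node(history, known_commit=6)  # same state, lagging commit knowledge

same_at_version = node_a.read_at(10) == node_b.read_at(10)
same_at_committed = (
    node_a.read_globally_committed() == node_b.read_globally_committed()
)
```

Here `same_at_version` holds while `same_at_committed` does not, even though both nodes have executed up to the same version.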

However it is still desirable to perform reads which cannot be rolled back, so they can safely be reported. If the endpoint always reads current state but the client wants to avoid reporting anything which could be rolled back, then they need to poll for commit for each read before reporting it - this is a complication in every client SDK and increases practical request latency.

A proposed way to achieve this is to declare the intended read version in read-only requests, presumably in a header value. So the default, with no header value, would be to read the current state. But if the request contains x-ccf-read-version: 2.25, we will try to perform the read at version 2.25. This means the client is still responsible for advancing this read version to the current committed version, but at least the response is purely a function of the request. I don't think we can support a special version x-ccf-read-version: committed, since it runs into the non-determinism described above.
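
A minimal sketch of how such a header might be parsed, assuming the `view.seqno` format used in the example above. The header is only a proposal in this thread, and `parse_read_version` is a hypothetical helper, not an existing CCF function. It also assumes header names have already been lower-cased.

```python
from typing import Optional, Tuple

READ_VERSION_HEADER = "x-ccf-read-version"  # header name proposed in this thread


def parse_read_version(headers: dict) -> Optional[Tuple[int, int]]:
    """Return (view, seqno) if the header is present, else None.

    None means "no header", i.e. the default of reading the current state.
    """
    raw = headers.get(READ_VERSION_HEADER)
    if raw is None:
        return None
    view_str, sep, seqno_str = raw.partition(".")
    if not sep or not view_str.isdigit() or not seqno_str.isdigit():
        # Rejects anything that is not an explicit version, e.g. "committed".
        raise ValueError(f"expected 'view.seqno', got {raw!r}")
    return int(view_str), int(seqno_str)
```

With this shape, `parse_read_version({"x-ccf-read-version": "2.25"})` gives `(2, 25)`, and a value of `committed` is rejected rather than resolved from node-local state.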

Another route is to say that our reported read versions are too fresh, and often pessimistic if you're looking for committed state. Consider a key A that was last written to at 10, and a key B which is written to constantly by other transactions. If we read only A, at version 100, the response will report a read version of 100, because this is the opaque version at which we 'could' query any map. But in practice we only read the state from 10. We could find a way to report this property as well, so that this kind of 'static' read can be safely seen as globally committed.
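
The A/B example can be sketched with a toy store that tracks the version of the last write per key. This is an illustration only: `VersionedStore` and the commit point of 50 are assumptions made for the example, not part of CCF.

```python
class VersionedStore:
    """Toy KV store that remembers the seqno of the last write to each key."""

    def __init__(self):
        self.current_seqno = 0
        self._data = {}  # key -> (value, seqno of last write)

    def put(self, key, value):
        self.current_seqno += 1
        self._data[key] = (value, self.current_seqno)

    def get(self, key):
        """Return (value, read_version, last_write_version)."""
        value, written_at = self._data[key]
        return value, self.current_seqno, written_at


store = VersionedStore()
for i in range(9):
    store.put("B", i)        # seqnos 1..9
store.put("A", "stable")     # seqno 10
for i in range(90):
    store.put("B", i)        # seqnos 11..100

value, read_version, written_at = store.get("A")
global_commit = 50  # assumed commit point for this example

looks_committed_by_read_version = read_version <= global_commit
looks_committed_by_last_write = written_at <= global_commit
```

The read of A reports a read version of 100, which is past the assumed commit point, while the value itself was written at 10 and is already committed.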

achamayou · 2021-02-03T13:31:49Z

achamayou
Feb 3, 2021
Maintainer

> A core problem with get_globally_committed is that "what is the globally committed version" is a node-local property, and may lag arbitrarily. If I ask 2 different nodes to read K, and they both execute at version V, I will get the same result. But if I ask them to read the last thing they think is globally committed, even if they are both currently at version V, they may have different knowledge of global commit progress so will return different values. Further, they don't record what their global commit opinion was so this response doesn't correspond to anything in the ledger, and won't match when re-executed across nodes in Byzantine consensus. This is even worse if those reads are allowed to affect subsequent writes in the same transaction - we have essentially introduced non-determinism to the execution. This is almost certainly not what an app developer intends.

I'm not saying we should do that, but it seems to me that this can entirely be side-stepped by reading the last commit index from the signatures table. That dependency will get recorded and will be deterministic across nodes. It's of course not necessarily as up to date as the in-memory consensus view of what the commit level has reached.

> I don't think we can support a special version x-ccf-read-version: committed, since it runs into the non-determinism described above.

I think that if we use a read on signatures, we can do this. What we're saying is "this is committed and the commit has been recorded in the ledger". A very fresh, up-to-date variant may be "use the in-memory consensus state, but route to the primary to get a deterministic answer".

4 replies

eddyashton Feb 3, 2021
Maintainer Author

If we want to do this as a 'normal' read, this would require retaining an older store entry (the version of global commit at the previous signature) after global commit and compaction. This isn't a blocking limitation, but would need some redesign. The range that we currently store (from the last global commit) is mostly hidden from the app currently - if we allow reads at historic versions like this, perhaps this becomes exposed. Or perhaps we can transparently hand off to the historical query system at this point, and this isn't a concern.

achamayou Feb 3, 2021
Maintainer

Yes, we'd either need to use a historical query, or delay compaction to use this new watermark, i.e. only compact up to Y once a signature indicating that Y is globally committed has itself been locally committed.

I believe it is right to think of these requests as historical queries, whether we optimise for them or not by keeping more in memory. My point was that we can do something ledger-based and therefore deterministic with "committed", users don't necessarily need to pass a specific "view.seqno", although they should be able to if they so choose.

eddyashton Feb 4, 2021
Maintainer Author

Another wrinkle with "read 'committed' as a historical query". Our current historical queries are stateless enough that they can be repeated verbatim. You call read_at(20), it begins fetching 20 and says try again later, you pause and then call read_at(20) again. If you call read_at(committed), and it begins fetching 20, you can't simply call read_at(committed) again - it either needs to tell you 20 or some other handle for your request, as committed changes regularly.

And if the goal is to be deterministic from the current v (rather than the caller simply knowing a committed v and specifying that explicitly), I think we would need to repeat that signature read in each call?

More thinking out loud - hoping to build some clarity on this as we expand the historical query API.

achamayou Feb 4, 2021
Maintainer

That could be done as a redirect read_at(committed) -> 302 Location: read_at(40), which you then can poll.
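
A rough sketch of that redirect flow, as a toy handler. The `read_at` function below, its arguments, and the use of 202 for "try again later" are all assumptions made for illustration. Only the 302 redirect to a concrete version comes from the suggestion above.

```python
def read_at(target, committed_seqno, fetched):
    """Toy handler returning (status, payload).

    target: an int seqno, or the string "committed".
    committed_seqno: the seqno that "committed" resolves to right now.
    fetched: dict of seqno -> state already loaded by the historical fetch.
    """
    if target == "committed":
        # Pin the moving alias to a concrete seqno; the client polls that.
        return 302, f"read_at({committed_seqno})"
    if target not in fetched:
        # Fetch still in progress: the same request can be repeated verbatim.
        return 202, "retry later"
    return 200, fetched[target]


# The alias resolves once, then the client repeats a stable request.
first = read_at("committed", 40, {})
second = read_at(40, 55, {})            # commit has moved on, request has not
third = read_at(40, 55, {40: "value"})  # fetch finished
```

Because the client only ever repeats `read_at(40)`, it does not matter that "committed" has advanced to 55 between polls.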

eddyashton · 2021-02-04T09:21:14Z

eddyashton
Feb 4, 2021
Maintainer Author

We now actually have another solution for precisely indicating a read dependency, thanks to get_version_of_previous_write. So the framework can continue with its current headers, giving a monotonic transaction order in responses. Then if an app wants to read committed state, it can get the version of its reads (per-key, so more precise than the opaque read version that the framework reports), and return this to the client. This will still require "poll until it's committed" behaviour on the client*, but at least avoids the pessimistic delay on stable reads described in the last paragraph of my original post.

* Well, the handler could also tell you whether this read is already committed, via get_status_for_txid, to avoid an extra roundtrip on some requests. But it can't block until commit, so you still need polling logic for the general case.
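
The client-side polling could look roughly like the sketch below. It is an assumption-laden illustration: `wait_for_commit` and `make_fake_status` are hypothetical helpers, the status strings are placeholders, and `get_status` stands in for whatever call the client uses to ask for the status of a transaction ID.

```python
import time


def wait_for_commit(get_status, txid, attempts=10, delay_s=0.0):
    """Poll get_status(txid) until it reports a terminal state.

    get_status is supplied by the caller and is assumed to return one of
    "Pending", "Committed" or "Invalid".
    """
    for _ in range(attempts):
        status = get_status(txid)
        if status in ("Committed", "Invalid"):
            return status
        time.sleep(delay_s)
    return "Pending"  # gave up; the caller must not report the value yet


def make_fake_status(pending_polls):
    """Test double: reports "Pending" for the first `pending_polls` calls."""
    calls = {"n": 0}

    def get_status(txid):
        calls["n"] += 1
        return "Committed" if calls["n"] > pending_polls else "Pending"

    return get_status


# txid here is the version of the previous write to the key that was read.
result = wait_for_commit(make_fake_status(2), "2.10")
gave_up = wait_for_commit(make_fake_status(5), "2.10", attempts=2)
```

If the handler already reports the read as committed in its response, the client can skip this loop entirely for that request.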

4 replies

achamayou Feb 4, 2021
Maintainer

I don't think this is practical because some of the reads are invisible to app code, like endpoint lookup, auth-related lookups etc.

eddyashton Feb 4, 2021
Maintainer Author

True, but I think there's still a use case where this is fine. If your concern is ensuring that you only report committed state, this is sufficient - you care about the version of the value you're reporting, and nothing else. You're not reporting who you read it as, or the state of the whole service when you read it, so it's acceptable that these may be evaluated at a more recent version.

achamayou Feb 4, 2021
Maintainer

Is that right? You can report committed state on the basis of uncommitted user permission (or endpoint!) changes. If they are rolled back, the response was entirely ephemeral and can literally never be reproduced.

It is not obvious to me that there is such a thing as a value regardless of who you read it as, because authorisation is an application concern.

eddyashton Feb 4, 2021
Maintainer Author

The model of executing auth on the current state is the only one we currently provide. To evaluate auth historically (or dispatch, or more interesting general KV queries), you need the full state at a historic version, and we currently only reconstruct the writes at that version. This is sufficient for the current use cases, and I think a model that we want to support long-term. We also intend to support the opposite, where you get a full historic state and can evaluate your auth against it, but we don't support this yet and won't dictate this for every app/endpoint (this approach means new users can't read old state - that's up to the app and not us).

I think this non-reproducible response is fine for some apps, so it's worth supporting/documenting as a way to fill their needs. Another possible use case is a strong requirement that your precise response is both committed and reproducible. That's harder, but I think we can get there - you need to specify your target version, and it needs to execute entirely historically.

And I don't follow the second point. While an app may restrict (or modify!) a value's visibility per-caller (in which case you'd expect an identity claim alongside any report of the read value), it doesn't have to. The app can also say a value is public to anyone, or at least the same to "all permitted users", and this can fit into a broader system which just wants to know that the service had committed this value at some point.

prakashngit · 2021-02-08T19:07:12Z

prakashngit
Feb 8, 2021

@eddyashton Thank you for reaching out regarding the topic. (sorry for the delay in responding)

I see your comment "if I ask them to read the last thing they think is globally committed, even if they are both currently at version V, they may have different knowledge of global commit progress so will return different values.". I tend to agree with this. What you are saying is that get_globally_committed is not really a "global property"; it is still node-local.

I am going to use the following example to ensure that I understand your statement above in quotes:

Say writes W1 and W2, and reads R1 and R2, from four clients are concurrent. Consider the case when the writes succeed. Like you say, it can indeed happen that R1 contacts node 1 (and asks for get_globally_committed) and node 1 returns W1, while R2 contacts node 2 and node 2 returns W2. Eventually, there will be a point in the execution when W1 and W2 are both globally committed, and every node will have the same order (say W1->W2, so in this example node 1 had not yet realized that W2 was also globally committed, which I guess is what you are also saying). Beyond this point, all reads to any node will return W2 and not W1 anymore. This seems to be perfectly acceptable behavior for the get_globally_committed API (and I hope this is not a reason to deprecate get_globally_committed).

I was using get_globally_committed with the sole intention of ensuring that it returns values that cannot be rolled back. For instance, in the above example, it is a certainty that either W1->W2 or W2->W1 will eventually appear on the global chain. My first requirement in #1586 was that I wanted to avoid a scenario whereby R1 reads W1, but the global commit of W1 fails. I guess @achamayou had said that the "intention is to provide an alternate mechanism where ReadOnly endpoints can be called with a special parameter/header indicated they should be run as of the last globally committed version known by the target node", which I think will solve the scenario (read-read), even without the get_globally_committed RPC. (Is this part of the plan, if get_globally_committed is deprecated?)

My outstanding question here and in #1586 is the read-write case. Going back to the above example, suppose both W1 and W2 are executed on the Raft primary by two clients. Suppose W2 uses the value from W1, say W2 = W1 + 1. In this case, can it ever happen that the local commits of W1 and W2 both succeed, with W2 using W1 as noted above, but the global commit of W1 fails while the global commit of W2 succeeds? I ask because session consistency does not apply here: the RPCs W1 and W2 come from two different clients.
I see in #1586 that "If A occurs after B, and thus may have read B's writes, it can only commit if B also commits. The local, non-globally committed state forms a chain, and either it is all rolled back on election or it commits. So this kind of ill-defined write is not possible, and does not need to consider global commit.", which makes me tend to believe that the above situation cannot happen.
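
The chain argument quoted from #1586 can be pictured with a toy ledger in which commit always covers a prefix and rollback always drops the uncommitted suffix. This is only a sketch of that statement under simplified assumptions, and the `Ledger` class and its methods are invented for the example. It is not a model of CCF's consensus implementation.

```python
class Ledger:
    """Toy ledger: ordered entries plus a global commit index."""

    def __init__(self):
        self.entries = []
        self.commit_index = 0  # entries[:commit_index] cannot be rolled back

    def append(self, entry):
        self.entries.append(entry)
        return len(self.entries)  # 1-based seqno of the new entry

    def commit(self, seqno):
        # Commit only moves forward and always covers a whole prefix.
        self.commit_index = max(
            self.commit_index, min(seqno, len(self.entries))
        )

    def rollback(self):
        # An election discards the entire uncommitted suffix.
        self.entries = self.entries[: self.commit_index]

    def is_committed(self, seqno):
        return seqno <= self.commit_index


# Case 1: W2 commits, so W1 (earlier in the chain) is committed too.
ledger = Ledger()
w1 = ledger.append("W1")
w2 = ledger.append("W2 = W1 + 1")
ledger.commit(w2)
both_committed = ledger.is_committed(w1) and ledger.is_committed(w2)

# Case 2: rollback before any commit removes W1 and W2 together.
other = Ledger()
other.append("W1")
other.append("W2 = W1 + 1")
other.rollback()
remaining = list(other.entries)
```

In this model there is no operation that commits seqno 2 without also committing seqno 1, which matches the quoted claim that the uncommitted state either all commits or is all rolled back.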

If the above situation cannot happen, then I guess CCF out of the box offers stronger consistency than session consistency (something like per-node consistency?), which might be worth highlighting for app developers.

So, if CCF does indeed offer per-node consistency as noted above, the only thing that I do not fully understand is the scenario where there is a primary change between writes W1 and W2. Is the argument here that if W2 is executed on a different primary than W1, and if W2 = W1 + 1, then W1 must have been globally committed before the change of primary?

