Restart the submission interpretation in case of a race [DPP-737] #11552

gerolf-da · 2021-11-04T14:12:38Z

Command interpretation happens in two stages:

1) Daml interpretation
2) Determine a suitable ledger effective time

The race can happen in the following situation:
In 1) a contract key K is resolved to contract C1.
Between 1) and 2), a transaction is stored on the participant that
archives C1, but creates C2 with the same contract key K.
In 2) the ledger api server tries to look up the ledger effective time
for all input contracts to the transaction. If it doesn't find one of
the input contracts at all, it can conclude that the transaction
wouldn't be accepted by the ledger anyway.

The behavior before this patch was to simply abort command
interpretation and return a rather cryptic error to the user.
With this patch, the ledger api server restarts the command
interpretation.
If the "missing contract" was an explicit input to the
command (e.g. as the contract for the exercise or as an argument to an
exercise), then this command will be rejected because the contract is
now archived.
If the contract ID was determined via a contract key lookup, then
restarting the interpretation will result in either a negative lookup or
a different contract ID for the new contract under this contract key.
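
To illustrate the restart behavior described above, here is a minimal sketch. It is not the code from this PR: `ContractId`, `Interpretation`, `Outcome`, `execute`, `lookupMissing`, and `maxRestarts` are hypothetical names, and both stages are modeled as plain synchronous functions rather than the real asynchronous services.

```scala
import scala.annotation.tailrec

object InterpretationRetrySketch {
  // Hypothetical, simplified stand-ins; none of these names come from the PR.
  final case class ContractId(value: String)
  final case class Interpretation(inputContracts: Set[ContractId])

  sealed trait Outcome
  final case class Accepted(interpretation: Interpretation, attempts: Int) extends Outcome
  final case class Rejected(reason: String, attempts: Int) extends Outcome

  /** Runs stage 1 (interpret) and stage 2 (ledger time lookup), restarting
    * stage 1 whenever stage 2 reports that input contracts are missing.
    *
    * @param interpret     stage 1; Left is a rejection, e.g. an explicit input contract was archived
    * @param lookupMissing stage 2, reduced to "which of these contracts cannot be found?"
    * @param maxRestarts   how many times stage 1 may be restarted (assumed bound)
    */
  def execute(
      interpret: () => Either[String, Interpretation],
      lookupMissing: Set[ContractId] => Set[ContractId],
      maxRestarts: Int
  ): Outcome = {
    @tailrec
    def loop(attempt: Int): Outcome =
      interpret() match {
        // Interpretation itself failed: reject without further restarts.
        case Left(reason) => Rejected(reason, attempt)
        case Right(result) =>
          val missing = lookupMissing(result.inputContracts)
          if (missing.isEmpty) Accepted(result, attempt)
          else if (attempt > maxRestarts)
            Rejected(s"Input contracts still missing: $missing", attempt)
          // A contract disappeared between the two stages: interpret again, so
          // that contract keys are resolved against the current ledger state.
          else loop(attempt + 1)
      }
    loop(1)
  }
}
```

In this sketch, a key lookup that resolved to C1 on the first attempt resolves to C2 (or to nothing) on the restart, while a command that names C1 explicitly fails in `interpret` and is rejected.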

CHANGELOG_BEGIN
[Ledger API] Retry the interpretation of a command in case of a race
with other transactions. This fix drastically reduces the likelihood of the error
"Could not find a suitable ledger time after 0 retries".
CHANGELOG_END

Pull Request Checklist

Read and understand the contribution guidelines
Include appropriate tests
Set a descriptive title and thorough description
Add a reference to the issue this PR will solve, if appropriate
Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
Normal production system change, include purpose of change in description
If you mean to change the status of a component, please make sure you keep the Component Status page up to date.

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

mziolekda · 2021-11-08T10:51:54Z

.../sandbox-classic/src/main/scala/platform/sandbox/stores/ledger/inmemory/InMemoryLedger.scala

+          for {
+            let <- letE
+            acc <- acc
+          } yield acc.map(acc => if (let.isAfter(acc)) let else acc)


Aren't we supposed to collect all the missing ids into the accumulator's left?
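
For illustration, a fold that collects every missing ID on the left could look like the sketch below. It is not the code from this PR: `ContractId` is a hypothetical stand-in for the real type, `activeLets` stands in for `acs.activeContracts` reduced to a map from contract ID to ledger effective time, and returning `Right(None)` for an empty input is a choice made for this sketch only.

```scala
import java.time.Instant

object MaxLedgerTimeSketch {
  // Hypothetical stand-in for the real contract ID type.
  final case class ContractId(value: String)

  /** Left: every contract ID that was not found among the active contracts.
    * Right: the latest ledger effective time of the given contracts,
    * or None if no contract IDs were given.
    */
  def maximumLedgerTime(
      activeLets: Map[ContractId, Instant],
      contractIds: Set[ContractId]
  ): Either[Set[ContractId], Option[Instant]] =
    contractIds.foldLeft[Either[Set[ContractId], Option[Instant]]](Right(None)) { (acc, id) =>
      (acc, activeLets.get(id)) match {
        // Another missing ID: add it to the ones collected so far.
        case (Left(missing), None) => Left(missing + id)
        // Already failed: keep the collected IDs, ignore found contracts.
        case (Left(missing), Some(_)) => Left(missing)
        // First missing ID: switch the accumulator to the failure side.
        case (Right(_), None) => Left(Set(id))
        // Found: keep the later of the two times.
        case (Right(max), Some(let)) =>
          Right(Some(max.fold(let)(m => if (let.isAfter(m)) let else m)))
      }
    }
}
```

Unlike a `for` comprehension over two `Either` values, which returns only the first `Left` it meets, the explicit pattern match merges a newly missing ID into the set already accumulated.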

mziolekda · 2021-11-08T11:16:25Z

...i/src/test/suite/scala/platform/apiserver/execution/LedgerTimeAwareCommandExecutorSpec.scala

+        )
+      }
+
+      "retry if the contract's LET is in the future and then retry if the contract is missing" in {


Nice, this is precisely what I was looking for

tudor-da · 2021-11-08T12:49:08Z

.../sandbox-classic/src/main/scala/platform/sandbox/stores/ledger/inmemory/InMemoryLedger.scala

+    Future.fromTry(Try(this.synchronized {
+      contractIds
+        .foldLeft[Either[Set[ContractId], Option[Instant]]](Right(Some(Instant.MIN)))((acc, id) => {
+          val letE = acs.activeContracts.get(id).map(c => Right(c.let)).getOrElse(Left(Set(id)))


If contractIds is empty, this function returns Right(Some(Instant.MIN)) - shouldn't it be Right(None)?

I see this is how it was before as well. Fine not to change then.

gerolf-da · 2021-11-08T16:03:29Z

Setting it back to draft. We'd rather go with the exception-based implementation in #11579.

gerolf-da force-pushed the gerolf_retry-on-ledger-time-lookup-failure branch 2 times, most recently from 6c81d8b to c73e338 on November 4, 2021 15:46

mziolekda reviewed Nov 4, 2021

...gration-api/src/main/scala/platform/apiserver/execution/LedgerTimeAwareCommandExecutor.scala

mziolekda reviewed Nov 4, 2021

...gration-api/src/main/scala/platform/apiserver/execution/LedgerTimeAwareCommandExecutor.scala (outdated)

mziolekda reviewed Nov 4, 2021

...ration-api/src/main/scala/platform/store/backend/common/ContractStorageBackendTemplate.scala

mziolekda changed the title ~~Restart the submission interpretation in case of a race~~ Restart the submission interpretation in case of a race [DPP-737] Nov 4, 2021

gerolf-da added 5 commits November 8, 2021 10:42

Fix some tests and MutableCacheBackedContractStore

63abee8

I thought we had types that the compiler checks at compile time :( CHANGELOG_BEGIN CHANGELOG_END

Simplify recover block

f5bd7af

Compatibility with scala 2.12

2d3b8ed

Better logging in LedgerTimeAwareCommandExecutor

a9c351b

gerolf-da force-pushed the gerolf_retry-on-ledger-time-lookup-failure branch from 7d140ad to a9c351b on November 8, 2021 09:46

format

2f56b7f

gerolf-da marked this pull request as ready for review November 8, 2021 10:09

gerolf-da requested review from meiersi-da and a team as code owners November 8, 2021 10:09

tudor-da reviewed Nov 8, 2021

...gration-api/src/main/scala/platform/apiserver/execution/LedgerTimeAwareCommandExecutor.scala

mziolekda reviewed Nov 8, 2021

tudor-da mentioned this pull request Nov 8, 2021

Graceful handling of contracts not found during maximumLedgerTime lookup #11568

Closed

7 tasks

tudor-da reviewed Nov 8, 2021

gerolf-da mentioned this pull request Nov 8, 2021

Restart the submission interpretation in case of a race [DPP-737] #11579

Merged

7 tasks

gerolf-da marked this pull request as draft November 8, 2021 16:02

gerolf-da closed this Nov 9, 2021

gerolf-da deleted the gerolf_retry-on-ledger-time-lookup-failure branch November 9, 2021 09:04
