Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restart the submission interpretation in case of a race [DPP-737] #11552

Closed
wants to merge 6 commits into from

Conversation

gerolf-da
Copy link
Contributor

Command interpretation happens in two stages:

  1. Daml interpretation
  2. Determine a suitable ledger effective time

The race can happen in the following situation:
In 1) a contract key K is resolved to contract C1.
Between 1) and 2), a transaction is stored on the participant that
archives C1, but creates C2 with the same contract key K.
In 2) the ledger api server tries to lookup the ledger effective time
for all input contracts to the transaction. If it doesn't find one of
the input contracts at all, it can conclude that the transaction
wouldn't be accepted by the ledger anyway.

The behavior before this patch was to simply abort command
interpretation and return a rather cryptic error to the user.
With this patch, the ledger api server restarts the command
interpretation.
If the "missing contract" was an explicit input to the
command (e.g. as the contract for the exercise or as an argument to an
exercise), then this command will be rejected because the contract is
now archived.
If the contract ID was determined via a contract key lookup, then
restarting the interpretation will result in either a negative lookup or
a different contract ID for the new contract under this contract key.

CHANGELOG_BEGIN
[Ledger API] Retry the interpretation of a command in case of a race
with other transactions. This fix drastically reduces the likelihood of the error
"Could not find a suitable ledger time after 0 retries".
CHANGELOG_END

Pull Request Checklist

  • Read and understand the contribution guidelines
  • Include appropriate tests
  • Set a descriptive title and thorough description
  • Add a reference to the issue this PR will solve, if appropriate
  • Include changelog additions in one or more commit message bodies between the CHANGELOG_BEGIN and CHANGELOG_END tags
  • Normal production system change, include purpose of change in description
  • If you mean to change the status of a component, please make sure you keep the Component Status page up to date.

NOTE: CI is not automatically run on non-members pull-requests for security
reasons. The reviewer will have to comment with /AzurePipelines run to
trigger the build.

@gerolf-da gerolf-da force-pushed the gerolf_retry-on-ledger-time-lookup-failure branch 2 times, most recently from 6c81d8b to c73e338 Compare November 4, 2021 15:46
@mziolekda mziolekda changed the title Restart the submission interpretation in case of a race Restart the submission interpretation in case of a race [DPP-737] Nov 4, 2021
Command interpretation happens in two stages:
1) Daml interpretation
2) Determine a suitable ledger effective time

The race can happen in the following situation:
In 1) a contract key K is resolved to contract C1.
Between 1) and 2), a transaction is stored on the participant that
archives C1, but creates C2 with the same contract key K.
In 2) the ledger api server tries to lookup the ledger effective time
for all input contracts to the transaction. If it doesn't find one of
the input contracts at all, it can conclude that the transaction
wouldn't be accepted by the ledger anyway.

The behavior before this patch was to simply abort command
interpretation and return a rather cryptic error to the user.
With this patch, the ledger api server restarts the command
interpretation.
If the "missing contract" was an explicit input to the
command (e.g. as the contract for the exercise or as an argument to an
exercise), then this command will be rejected because the contract is
now archived.
If the contract ID was determined via a contract key lookup, then
restarting the interpretation will result in either a negative lookup or
a different contract ID for the new contract under this contract key.

CHANGELOG_BEGIN
[Ledger API] Retry the interpretation of a command in case of a race
with other transactions. This fix drastically reduces the likelihood of the error
"Could not find a suitable ledger time after 0 retries".
CHANGELOG_END
I thought we had types that the compiler checks at compile time :(

CHANGELOG_BEGIN
CHANGELOG_END
@gerolf-da gerolf-da force-pushed the gerolf_retry-on-ledger-time-lookup-failure branch from 7d140ad to a9c351b Compare November 8, 2021 09:46
@gerolf-da gerolf-da marked this pull request as ready for review November 8, 2021 10:09
@gerolf-da gerolf-da requested review from meiersi-da and a team as code owners November 8, 2021 10:09
for {
let <- letE
acc <- acc
} yield acc.map(acc => if (let.isAfter(acc)) let else acc)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aren't we supposed to collect all the missing ids into the accumulator's left?

)
}

"retry if the contract's LET is in the future and then retry if the contract is missing" in {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, this is precisely what I was looking for

Future.fromTry(Try(this.synchronized {
contractIds
.foldLeft[Either[Set[ContractId], Option[Instant]]](Right(Some(Instant.MIN)))((acc, id) => {
val letE = acs.activeContracts.get(id).map(c => Right(c.let)).getOrElse(Left(Set(id)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If contractIds is empty, this function returns Right(Some(Instant.MIN)) - shouldn't it be Right(None)?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this is how it was before as well. Fine not to change then.

@gerolf-da
Copy link
Contributor Author

Setting it back to draft. We'd rather go with the exception-based implementation in #11579.

@gerolf-da gerolf-da closed this Nov 9, 2021
@gerolf-da gerolf-da deleted the gerolf_retry-on-ledger-time-lookup-failure branch November 9, 2021 09:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants