ingest/ledgerbackend: Remove returning error on Stellar-Core process exit during catchup #3260

Merged: bartekn merged 2 commits into stellar:master from fix-captive-catchup-shutdown on Dec 2, 2020

@bartekn bartekn commented Dec 1, 2020

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md files, etc.) affected by this
    change. Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

Fixes a bug introduced in a10c000 that could make the PrepareRange and GetLedger methods return an error when the Stellar-Core process had already exited but not all ledgers had been read from the buffer yet. To fix it, process exit is now handled only in bufferedLedgerMetaReader, and only when an error occurred. PrepareRange returns an error only when Stellar-Core exits with an error. This won't work with a user-initiated shutdown; that will be fixed in #3258.
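
To illustrate the intended behaviour, here is a minimal, self-contained sketch. It is not the code from this PR: ledgerReader, poll, next and errNoMoreLedgers are hypothetical names, and the sketch assumes the producer stops writing to the buffer before the exit channel is closed. It only shows the idea that a clean process exit is not reported while buffered ledgers remain, and that an exit with an error is surfaced as an error.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoMoreLedgers is a hypothetical sentinel: the process exited cleanly
// and the buffer has been fully drained.
var errNoMoreLedgers = errors.New("no more ledgers")

// ledgerReader is a hypothetical stand-in for a buffered ledger reader.
type ledgerReader struct {
	buffer   chan uint32   // ledger sequences already read from the process
	exitChan chan struct{} // closed once the process has exited
	exitErr  error         // set before exitChan is closed; nil on a clean exit
}

// poll returns a buffered ledger without blocking, if one is available.
func (r *ledgerReader) poll() (uint32, bool) {
	select {
	case seq := <-r.buffer:
		return seq, true
	default:
		return 0, false
	}
}

// next returns the next ledger. Process exit is only considered once the
// buffer is empty, so an exit never hides ledgers that were already buffered.
func (r *ledgerReader) next() (uint32, error) {
	if seq, ok := r.poll(); ok {
		return seq, nil
	}
	select {
	case seq := <-r.buffer:
		return seq, nil
	case <-r.exitChan:
		// Assumption: the producer stopped writing before exiting, so one
		// more non-blocking check is enough to catch a late ledger.
		if seq, ok := r.poll(); ok {
			return seq, nil
		}
		if r.exitErr != nil {
			return 0, fmt.Errorf("process exited with an error: %w", r.exitErr)
		}
		return 0, errNoMoreLedgers
	}
}

func main() {
	r := &ledgerReader{buffer: make(chan uint32, 2), exitChan: make(chan struct{})}
	r.buffer <- 100
	r.buffer <- 101
	close(r.exitChan) // clean exit while two ledgers are still buffered

	for {
		seq, err := r.next()
		if err != nil {
			fmt.Println("stopped:", err)
			return
		}
		fmt.Println("ledger", seq)
	}
}
```

In this sketch both buffered ledgers are returned before the reader reports that the process is gone.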

Thanks for the report, @Isaiah-Turner!

@cla-bot cla-bot bot added the cla: yes label Dec 1, 2020
@bartekn bartekn requested a review from a team December 1, 2020 16:20
@bartekn bartekn marked this pull request as ready for review December 1, 2020 16:20
@@ -99,6 +99,10 @@ func (b *bufferedLedgerMetaReader) readLedgerMetaFromPipe() (*xdr.LedgerCloseMet
// Wait for LedgerCloseMeta buffer to be cleared to minimize memory usage.
select {
case <-b.runner.getProcessExitChan():
if untilSequence != 0 {
Contributor

We also handle <-b.runner.getProcessExitChan() a few lines below. Should that block also have an untilSequence != 0 check?

Contributor

Looks like getProcessExitChan() is handled a few lines above as well.

Contributor Author

We only handle the other instances of getProcessExitChan if an error occurs, so they ignore the case where Core is done but there are still ledgers in the buffer. However, I think there's a bug: in the for frameLength > metaPipeBufferSize && len(b.c) > 0 loop we should return an error if Core was closed with an error. Going to add it in a new commit.
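
A rough sketch of the idea, not the actual change: while a large frame is pending and the buffer is non-empty, wait for the buffer to clear, but stop waiting when the process exits and report its error, if any. The function waitForBufferToClear, its parameters and errCoreCrashed are hypothetical names chosen for illustration. Returning nil on a clean exit is an assumption of this sketch.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errCoreCrashed is a hypothetical error standing in for a failed process exit.
var errCoreCrashed = errors.New("core exited with an error")

// waitForBufferToClear blocks while a frame larger than maxFrameLength is
// pending and the buffer still holds items. If the process exits in the
// meantime, it returns whatever exitErr reports: nil for a clean exit, or
// the error the process exited with.
func waitForBufferToClear(
	frameLength, maxFrameLength int,
	buffer chan []byte,
	exit <-chan struct{},
	exitErr func() error,
	poll time.Duration,
) error {
	for frameLength > maxFrameLength && len(buffer) > 0 {
		select {
		case <-exit:
			return exitErr()
		case <-time.After(poll):
			// Re-check the loop condition after a short pause.
		}
	}
	return nil
}

func main() {
	buffer := make(chan []byte, 1)
	buffer <- []byte{1} // one item is still waiting to be consumed

	exit := make(chan struct{})
	close(exit) // the process has already exited

	err := waitForBufferToClear(10, 5, buffer, exit,
		func() error { return errCoreCrashed }, 10*time.Millisecond)
	fmt.Println("wait result:", err)
}
```

Note that when the frame is small the loop is skipped entirely, so an exit has to be detected elsewhere in that case.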

Contributor Author

OK fixed in aecfd75.

Contributor

Do we still need to worry about the case where len(b.c) > 0 but frameLength <= metaPipeBufferSize?

Contributor Author

In that case (and on Core exit) we won't enter the for loop, but the xdr.Unmarshal call below will return an error, and we handle the error there.
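
The mechanism can be sketched with the standard library only. This is an illustration under assumptions, not the reader's real code: binary.Read stands in for xdr.Unmarshal, io.Pipe stands in for the meta pipe, and readFrame and exitedProcessPipe are hypothetical helpers. Once the writing side is closed, decoding fails and the failure is handled at the decode call.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"io"
)

// exitedProcessPipe simulates a process that wrote data to its pipe and
// then exited, which closes the write end.
func exitedProcessPipe(data []byte) io.Reader {
	pr, pw := io.Pipe()
	go func() {
		pw.Write(data) // blocks until the reader has consumed the data
		pw.Close()     // subsequent reads return io.EOF
	}()
	return pr
}

// readFrame decodes one big-endian uint32 from r. If decoding fails and the
// process is known to have exited with an error, that error is reported;
// otherwise the decode error itself is returned.
func readFrame(r io.Reader, exitErr error) (uint32, error) {
	var value uint32
	if err := binary.Read(r, binary.BigEndian, &value); err != nil {
		if exitErr != nil {
			return 0, fmt.Errorf("process exited with an error: %w", exitErr)
		}
		return 0, fmt.Errorf("reading frame: %w", err)
	}
	return value, nil
}

func main() {
	pipe := exitedProcessPipe([]byte{0, 0, 0, 7})

	value, err := readFrame(pipe, nil)
	fmt.Println("first read:", value, err)

	// The writer is gone, so the next decode fails and is handled here.
	value, err = readFrame(pipe, nil)
	fmt.Println("second read:", value, err)
}
```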

@bartekn bartekn merged commit 57178a0 into stellar:master Dec 2, 2020
@bartekn bartekn deleted the fix-captive-catchup-shutdown branch December 2, 2020 14:12