services/horizon: Add local last ingested ledger metric #2537

Merged
2opremio merged 3 commits into master from 2530-add-local-last-ingested-ledger
May 4, 2020

Conversation

2opremio
Contributor

@2opremio 2opremio commented May 1, 2020

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc.) affected by this change. Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

Add a Prometheus metric for the latest local (order book graph) ledger being ingested.

Why

Fixes #2530

Known limitations

N/A

@cla-bot cla-bot bot added the cla: yes label May 1, 2020
@tamirms
Contributor

tamirms commented May 4, 2020

There is a case in distributed ingestion where the database is up to date with the latest ledger but the in-memory order book graphs on some nodes are still behind. In this scenario, completeIngestion() is not called, because only the order books are updated:

if ingestLedger <= lastIngestedLedger {
	// rollback because we will not be updating the DB
	// so there is no need to hold on to the distributed lock
	// and thereby block the other nodes from ingesting
	if err = s.historyQ.Rollback(); err != nil {
		return retryResume(r), errors.Wrap(err, "Error rolling back transaction")
	}

	log.WithFields(logpkg.F{
		"sequence": ingestLedger,
		"state":    false,
		"ledger":   false,
		"graph":    true,
		"commit":   false,
	}).Info("Processing ledger")

	var stats io.StatsChangeProcessorResults
	stats, err = s.runner.RunOrderBookProcessorOnLedger(ingestLedger)
	if err != nil {
		return retryResume(r), errors.Wrap(err, "Error running change processor on ledger")
	}

	if err = s.graph.Apply(ingestLedger); err != nil {
		return retryResume(r), errors.Wrap(err, "Error applying graph changes from ledger")
	}

	duration := time.Since(startTime)
	s.Metrics.LedgerInMemoryIngestionTimer.Update(duration)
	log.
		WithFields(stats.Map()).
		WithFields(logpkg.F{
			"sequence": ingestLedger,
			"duration": duration.Seconds(),
			"state":    false,
			"ledger":   false,
			"graph":    true,
			"commit":   false,
		}).
		Info("Processed ledger")

	return resumeImmediately(ingestLedger), nil
}

So, if we update the metric only in completeIngestion(), we will not capture nodes whose order books are out of date. To fix this, we can either refactor the code so that completeIngestion() is called even when only the order books are updated, or update the Prometheus metric in two places: once in completeIngestion() and once in the special case of the resume state.
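The second option (updating the metric in two places) could look roughly like the sketch below. It is a simplified model, not Horizon's code: ledgerGauge, ingestSystem, applyGraphOnly and the LocalLatestLedger field are hypothetical names made up for illustration, and a plain atomic counter stands in for whatever metric type the PR actually registers. Only the ingestLedger <= lastIngestedLedger condition and the completeIngestion() name come from the discussion above.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// ledgerGauge is a hypothetical stand-in for a metrics gauge.
// It is NOT Horizon's real metric type.
type ledgerGauge struct{ v int64 }

// Update records the given ledger sequence as the gauge value.
func (g *ledgerGauge) Update(seq uint32) { atomic.StoreInt64(&g.v, int64(seq)) }

// Value returns the last recorded ledger sequence.
func (g *ledgerGauge) Value() int64 { return atomic.LoadInt64(&g.v) }

// ingestSystem is a simplified, hypothetical model of one ingesting node.
type ingestSystem struct {
	LocalLatestLedger ledgerGauge // assumed field name
}

// completeIngestion models the full path, where both the DB and the
// in-memory graph are updated for the ledger.
func (s *ingestSystem) completeIngestion(seq uint32) {
	s.LocalLatestLedger.Update(seq)
}

// applyGraphOnly models the special case in the resume state: the DB is
// already ahead, so only the in-memory order book graph is updated.
// The gauge is updated here too, so lagging nodes are still reported.
func (s *ingestSystem) applyGraphOnly(seq uint32) {
	s.LocalLatestLedger.Update(seq)
}

// resume chooses the path using the same condition as the quoted snippet.
func (s *ingestSystem) resume(ingestLedger, lastIngestedLedger uint32) {
	if ingestLedger <= lastIngestedLedger {
		s.applyGraphOnly(ingestLedger)
		return
	}
	s.completeIngestion(ingestLedger)
}

func main() {
	s := &ingestSystem{}

	// The DB is already at ledger 7; this node's graph is catching up at 5.
	s.resume(5, 7)
	fmt.Println(s.LocalLatestLedger.Value())

	// The node is ingesting a ledger that is new to the DB.
	s.resume(8, 7)
	fmt.Println(s.LocalLatestLedger.Value())
}
```

Because the value lives only in the node's memory, each instance reports its own progress, and the gauge on a lagging node stays below the sequence stored in the database until its graph catches up.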

Don't store it in the database since, as I understand it,
instances will fight each other.
@2opremio 2opremio force-pushed the 2530-add-local-last-ingested-ledger branch from b536053 to bbf5cf4 Compare May 4, 2020 12:15
@2opremio 2opremio merged commit 86685b5 into master May 4, 2020
@2opremio 2opremio deleted the 2530-add-local-last-ingested-ledger branch May 4, 2020 12:24
Successfully merging this pull request may close these issues.

Add local last ingested ledger sequence to /metrics
2 participants