
services/horizon: Parallelize db reingest range #2724

Merged: 23 commits merged into release-horizon-v1.6.0 on Jun 29, 2020

Conversation

@2opremio (Contributor) commented Jun 22, 2020

PR Checklist

PR Structure

  • This PR has reasonably narrow scope (if not, break it down into smaller PRs).
  • This PR avoids mixing refactoring changes with feature changes (split into two PRs
    otherwise).
  • This PR's title starts with name of package that is most changed in the PR, ex.
    services/friendbot, or all or doc if the changes are broad or impact many
    packages.

Thoroughness

  • This PR adds tests for the most critical parts of the new functionality or fixes.
  • I've updated any docs (developer docs, .md
    files, etc... affected by this change). Take a look in the docs folder for a given service,
    like this one.

Release planning

  • I've updated the relevant CHANGELOG (here for Horizon) if
    needed with deprecations, added features, breaking changes, and DB schema changes.
  • I've decided if this PR requires a new major/minor version according to
    semver, or if it's mainly a patch change. The PR is targeted at the next
    release branch if it's not a patch change.

What

This change breaks the ledger range to reingest down into subranges, which are submitted to a pre-defined number of workers that process the subranges in parallel (a minimal sketch of this fan-out follows the flag list below).

For now, the workers are simply goroutines using their own System (with their own DB connections, etc.).

In the future, workers could be fully fledged Horizon instances running on multiple machines (e.g. orchestrated through Kubernetes Jobs or AWS Batch Jobs).

New flags:
--parallel-workers: [optional] if this flag is set to > 1, Horizon will parallelize reingestion using the supplied number of workers.
--parallel-job-size: [optional] parallel workers will run jobs processing ledger batches of the supplied size.
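
(Illustrative only, not code from this PR: a minimal sketch of the fan-out described above, with hypothetical helper names. The requested range is split into batches of roughly --parallel-job-size ledgers and fed to --parallel-workers goroutines, each of which would own its own ingestion System.)

package main

import (
	"fmt"
	"sync"
)

type ledgerRange struct {
	from, to uint32
}

// splitRange breaks [from, to] (inclusive) into batches of at most batchSize ledgers.
func splitRange(from, to, batchSize uint32) []ledgerRange {
	var batches []ledgerRange
	for from <= to {
		batchTo := from + batchSize - 1
		if batchTo > to {
			batchTo = to
		}
		batches = append(batches, ledgerRange{from, batchTo})
		from = batchTo + 1
	}
	return batches
}

func main() {
	const workers = 4 // stands in for --parallel-workers
	jobs := make(chan ledgerRange)

	var wg sync.WaitGroup
	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// A real worker would own its ingestion System (and DB connection);
			// here we only print the batch it would reingest.
			for r := range jobs {
				fmt.Printf("worker %d reingesting ledgers %d..%d\n", id, r.from, r.to)
			}
		}(i)
	}

	for _, r := range splitRange(1, 1000, 256) { // 256 stands in for --parallel-job-size
		jobs <- r
	}
	close(jobs)
	wg.Wait()
}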

Why

We want reingestion to be faster. Addresses #2552

Known limitations

As I mentioned, this change only applies to a single machine. Running it on multiple machines will require further work.

This PR is targeting release-1.5.0 (since that's the base branch I used). We should retarget it to master (once release-1.5.0 is merged) before merging.

@2opremio 2opremio requested a review from a team June 22, 2020 16:16
@cla-bot cla-bot bot added the cla: yes label Jun 22, 2020
@2opremio 2opremio force-pushed the parallelize-ingestion branch 3 times, most recently from 867cc8a to a56a33d Compare June 23, 2020 14:08
services/horizon/CHANGELOG.md (outdated review thread, resolved)
case <-ps.shutdown:
return
case reingestRange := <-ps.reingestJobQueue:
err := s.ReingestRange(reingestRange.from, reingestRange.to, false)
Contributor

ReingestRange is blocking, so in case of a shutdown signal it will still need to finish its job. Is that by design?

Contributor Author

What's the alternative?

requestedRange ledgerRange
}

type ParallelSystems struct {
Contributor

Can we move the code related to ParallelSystems to a separate file?

Contributor Author

I did it, but it's odd that a separate file imports symbols from main.go

services/horizon/internal/expingest/main.go (4 outdated review threads, resolved)
@2opremio (Contributor Author)

@bartekn PTAL. I addressed your comments, reincorporated --parallel-job-size (which will be needed for performance reasons when reingesting the full history), and added an extra test (TestParallelReingestRangeError()).

@2opremio 2opremio force-pushed the parallelize-ingestion branch 2 times, most recently from 07200dc to 64f44a6 Compare June 24, 2020 13:46
@bartekn (Contributor) left a comment

Added a comment with a proposal to make it simpler.


const (
historyCheckpointLedgerInterval = 64
minBatchSize = historyCheckpointLedgerInterval
Contributor

This is only needed in the case of state ingestion (like in verify-range). It's totally fine to ingest ranges smaller than 64 ledgers. We should remove the len(range) >= 64 restriction.

}

return firstErr
}
@bartekn (Contributor) commented Jun 24, 2020

I think we can simplify it more:

  • A single shutdown signal should be enough to handle everything.
  • We don't really need reingestJobResult; this can be handled inside a worker function.
  • We don't need a global wait group. Shutdown should just send a shutdown signal and let the ReingestRange method wrap up.
  • We don't need to start workers when creating the object. Maybe we'll need workers in methods other than ReingestRange, but for now we don't.

After implementing the changes I have the following code:

func (ps *ParallelSystems) ReingestRange(fromLedger, toLedger uint32, batchSizeSuggestion uint32) error {
    batchSize := batchSizeSuggestion // assumption: use the suggested batch size as-is
    var reingestJobQueue = make(chan ledgerRange)
    var erroredMutex sync.Mutex
    var errored bool

    markError := func() {
        erroredMutex.Lock()
        errored = true
        erroredMutex.Unlock()
    }

    var wg sync.WaitGroup
    wg.Add(1)
    // can be moved to a method
    go func() {
        defer wg.Done()
        for subRangeFrom := fromLedger; subRangeFrom < toLedger; {
            // job queuing
            subRangeTo := subRangeFrom + (batchSize - 1) // we subtract one because both from and to are part of the batch
            if subRangeTo > toLedger {
                subRangeTo = toLedger
            }
            select {
            case <-ps.shutdown:
                return
            case reingestJobQueue <- ledgerRange{subRangeFrom, subRangeTo}:
            }
            subRangeFrom = subRangeTo + 1
        }
    }()

    for i := 0; i < workers; i++ {
        wg.Add(1)
        // can be moved to a method
        go func() {
            defer wg.Done()
            s, err := systemFactory(config)
            if err != nil {
                log.Error("...")
                s.Shutdown()
                markError()
                return
            }
            for {
                select {
                case <-ps.shutdown:
                    return
                case reingestRange := <-reingestJobQueue:
                    err := s.ReingestRange(reingestRange.from, reingestRange.to, false)
                    if err != nil {
                        log.Error("...")
                        s.Shutdown()
                        markError()
                        return
                    }
                }
            }
        }()
    }

    wg.Wait()
    close(reingestJobQueue)

    if errored {
        return errors.New("one or more jobs failed")
    }
    return nil
}

func (ps *ParallelSystems) shutdown() {
    close(ps.shutdown)
}

func (ps *ParallelSystems) Shutdown() error {
	// sync.Once
	ps.shutdownOnce.Do(msr.shutdown)
	return nil
}

@2opremio (Contributor Author) commented Jun 24, 2020

  • A single shutdown signal should be enough to handle everything.

Not if you have persistent workers (read below)

  • We don't really need reingestJobResult; this can be handled inside a worker function.

Your example has oversimplified error management. If you want to report up to which point things were ingested properly (which I will implement shortly after this PR is merged), you need to know which ranges were successful.

  • We don't need a global wait group. Shutdown should just send a shutdown signal and let the ReingestRange method wrap up.

Not if you have persistent workers (read below)

  • We don't need to start workers when creating the object. Maybe we'll need workers in methods other than ReingestRange, but for now we don't.

I did it this way so that we can parallelize stress testing and normal history reingestion later on.

After implementing the changes I have the following code

I am not sure how msr or shutdownOnce are handled. Also, error management is oversimplified. As I mentioned above, you do need to consume the results to know which ranges were processed.

Contributor Author

@bartekn and I agreed that he will commit his suggestion and we will take it from there.

@bartekn (Contributor) commented Jun 24, 2020

OK, I pushed my code as discussed with @fons. I haven't updated tests, will do it after a 👍 from you. I also added code that returns a suggested range for restarting a job in case of failure. There's a comment above lowestRangeErr explaining why (I think) it works without having to keep track of results from all the jobs.
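
(Illustrative only, a sketch of the bookkeeping described here with assumed names, reusing the ledgerRange type and sync import from the surrounding file; it is not the exact code from the PR.)

type rangeError struct {
	err         error
	ledgerRange ledgerRange
}

var (
	lowestRangeErrMutex sync.Mutex
	// lowestRangeErr is the failed range with the smallest starting ledger.
	// Because the batches are contiguous and every queued job finishes before
	// wg.Wait() returns, restarting reingestion from this range leaves no gaps.
	lowestRangeErr *rangeError
)

func recordError(r ledgerRange, err error) {
	lowestRangeErrMutex.Lock()
	defer lowestRangeErrMutex.Unlock()
	if lowestRangeErr == nil || r.from < lowestRangeErr.ledgerRange.from {
		lowestRangeErr = &rangeError{err: err, ledgerRange: r}
	}
}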


}

func (ps *ParallelSystems) Shutdown() error {
Contributor Author

I don't think it makes sense to have this anymore.

@2opremio (Contributor Author) commented Jun 24, 2020

In fact, having the ability to shut down an in-progress reingestion will break error reporting (there won't be any guarantee of the lowest failing ledger range being reported, since any remaining jobs can be aborted without reporting an error).

Contributor

Closing the shutdown channel won't abort running jobs. If there's a running ReingestRange method in any worker it will complete (succeed or fail, but it will complete). Also, because the queue channel is unbuffered, you won't have any remaining jobs sitting in a buffer (the goroutine adding new jobs will return due to <-ps.shutdown). And even if there were, all jobs would complete because there's a wait group that blocks until all goroutines return.

Can you elaborate?
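
(Illustrative only, a toy sketch of the point being made, with hypothetical names: closing a stop channel does not interrupt a job that is already running; the worker only observes it between jobs, and an unbuffered queue means nothing can be left waiting in a buffer.)

stop := make(chan struct{})
jobs := make(chan ledgerRange) // unbuffered: no job can sit queued in a buffer

go func() {
	for {
		select {
		case <-stop:
			return // only checked between jobs
		case r := <-jobs:
			reingest(r) // hypothetical; once started, it always runs to completion
		}
	}
}()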

@2opremio (Contributor Author) commented Jun 24, 2020

I think you are right, but I believe it's a bit brittle (prone to break if the code changes). It's probably a good idea to add a test to make it future-proof (a test in which there is an error and one of the pending jobs also errors). I am happy to add that.

Leaving that aside, I think we agree that Shutdown() is not needed anymore, since there is no need to cancel running operations or clean up anything.

}
lowestRangeErrMutex.Unlock()
}
ps.Shutdown()
@2opremio (Contributor Author) commented Jun 24, 2020

It's awkward that an operation shuts the whole ParallelSystems down. I would remove the global shutdown channel, create one here, and pass it to the workers.
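
(Illustrative only, a rough sketch of this suggestion with assumed names, reusing wg, reingestJobQueue, and the error-recording helper from the snippets above; it is not the PR code.)

// inside ReingestRange, per call:
stop := make(chan struct{})
var stopOnce sync.Once
requestStop := func() { stopOnce.Do(func() { close(stop) }) }

for i := 0; i < workers; i++ {
	wg.Add(1)
	go func() {
		defer wg.Done()
		for {
			select {
			case <-stop: // local to this ReingestRange call, not a ParallelSystems-wide shutdown
				return
			case r := <-reingestJobQueue:
				if err := doReingest(r); err != nil { // doReingest is hypothetical
					recordError(r, err)
					requestStop()
					return
				}
			}
		}
	}()
}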

}, nil
}

func (ps *ParallelSystems) reingestWorker(reingestJobQueue <-chan ledgerRange) error {
@2opremio (Contributor Author) commented Jun 24, 2020

I would use a verb in the name of reingestWorker (e.g. runReingestWorker(), doReingestWork()).

)

wg.Add(1)
go func() {
Contributor Author

I don't think there is a reason to do this in a separate goroutine. You could run the inner for loop after spawning the workers.

// Because of this when we reach `wg.Wait()` all jobs previously sent to a channel are processed (either success
// or failure). In case of a failure we save the range with the smallest sequence number because this is where
// the user needs to start again to prevent the gaps.
lowestRangeErr *rangeError
Contributor Author

This doesn't need to be a pointer (no strong opinions).

@bartekn (Contributor) commented Jun 24, 2020

If it weren't a pointer, we'd need another bool variable to determine whether there was an error or not.

Contributor Author

You can use the error inside the rangeError (if it's nil then no error happened). But, as I said, no strong opinions here.
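
(Illustrative only, a sketch of the value-based variant with assumed field names; not the PR code.)

var lowestRangeErr rangeError // value instead of *rangeError

// ...after wg.Wait():
if lowestRangeErr.err != nil { // a nil inner error means no job failed
	return fmt.Errorf("job failed, restart reingestion from ledger %d: %v",
		lowestRangeErr.ledgerRange.from, lowestRangeErr.err)
}
return nil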

@bartekn (Contributor) commented Jun 25, 2020

@fons thanks for the quick review! I added the requested changes and fixed the tests. It's great that we're on the same page when it comes to aborting jobs.

I removed Shutdown as requested: I agree it doesn't make sense to keep it now because we're not handling system shutdown signals at all in db reingest commands. However, I think it's worth doing this in the future so that users get a status update when they terminate the process manually.

I won't be available tomorrow morning, so please update the code if there's something blocking you from merging it.

@2opremio 2opremio force-pushed the parallelize-ingestion branch 2 times, most recently from 5d41abd to db51fba Compare June 25, 2020 09:50
@2opremio (Contributor Author) commented Jun 25, 2020

I think we are done. This PR targets the 1.5.0 branch, but my plan is to hold off from merging to avoid disrupting the 1.5.0 release, and just wait until 1.5.0 is merged into master.

But, if @ire-and-curses / @abuiles are OK with it, I will merge now.

@ire-and-curses (Member)

Let's target this for the 1.6 release.

Base automatically changed from release-horizon-v1.5.0 to master June 29, 2020 19:22
@2opremio 2opremio changed the base branch from master to release-horizon-v1.6.0 June 29, 2020 19:28
2opremio and others added 23 commits June 29, 2020 21:34
@2opremio 2opremio merged commit 4419539 into release-horizon-v1.6.0 Jun 29, 2020
@2opremio 2opremio deleted the parallelize-ingestion branch June 29, 2020 21:20