Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge horizon v0.24.1 into master #2056

Merged
merged 7 commits into from
Dec 17, 2019
Merged

Merge horizon v0.24.1 into master #2056

merged 7 commits into from
Dec 17, 2019

Conversation

tamirms
Copy link
Contributor

@tamirms tamirms commented Dec 13, 2019

Update master with commits from v0.24.1

bartekn and others added 4 commits December 12, 2019 18:06
…x meta (#2004)

This commit adds `exp/ingest/io.LedgerEntryChangeCache` that squashes
all the ledger entry changes. This can be later used to decrease number
of DB queries when applying them. See #2003.

Some ledgers that add a lot of changes connected to a small set of
entries are causing a performance issues because every ledger entry
change is applied to a DB.  `LedgerEntryChangeCache` solves this problem
because it makes holds a final version of a ledger entry after all the
changes. 

Before this fix, extreme cases when two accounts send a payment between
each other 1000 times in a ledger required 3000 DB updates (2000 account
changes due to payment and 500 fee meta per account). After the fix, it
requires just 2 DB updates.

Algorithm used in `LedgerEntryChangeCache` is explained below:

1. If the change is CREATED it checks if any change connected to given entry
   is already in the cache. If not, it adds CREATED change. Otherwise, if
   existing change is:
   a. CREATED it returns error because we can't add an entry that already
      exists.
   b. UPDATED it returns error because we can't add an entry that already
      exists.
   c. REMOVED it means that due to previous transitions we want to remove
      this from a DB what means that it already exists in a DB so we need to
      update the type of change to UPDATED.
2. If the change is UPDATE it checks if any change connected to given entry
   is already in the cache. If not, it adds UPDATE change. Otherwise, if
   existing change is:
   a. CREATED it means that due to previous transitions we want to create
      this in a DB what means that it doesn't exist in a DB so we need to
      update the entry but stay with CREATED type.
   b. UPDATED we simply update it with the new value.
   c. REMOVED it means that at this point in the ledger the entry is removed
      so updating it returns an error.
3. If the change is REMOVE it checks if any change connected to given entry
   is already in the cache. If not, it adds REMOVE change. Otherwise, if
   existing change is:
   a. CREATED it means that due to previous transitions we want to create
      this in a DB what means that it doesn't exist in a DB. If it was
      created and removed in the same ledger it's a noop so we remove entry
      from the cache.
   b. UPDATED we simply update it to be a REMOVE change because the UPDATE
      change means the entry exists in a DB.
   c. REMOVED it returns error because we can't remove an entry that was
      already removed.
Because CircleCI does not build tags by default, any child jobs that rely on a job that only builds on tags will also require the tag filter. In this case, the hold and publish_latest_horizon_docker_image jobs will need the filter for /^horizon-v.*/ tags as well, otherwise, they will not be picked up for running.
This commit changes `io.LedgerTransaction` to return fee and tx meta
separately and updates Horizon processors to apply changes in correct
order. This issue was found by `StateVerifier`.

The order of applying meta changes in Horizon processors was incorrect.
Fee changes must be applied before everything else [1]. In other words
instead of processing meta like:

TX1_FEE_META, TX1_TX_META, TX2_FEE_META, TX2_TX_META, ...

we should do it like:

TX1_FEE_META, TX2_FEE_META, TX1_TX_META, TX2_TX_META, ...

### Known limitations

The current interface of pipeline processor doesn't make sense because
all transactions need to be read into memory first to apply fee changes.
We either need to refactor the processors or use @tamirms design where
pipeline is removed.

[1] https://github.com/stellar/stellar-core/blob/master/docs/integration.md
…orical data when starting ingestion (#2055)

Horizon always reingests state using the latest checkpoint ledger and in most cases the sequence of this ledger will be smaller than the latest ingested ledger. Because of this Horizon will attempt to insert old ledgers into exp_history_ledger again. Inserting old ledgers into exp_history_ledger triggers unique constraint errors.

To fix this issue we remove rows from historical ingestion tables that ingestion will insert during the ledger pipeline phase.
@cla-bot cla-bot bot added the cla: yes label Dec 13, 2019
@tamirms tamirms requested a review from bartekn December 13, 2019 14:35
@bartekn
Copy link
Contributor

bartekn commented Dec 13, 2019

LGTM, but let's merge it when Horizon 0.24.1 is released. It's possible that we'll have more commits (ex. I'm planning to fix #2046).

bartekn and others added 2 commits December 16, 2019 13:07
This commit fixes data race in `exp/ingest/pipeline` that can occur when
`LiveSession` (and Horizon) is shut down.

It also removes `updateStats` method that was known to have a data race
(see comment in that method). It is not actively used right now but was
being reported by race detector.

Previous code handling shutdown signal in `LiveSession` can be found below:
```
 errChan := s.LedgerPipeline.Process(ledgerReader) 
 select { 
 case err2 := <-errChan: 
 	if err2 != nil { 
 		// Return with no errors if pipeline shutdown 
 		if err2 == pipeline.ErrShutdown { 
 			s.LedgerReporter.OnEndLedger(nil, true) 
 			return nil 
 		} 
  
 		if s.LedgerReporter != nil { 
 			s.LedgerReporter.OnEndLedger(err2, false) 
 		} 
 		return errors.Wrap(err2, "Ledger pipeline errored") 
 	} 
 case <-s.standardSession.shutdown: 
 	if s.LedgerReporter != nil { 
 		s.LedgerReporter.OnEndLedger(nil, true) 
 	} 
 	s.LedgerPipeline.Shutdown() 
 	return nil 
 } 
```

The problem is when shutdown signal is received, `Resume` returns `nil`
so Horizon starts it's shutdown code which calls `Rollback()` (using
internal `tx` object) but at the same time pipeline is still running
until the code receiving from `ctx.Done` channel is executed. It means
that pipeline processors can execute transactions using `tx` transaction
object in DB session.

To fix this:
1. We don't `select` ingest session shutdown signal when waiting for
   pipeline to finish processing.
2. Instead we call `Shutdown` on pipelines inside `LiveSession.Shutdown`.
3. Then we wait/block until pipelines gracefully shutdown by calling
   `Pipeline.IsRunning` method.
4. Finally we `close(s.shutdown)` inside `expingest/System.Shutdown()`.

So the components now shut down exactly in the following order:
1. Pipelines.
2. Session.
3. Horizon Expingest System.

One comment on `-1` change in tests. When `ingestSession.Run()` returns
`nil` we shouldn't continue to `ingestSession.Resume()` because `nil`
value means that session ended. I updated the comment in `LiveSession`
and also fixed Horizon code.
@tamirms tamirms changed the base branch from release-horizon-v0.25.0 to master December 16, 2019 20:10
@tamirms tamirms changed the title Merge horizon v0.24.1 into v0.25.0 Merge horizon v0.24.1 into master Dec 16, 2019
@bartekn bartekn force-pushed the release-horizon-v0.24.1 branch from 2f0950e to 2834fa5 Compare December 17, 2019 12:19
@bartekn bartekn merged commit f1d2f89 into master Dec 17, 2019
@bartekn bartekn deleted the release-horizon-v0.24.1 branch December 17, 2019 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants