Horizon doesn't enforce retention policy (HISTORY_RETENTION_COUNT environment variable) #3711

Closed
crypto-billy opened this issue Jun 22, 2021 · 4 comments · Fixed by #3777

@crypto-billy

What version are you using?

Horizon: 2.3.0-5029e28d1ec6272a44f5c03ad732059b2fead31d
Core: stellar-core 17.1.0 (fbc0325759ff75dd250cb5e175978669cdb4e90a)
go: go1.16.3

What did you do?

Horizon was started with the following environment variables, including HISTORY_RETENTION_COUNT=200000:

APPLY_MIGRATIONS=true
CAPTIVE_CORE_CONFIG_APPEND_PATH=/opt/stellar/stellar-captive-core-stub.toml
CAPTIVE_CORE_STORAGE_PATH=/var/stellar/core-data
DATABASE_URL=postgres://postgres:{{ postgres_password }}@horizon-postgres:5432/horizon?sslmode=disable
ENABLE_CAPTIVE_CORE_INGESTION=true
HISTORY_ARCHIVE_URLS=https://history.stellar.org/prd/core-live/core_live_001
HISTORY_RETENTION_COUNT=200000
INGEST=true
NETWORK_PASSPHRASE=Public Global Stellar Network ; September 2015
PARALLEL_JOB_SIZE=100000
PER_HOUR_RATE_LIMIT=0
RETRIES=10
RETRY_BACKOFF_SECONDS=20
STELLAR_CORE_BINARY_PATH=/usr/bin/stellar-core
STELLAR_CORE_URL=http://localhost:11626

What did you expect to see?

Only the latest 200,000 blocks are available on the node, and disk consumption stops growing.

What did you see instead?

Disk usage on the node has grown to over 500GB. Checking via the API showed that the eldest_block was around 600,000 blocks behind the block tip, so it seems DB reaping did not occur at all.
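
The API check described above can be scripted. The following is a minimal sketch, assuming the Horizon root response exposes the latest and oldest ingested ledgers as `history_latest_ledger` and `history_elder_ledger`; those field names, the helper `retainedSpan`, and the sample payload are assumptions for illustration and are not taken from this report.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rootInfo holds the two fields of interest from a Horizon root response.
// The JSON field names are an assumption; verify them against your Horizon version.
type rootInfo struct {
	LatestLedger int64 `json:"history_latest_ledger"`
	ElderLedger  int64 `json:"history_elder_ledger"`
}

// retainedSpan returns how many ledgers lie between the oldest and the
// latest ledger reported in the given root response body.
func retainedSpan(body []byte) (int64, error) {
	var r rootInfo
	if err := json.Unmarshal(body, &r); err != nil {
		return 0, fmt.Errorf("decoding root response: %w", err)
	}
	return r.LatestLedger - r.ElderLedger, nil
}

func main() {
	// Hypothetical payload with round numbers, not real network data.
	sample := []byte(`{"history_latest_ledger": 1000000, "history_elder_ledger": 400000}`)
	const retention = 200000 // HISTORY_RETENTION_COUNT from the report

	span, err := retainedSpan(sample)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("span=%d exceedsRetention=%t\n", span, span > retention)
}
```

A span well above the configured retention count, as in the hypothetical payload, would indicate that the reaper has not been trimming history.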

I ran horizon db reap manually and got the following output. The new_elder value was consistent with HISTORY_RETENTION_COUNT, i.e. 200,000 blocks behind the block tip:

INFO[2021-06-21T08:40:48.056Z] reaper: clearing                              new_elder=35800188 pid=315
INFO[2021-06-21T10:22:05.403Z] reaper succeeded                              new_elder=35800188 pid=315

I suppose this suggests that the retention configuration was picked up correctly, but for some reason the reaper did not act on it?

Current workaround

Manually run horizon db reap, then stop Horizon and Core and perform a Postgres vacuum to reclaim disk space.

@bartekn
Contributor

bartekn commented Jun 22, 2021

Reaping is performed once an hour. Have you waited an hour?

@crypto-billy
Author

@bartekn Yes, the node has been running for several weeks. It was started by manually ingesting the most recent (at the time) 200,000 blocks.

@bartekn
Contributor

bartekn commented Jun 23, 2021

Thanks! It's possible this was broken by #3567. We're going to check this. EDIT: after checking your logs I'm pretty sure it got broken by #3567. It took around 1.5h to clear ledgers. We probably need to move reap.Tick out of the main app Tick method.

@leevlad

leevlad commented Jun 30, 2021

related: #3728

I believe this was broken by the 10-second timeout on the shared context in the app ticker.
