update verify-range for captive core use-disk #4444

sreuland · 2022-06-30T19:55:55Z

Running verify-range against current pubnet for a minimal range of ledgers, 20,100, caused the verify-range image to fail in AWS Batch jobs:

error reading frame length: unmarshalling XDR frame header: xdr:DecodeUint: EOF while decoding 4 bytes

The same error also shows up in the CI 'verify-range' step on most current PRs for any branch: the step fails with the frame header error when it tries to run horizon verify-range.

It can be reproduced locally from master or any branch of the go repo, from the top-level folder:

$ docker build -f services/horizon/docker/verify-range/Dockerfile -t stellar/horizon-verify-range services/horizon/docker/verify-range/
$ docker run -e FROM=10000063 -e TO=10000127 stellar/horizon-verify-range

Per prior issues (see #4255), this message means captive core has crashed.

The root cause is cc running out of memory while storing the current ledger state, so this PR converts verify-range to launch ingest with cc keeping current state on disk instead, via --captive-core-use-db=true.
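As a rough illustration of how a job wrapper might recognise this failure mode, the sketch below greps a log for the frame header error quoted above. The function name `core_crashed` and the log file name are assumptions for illustration only; they are not part of this PR.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): returns 0 when the given log
# contains the frame header error that, per the description above,
# indicates captive core has crashed.
core_crashed() {
    log_file="$1"
    grep -q "unmarshalling XDR frame header" "$log_file"
}

# Usage sketch; verify-range.log is an assumed file name.
# if core_crashed verify-range.log; then
#     echo "captive core likely crashed; try --captive-core-use-db=true" >&2
# fi
```

Matching on the "unmarshalling XDR frame header" substring rather than the full line keeps the check tolerant of different trailing details in the message.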

During testing of this PR on AWS Batch, the jobs were also observed to run out of disk space. verify-range hosts horizon's postgres locally on the VM's root volume, and cc stores its archives and its on-disk state on that same volume, while docker allocates only about 30GB of ephemeral space on '/'. These other filesystem paths had to be broken out onto separate volumes with external volume mounts, with corresponding changes made to the new AWS Job Def verify-range-c5-9xlarge-job:9.
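A pre-flight check along the following lines could surface the disk space problem before a long run starts. This is only a sketch: the helper name `has_free_gib` is hypothetical, and the 30 GiB threshold in the usage comment simply mirrors the approximate limit mentioned above.

```shell
#!/bin/sh
# Hypothetical helper (not part of this PR): succeeds only if the
# filesystem holding path "$1" has at least "$2" GiB available.
has_free_gib() {
    path="$1"
    min_gib="$2"
    # df -Pk prints POSIX-format output in 1024-byte blocks; the 4th
    # column of the second line is the available space in KiB.
    avail_kib=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kib" -ge $((min_gib * 1024 * 1024)) ]
}

# Usage sketch: warn when the root volume has less than 30 GiB free.
# has_free_gib / 30 || echo "less than 30 GiB free on /" >&2
```

The same check can be pointed at each externally mounted volume once postgres and cc storage have been moved off '/'.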

Shaptic · 2022-06-30T20:07:05Z

services/horizon/docker/verify-range/Dockerfile

@@ -12,6 +12,8 @@ RUN ["chmod", "+x", "dependencies"]
 RUN /dependencies

 ADD stellar-core.cfg /


can drop this now, right?

sreuland · Jun 30, 2022

At the moment, just testing out changes to use 'on disk' cc mode on CI to see if verify-range gets resolved, but it doesn't seem to be affecting it; it's still wedged on these frame header errors.


Shaptic reviewed Jun 30, 2022


sreuland changed the title ~~Trying cc use disk db on verify-range~~ captive core crashing on CI verify range step Jun 30, 2022

sreuland added 8 commits July 2, 2022 14:47

trying to see if cc with use disk will work on verify range

b592dd7

get cc use-db config in place

b1cf59f

try setting horizon environment specific for p19 and core version prior to verify range

14db29f

remove unused env params in verifyrange start, added core version env parameter for docker build

5e06ba0

allowed STORAGE_PATH env var override

5e25def

keep stellar.db on main disk /cc path

8122cb8

init pg in runtie rather than build time to use volume mounts

7e7ef31

fixed current directory to be git checkout directory as needed

9deb6e6

sreuland force-pushed the trying-ccdisk-verify-range branch from e6c39a6 to 9deb6e6 Compare July 2, 2022 21:48

sreuland changed the base branch from horizon-release-2.18.1 to master July 2, 2022 21:49

fixed path reference in old/new compare routine

7e5301e

sreuland changed the title ~~captive core crashing on CI verify range step~~ update verify-range for captive core use-disk Jul 5, 2022

sreuland requested a review from a team July 5, 2022 16:28

This was referenced Jul 5, 2022

Horizon: Dockerize ledgerexport to run in AWS Batch #4443

Merged

Verify-range job improvements #4452

Open

2opremio approved these changes Jul 13, 2022


Merge remote-tracking branch 'upstream/master' into trying-ccdisk-verify-range

a9065f2

sreuland merged commit 118efe4 into stellar:master Jul 13, 2022

sreuland added a commit to sreuland/go that referenced this pull request Aug 7, 2022

update verify-range for captive core use-disk (stellar#4444)

ca6470b
