Partner Re-Ingestion issue #4319

jcx120 · 2022-04-04T21:53:00Z

A partner is experiencing issues with both re-ingestion and ingestion for their Horizon instance (on an Azure PostgreSQL instance).

Re-ingestion:

Instance "03" is unable to re-ingest the full history without failing. The command used:

/usr/bin/stellar-horizon db reingest range 2 40325759 --parallel-workers 10 --retries 10

Specs:

We were using a Standard_D48s_v3 (48 vCPUs, 192GB RAM) with 42 workers and 10 retries when we encountered this issue.

We consistently get this error:

time="2022-04-04T21:43:00.487Z" level=error msg="error in reingest worker" error="error when processing [11496322, 11596289] range: error preparing range: Error fast-forwarding to 11496322: error reading frame length: unmarshalling XDR frame header: xdr:DecodeUint: EOF while decoding 4 bytes - read: '[]'" pid=1 service=ingest

job failed, recommended restart range: [10696578, 40325759]: error when processing [10696578, 10796545] range: error preparing range: Error fast-forwarding to 10696578: error reading frame length: unmarshalling XDR frame header: xdr:DecodeUint: EOF while decoding 4 bytes - read: '[]'
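The failure message above includes a recommended restart range. As a rough sketch only (not part of Horizon), the range could be pulled out of that line and turned back into the same `db reingest range` command. The helper name `restart_command` and the regular expression are assumptions made for illustration; the binary path and flags are the ones from the command above.

```python
import re

# Hypothetical helper: matches the "recommended restart range: [a, b]" text
# as it appears in the failure message quoted in this issue.
RESTART_RE = re.compile(r"recommended restart range: \[(\d+), (\d+)\]")


def restart_command(log_line, workers=10, retries=10):
    """Build a reingest command for the restart range in log_line, or return None."""
    match = RESTART_RE.search(log_line)
    if match is None:
        return None
    start, end = match.groups()
    return (
        f"/usr/bin/stellar-horizon db reingest range {start} {end} "
        f"--parallel-workers {workers} --retries {retries}"
    )


line = (
    "job failed, recommended restart range: [10696578, 40325759]: "
    "error when processing [10696578, 10796545] range"
)
print(restart_command(line))
```

Restarting from the recommended range only avoids redoing finished work; it does not address whatever caused the worker to fail.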

Ingestion (sync) on instances "01" and "02" is always behind and unable to catch up, consistently lagging by more than 25 ledgers.

Specs:

The stellar-ingest and stellar-reingest containers use 256GB Premium SSDs with up to 1100 IOPS and 128 MBps throughput. The disks are mounted as volumes inside the containers and used as CAPTIVE_CORE_STORAGE_PATH.
The postgres container uses a 16TB Premium SSD with up to 18000 IOPS and 750 MBps throughput. The disk is mounted as a volume inside the container and used by the PostgreSQL instance to store Horizon’s database.

2opremio · 2022-04-06T14:38:39Z

Related: #4255

The parsing error is probably a red herring, likely caused by Stellar Core crashing (my guess is that it is being OOM-killed).

2opremio · 2022-04-06T16:38:02Z

Note that there is an issue in Core version > 18.2.0 and < 18.5.0 which breaks reingestion.

See stellar/stellar-core#3360

Please upgrade to Core 18.5.0 or downgrade to Core 18.2.0
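For anyone checking a deployment against that range, here is a minimal sketch of the comparison. It assumes the Core version is available as a plain `major.minor.patch` string (real version output may carry a prefix or build suffix that would need stripping first); the function names are made up for illustration.

```python
def parse_version(text):
    """Turn a 'major.minor.patch' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in text.split("."))


def breaks_reingestion(core_version):
    """True if core_version is strictly between 18.2.0 and 18.5.0 (the affected range)."""
    return (18, 2, 0) < parse_version(core_version) < (18, 5, 0)


for version in ("18.2.0", "18.4.0", "18.5.0"):
    print(version, breaks_reingestion(version))
```

Both bounds are exclusive, which matches the advice above: 18.2.0 and 18.5.0 themselves are fine.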

AlexeyShchukinSecurrency · 2022-04-14T17:57:01Z

After upgrading to Core 18.5.0 and Horizon 2.15.1, we're not experiencing this issue anymore. Thank you!

jcx120 added the support label Apr 4, 2022

jcx120 mentioned this issue Apr 5, 2022

Parallel Ingestion broken (Horizon vers 2.15.1) #4321

Closed

jcx120 changed the title ~~Partner Ingestion issue~~ Partner Re-Ingestion issue Apr 5, 2022

2opremio self-assigned this Apr 6, 2022

jcx120 closed this as completed Apr 14, 2022
