Captive Core's temporary db directory disappears breaking execution #3145

2opremio · 2020-10-21T17:35:22Z

While working on #3144 I encountered the following error in captive core:

time="2020-10-21T17:22:50.833Z" level=error msg="Error in ingestion state machine" current_state="resume(latestSuccessfullyProcessedLedger=1)" error="error preparing range: opening subprocess: error running stellar-core: error waiting for `stellar-core new-db` subprocess: could not start `stellar-core [new-db]` cmd: chdir /tmp/captive-stellar-core-9f8cf1da5cf913b7: no such file or directory" next_state=start pid=195 service=ingest

The error repeats in a loop.

2opremio · 2020-10-21T17:37:13Z

Could the system be removing the directory because it lives in /tmp?

2opremio · 2020-10-22T14:30:29Z

The full context is:

time="2020-10-22T14:17:34.762Z" level=error msg="Error in ingestion state machine" current_state="resume(latestSuccessfullyProcessedLedger=1)" error="Error running processors on ledger: Protocol version not supported: Error getting ledger: unexpected ledger (expected=2 actual=3)" next_state="resume(latestSuccessfullyProcessedLedger=1)" pid=199 service=ingest
time="2020-10-22T14:17:35.762Z" level=info msg="Ingestion system state machine transition" current_state="resume(latestSuccessfullyProcessedLedger=1)" next_state="resume(latestSuccessfullyProcessedLedger=1)" pid=199 service=ingest
time="2020-10-22T14:17:35.769Z" level=info msg="Released ingestion lock to prepare range" pid=199 service=ingest
time="2020-10-22T14:17:35.770Z" level=info msg="Preparing range" ledger=2 pid=199 service=ingest
time="2020-10-22T14:17:35.773Z" level=error msg="Error in ingestion state machine" current_state="resume(latestSuccessfullyProcessedLedger=1)" error="error preparing range: opening subprocess: error running stellar-core: error waiting for `stellar-core new-db` subprocess: could not start `stellar-core [new-db]` cmd: chdir /tmp/captive-stellar-core-c40ecca427b35c92: no such file or directory" next_state=start pid=199 service=ingest

2opremio · 2020-10-22T16:59:39Z

What is happening is that the captive core backend doesn't properly reset its tmp directory on error. Fix coming up.

2opremio added fast-txmeta bug labels Oct 21, 2020

bartekn modified the milestone: Horizon 1.10.0 Oct 21, 2020

2opremio mentioned this issue Oct 21, 2020

Fix captive core integration tests #3144

Closed

2opremio self-assigned this Oct 22, 2020

This was referenced Oct 22, 2020

services/horizon: Fix captive-core's reset of its tmp dir #3154

Closed

Crash in captive-core's backend #3159

Closed

services/horizon: Make captive core runner more robust #3162

Merged

2opremio closed this as completed in #3162 Oct 27, 2020
