[Node Operator Question] Node unable to sync after restart #596
-
Are you running the most up to date node software?
Did you check the documentation?
Did you check for duplicate questions?
Issue Description

I followed this guide and was able to sync an Optimism mainnet node using op-node / op-geth. From there, I wanted to modify the L1 RPC provider in .env. After editing .env, I restarted the node, but it is now unable to sync. Am I restarting the node improperly, do I need to do a full resync, or is there another issue that I am missing?

Protocol Description

op-geth: v1.101408.0

Node Logs

op-geth logs:

op-geth-1 | INFO [09-20|15:15:46.003] Looking for peers peercount=2 tried=39 static=0
op-geth-1 | INFO [09-20|15:15:56.138] Looking for peers peercount=2 tried=131 static=0
op-geth-1 | INFO [09-20|15:16:06.205] Looking for peers peercount=2 tried=115 static=0
op-geth-1 | INFO [09-20|15:16:16.256] Looking for peers peercount=2 tried=99 static=0
op-geth-1 | INFO [09-20|15:16:26.270] Looking for peers peercount=2 tried=104 static=0
op-geth-1 | INFO [09-20|15:16:36.276] Looking for peers peercount=2 tried=42 static=0
op-geth-1 | INFO [09-20|16:09:45.530] New local node record seq=1,726,012,889,530 id=2ea3717e99a50fe7 ip=18.218.223.21 udp=39393 tcp=39393
op-geth-1 | INFO [09-21|05:44:18.506] Starting work on payload id=0x037488d27ac1aa79
op-geth-1 | WARN [09-21|05:44:18.510] Ignoring already known beacon payload number=125,578,968 hash=8c5220..b3ba27 age=1d15h5m
op-geth-1 | ERROR[09-21|05:44:18.529] Failed to create sealing context err="missing trie node 02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 (path ) state 0x02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 is not available, not found"
op-geth-1 | ERROR[09-21|05:44:18.529] Failed to build payload err="missing trie node 02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 (path ) state 0x02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 is not available, not found"
op-geth-1 | WARN [09-21|05:44:18.529] Served engine_forkchoiceUpdatedV3 conn=172.18.0.9:57376 reqid=723628 duration=5.844707ms err="Invalid payload attributes" errdata="{\"err\":\"missing trie node 02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 (path ) state 0x02eb5ace2690ad37780c49966d94d294ca51d1c0bc5af9080efeba3d150e1d38 is not available, not found\"}"
op-geth-1 | ERROR[09-21|06:31:23.638] Failed to create sealing context err="missing trie node 8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd (path ) state 0x8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd is not available, not found"
op-geth-1 | ERROR[09-21|06:31:23.638] Failed to build payload err="missing trie node 8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd (path ) state 0x8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd is not available, not found"
op-geth-1 | WARN [09-21|06:31:23.638] Served engine_forkchoiceUpdatedV3 conn=172.18.0.9:57792 reqid=723636 duration=4.038112ms err="Invalid payload attributes" errdata="{\"err\":\"missing trie node 8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd (path ) state 0x8d0f0de87757725641449631498228ea32e5623bf942624559b7feff98da14dd is not available, not found\"}"

op-node logs:

ubuntu@ip-10-0-218-91:~/simple-optimism-node$ docker compose logs -f --tail=0 | grep op-node
op-node-1 | t=2024-09-22T16:35:43+0000 lvl=info msg="Received signed execution payload from p2p" id=0xf76599fbe56ab8b9704e6beaca6f08d79fff91b73d0dfb819cbdd413233dce05:125712083 peer=16Uiu2HAmTg5gF4QWPdWCKmj4evc7W61Sks931usn3CyEaMqZS62P txs=6
op-node-1 | t=2024-09-22T16:35:43+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xf76599fbe56ab8b9704e6beaca6f08d79fff91b73d0dfb819cbdd413233dce05:125712083
op-node-1 | t=2024-09-22T16:35:45+0000 lvl=info msg="Received signed execution payload from p2p" id=0xea1bb71418ceca120089977fcdc01b8c103ad184e8230dfea381077d47893ff2:125712084 peer=16Uiu2HAmTg5gF4QWPdWCKmj4evc7W61Sks931usn3CyEaMqZS62P txs=16
op-node-1 | t=2024-09-22T16:35:45+0000 lvl=info msg="Optimistically queueing unsafe L2 execution payload" id=0xea1bb71418ceca120089977fcdc01b8c103ad184e8230dfea381077d47893ff2:125712084
op-node-1 | t=2024-09-22T16:35:45+0000 lvl=info msg="Dropping payload from payload queue because the payload queue is too large" id=0xa715762e12d8ad8e0fab9b51b8cf21ca0fdee3d29e7fbc1545157ca98816845a:125679786
op-node-1 | t=2024-09-22T16:35:45+0000 lvl=info msg="Dropping payload from payload queue because the payload queue is too large" id=0x553355ae1eeb90ca6e91c72acc99d600aaee55fbd499eaf747a63187df7d4dc3:125679787
op-node-1 | t=2024-09-22T16:35:45+0000 lvl=info msg="Dropping payload from payload queue because the payload queue is too large"

Additional Information

I double-checked and confirmed that the L1 RPC URL I provided in .env is working properly.
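For context, a minimal sketch of the workflow described above, assuming the simple-optimism-node layout and that the L1 endpoint is set via an OP_NODE__RPC_ENDPOINT variable in .env (the variable name, URL, and exact restart command are assumptions; the command actually used was not captured in this post):

```bash
# Hypothetical reconstruction of the steps above (variable name, URL, and flags are assumptions).
cd ~/simple-optimism-node

# 1. Point the node at the new L1 RPC provider.
sed -i 's|^OP_NODE__RPC_ENDPOINT=.*|OP_NODE__RPC_ENDPOINT=https://new-l1-rpc.example.com|' .env

# 2. Restart the stack. Recreating containers (e.g. with --force-recreate) stops
#    op-geth abruptly, which is the risk identified later in this thread.
docker compose up -d --build --force-recreate
```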
-
I think the database was corrupted during node shutdown. Do you see any log lines in geth that indicate this?
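As an illustration (not part of the original reply), one way to scan for suspicious lines around the shutdown, assuming the compose service is named op-geth as the log prefixes above suggest:

```bash
# Hypothetical check: surface recent ERROR/WARN lines from the op-geth service.
docker compose logs --tail=2000 op-geth | grep -E "ERROR|WARN"
```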
-
So, if the database is corrupted, the only way to recover is to resync from scratch, which shouldn't take more than one day with snap sync.
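A rough sketch of what a resync from scratch could look like with this docker compose setup; the volume name below is a placeholder (check docker volume ls for the real one), and removing it permanently deletes the existing chain data:

```bash
# Hypothetical resync-from-scratch procedure (volume name is a placeholder).
docker compose down                  # stop op-node / op-geth cleanly
docker volume ls | grep geth         # locate the op-geth data volume
docker volume rm <geth-data-volume>  # delete the corrupted state database
docker compose up -d --build         # start again; snap sync rebuilds the state
```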
-
Thank you for the response @Chomtana. Unfortunately, I did not see those lines in the geth logs. For now I will do a resync. For reference, I restarted my node with docker compose up using the `--build` and `--force-recreate` flags.
-
It appears that `--force-recreate` performs a forced shutdown instead of a soft restart, which risks database corruption. I have tested that using only `--build` is sufficient, as it will softly restart the containers only when there is an upgrade (i.e., when the image or configuration has changed). Thank you. I will need to update the README documentation accordingly.
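For reference, a sketch of the two restart invocations being compared, assuming the stack is normally started with docker compose up -d as in the guide:

```bash
# Recreates every container even if nothing changed; op-geth is stopped abruptly,
# which can leave its state database corrupted (the failure described above):
docker compose up -d --build --force-recreate

# Rebuilds images and only recreates containers whose image or configuration
# actually changed, the gentler restart suggested in this reply:
docker compose up -d --build
```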