fatal error: concurrent map read and map write #22

Closed
3 tasks done
Tracked by #1016
czarcas7ic opened this issue Feb 21, 2022 · 5 comments
Labels
bug Something isn't working ⚙️ task

Comments


czarcas7ic commented Feb 21, 2022

Background

When creating a block explorer on an archive node (pruning nothing) with Big Dipper v2 (using bdjuno), the following error occurs at seemingly random intervals:

https://gist.github.com/rkben/8814376162cda8c08dd33d60f6933429#file-cleand_osmo-log-L256

I instructed mp20 to remove the snapshot interval, in case the issue is caused by bdjuno reading the db files while a snapshot is being created.

Acceptance Criteria

  • Investigate the problem
  • Create a PR with a potential fix
  • Create a test tag and distribute it to the user who reported the problem
@p0mvn p0mvn moved this to 🔍 Needs Review in Osmosis Chain Development Feb 21, 2022
@p0mvn p0mvn added the bug Something isn't working label Feb 21, 2022
@p0mvn p0mvn self-assigned this Feb 21, 2022

p0mvn commented Feb 21, 2022

@czarcas7ic thanks for creating it

From the initial investigation, there seems to be a concurrency issue with accessing a common resource inside nodedb.

https://gist.github.com/rkben/8814376162cda8c08dd33d60f6933429#file-cleand_osmo-log-L1257 - this goroutine seems to be committing a block
https://gist.github.com/rkben/8814376162cda8c08dd33d60f6933429#file-cleand_osmo-log-L268 - this goroutine seems to be handling the grpc query

They both access the same resource at the same time, which triggers the fatal error. Investigating further...
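For readers unfamiliar with this class of error: the Go runtime aborts the process with `fatal error: concurrent map read and map write` when it detects one goroutine reading a plain map while another writes to it. The sketch below shows one common way to guard such a map with a `sync.RWMutex`. It is only an illustration: `nodeCache` and its methods are hypothetical names, and this is not the actual nodedb code or the fix that was eventually released.

```go
package main

import (
	"fmt"
	"sync"
)

// nodeCache is a hypothetical stand-in for a map shared between a
// goroutine that commits blocks (writer) and goroutines that serve
// queries (readers). It is not the real nodedb structure.
type nodeCache struct {
	mu    sync.RWMutex
	nodes map[string][]byte
}

func newNodeCache() *nodeCache {
	return &nodeCache{nodes: make(map[string][]byte)}
}

// Get takes the read lock, so many readers can proceed in parallel.
func (c *nodeCache) Get(key string) ([]byte, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	v, ok := c.nodes[key]
	return v, ok
}

// Set takes the write lock, which excludes all readers and writers.
func (c *nodeCache) Set(key string, value []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.nodes[key] = value
}

// Len reports the number of stored entries under the read lock.
func (c *nodeCache) Len() int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return len(c.nodes)
}

func main() {
	cache := newNodeCache()
	var wg sync.WaitGroup

	// One writer, standing in for the goroutine that commits a block.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			cache.Set(fmt.Sprintf("node-%d", i), []byte{byte(i)})
		}
	}()

	// Several readers, standing in for concurrent query handlers.
	for r := 0; r < 4; r++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 1000; i++ {
				cache.Get(fmt.Sprintf("node-%d", i))
			}
		}()
	}

	wg.Wait()
	fmt.Println("entries:", cache.Len())
}
```

Without the lock calls, running a program like this can crash with the same fatal error, and `go run -race` would report the data race. Note that Go mutexes are not reentrant, so a method that already holds the lock must not call another method that acquires it again.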


p0mvn commented Feb 27, 2022

The fix should be merged now, so I am closing this.

@p0mvn p0mvn closed this as completed Feb 27, 2022

p0mvn commented Mar 1, 2022

The solution deadlocked during testing, reopening

@p0mvn p0mvn reopened this Mar 1, 2022
Repository owner moved this from ✅ Done to 🏃 In Progress in Osmosis Chain Development Mar 1, 2022

p0mvn commented Mar 10, 2022

The solution is now released under 7.0.4. Waiting for feedback from validators before closing this.

@p0mvn p0mvn moved this from 🏃 In Progress to ✅ Done in Osmosis Chain Development Mar 11, 2022

p0mvn commented Mar 11, 2022

The fix is working even with the concurrent gateway, closing this issue for now

@p0mvn p0mvn closed this as completed Mar 11, 2022