Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor logsFilter to prevent concurrent map fatal errors #10672

Merged
merged 2 commits into from
Jun 11, 2024

Conversation

bretep
Copy link
Contributor

@bretep bretep commented Jun 8, 2024

Issue:

At line 129 in logsfilter.go, we had the following line of code:

_, addrOk := filter.addrs[gointerfaces.ConvertH160toAddress(eventLog.Address)]

This line caused a panic due to a fatal error:

fatal error: concurrent map read and map write

goroutine 106 [running]:
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog.func1({0xc009701db8?, 0x8?}, 0xc135d26050)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:129 +0xe7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*SyncMap[...]).Range(0xc009701eb0?, 0xc009701e70?)
github.com/ledgerwatch/erigon/turbo/rpchelper/subscription.go:97 +0x11a
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog(0x25f4600?, 0xc0000ce090?)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:131 +0xc7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*Filters).OnNewLogs(...)
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:547
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices.(*RemoteBackend).SubscribeLogs(0xc0019c2f50, {0x32f0040, 0xc001b4a280}, 0xc001c0c0e0, 0x0?)
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices/eth_backend.go:227 +0x1d1
github.com/ledgerwatch/erigon/turbo/rpchelper.New.func2()
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:102 +0xec
created by github.com/ledgerwatch/erigon/turbo/rpchelper.New
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:92 +0x652

This error indicates that there were simultaneous read and write operations on the filter.addrs map, leading to a race condition.

Solution:

To resolve this issue, I implemented the following changes:

  • Moved SyncMap to erigon-lib common library: This allows us to utilize a thread-safe map across different packages that require synchronized map access.
  • Refactored logsFilter to use SyncMap: By replacing the standard map with SyncMap, we ensured that all map operations are thread-safe, thus preventing concurrent read and write errors.
  • Added documentation for SyncMap usage: Detailed documentation was provided to guide the usage of SyncMap and related refactored components, ensuring clarity and proper utilization.

@bretep
Copy link
Contributor Author

bretep commented Jun 9, 2024

Looks like the tests failed. I'll take a look and see what needs to be fixed, either tests or code.

--- FAIL: TestFilters_SubscribeLogsGeneratesCorrectLogFilterRequest (0.00s)
    filters_test.go:328: 5: expected addresses to be empty
    filters_test.go:344: 5: expected addresses to be empty
    filters_test.go:347: 5: expected topics to be empty
FAIL
FAIL	github.com/ledgerwatch/erigon/turbo/rpchelper	0.089s

map fatal errors

- Moved SyncMap to erigon-lib common library for use across packages
  requiring thread-safe maps.
- Refactored logsFilter to use SyncMap, addressing and fixing fatal
  errors related to concurrent map read and map write.
- Added documentation for SyncMap usage and related refactored
  components.
@bretep bretep force-pushed the refactor-logsFilter-fix-panic branch from 4ff7d59 to 7d550c1 Compare June 10, 2024 05:43
@bretep
Copy link
Contributor Author

bretep commented Jun 10, 2024

@AskAlexSharov

  • Updated and enhanced tests.
  • Fixed deadlocks by temporarily replacing "sync" with "github.com/sasha-s/go-deadlock".
    Noting here the tool I used for reference only. It's not in the codebase.
  • Resolved an issue with SyncMap not properly removing items.
  • Improved code readability and added additional documentation.

@bretep bretep force-pushed the refactor-logsFilter-fix-panic branch from 7d550c1 to 1c8bb90 Compare June 10, 2024 05:54
@AskAlexSharov AskAlexSharov merged commit 2f2ce6a into erigontech:main Jun 11, 2024
9 checks passed
@bretep bretep deleted the refactor-logsFilter-fix-panic branch June 11, 2024 01:59
taratorio pushed a commit that referenced this pull request Sep 5, 2024
#### Issue:
At line 129 in `logsfilter.go`, we had the following line of code:
```go
_, addrOk := filter.addrs[gointerfaces.ConvertH160toAddress(eventLog.Address)]
```


This line caused a panic due to a fatal error:
```logs
fatal error: concurrent map read and map write

goroutine 106 [running]:
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog.func1({0xc009701db8?, 0x8?}, 0xc135d26050)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:129 +0xe7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*SyncMap[...]).Range(0xc009701eb0?, 0xc009701e70?)
github.com/ledgerwatch/erigon/turbo/rpchelper/subscription.go:97 +0x11a
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog(0x25f4600?, 0xc0000ce090?)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:131 +0xc7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*Filters).OnNewLogs(...)
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:547
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices.(*RemoteBackend).SubscribeLogs(0xc0019c2f50, {0x32f0040, 0xc001b4a280}, 0xc001c0c0e0, 0x0?)
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices/eth_backend.go:227 +0x1d1
github.com/ledgerwatch/erigon/turbo/rpchelper.New.func2()
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:102 +0xec
created by github.com/ledgerwatch/erigon/turbo/rpchelper.New
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:92 +0x652
```

This error indicates that there were simultaneous read and write
operations on the `filter.addrs` map, leading to a race condition.


#### Solution:
To resolve this issue, I implemented the following changes:

- Moved SyncMap to erigon-lib common library: This allows us to utilize
a thread-safe map across different packages that require synchronized
map access.
- Refactored logsFilter to use SyncMap: By replacing the standard map
with SyncMap, we ensured that all map operations are thread-safe, thus
preventing concurrent read and write errors.
- Added documentation for SyncMap usage: Detailed documentation was
provided to guide the usage of SyncMap and related refactored
components, ensuring clarity and proper utilization.
@taratorio taratorio added this to the v2.60.7-fixes milestone Sep 5, 2024
taratorio pushed a commit that referenced this pull request Sep 5, 2024
#### Issue:
At line 129 in `logsfilter.go`, we had the following line of code:
```go
_, addrOk := filter.addrs[gointerfaces.ConvertH160toAddress(eventLog.Address)]
```


This line caused a panic due to a fatal error:
```logs
fatal error: concurrent map read and map write

goroutine 106 [running]:
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog.func1({0xc009701db8?, 0x8?}, 0xc135d26050)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:129 +0xe7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*SyncMap[...]).Range(0xc009701eb0?, 0xc009701e70?)
github.com/ledgerwatch/erigon/turbo/rpchelper/subscription.go:97 +0x11a
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog(0x25f4600?, 0xc0000ce090?)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:131 +0xc7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*Filters).OnNewLogs(...)
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:547
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices.(*RemoteBackend).SubscribeLogs(0xc0019c2f50, {0x32f0040, 0xc001b4a280}, 0xc001c0c0e0, 0x0?)
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices/eth_backend.go:227 +0x1d1
github.com/ledgerwatch/erigon/turbo/rpchelper.New.func2()
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:102 +0xec
created by github.com/ledgerwatch/erigon/turbo/rpchelper.New
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:92 +0x652
```

This error indicates that there were simultaneous read and write
operations on the `filter.addrs` map, leading to a race condition.


#### Solution:
To resolve this issue, I implemented the following changes:

- Moved SyncMap to erigon-lib common library: This allows us to utilize
a thread-safe map across different packages that require synchronized
map access.
- Refactored logsFilter to use SyncMap: By replacing the standard map
with SyncMap, we ensured that all map operations are thread-safe, thus
preventing concurrent read and write errors.
- Added documentation for SyncMap usage: Detailed documentation was
provided to guide the usage of SyncMap and related refactored
components, ensuring clarity and proper utilization.
taratorio added a commit that referenced this pull request Sep 5, 2024
…11892)

relates to #11890
cherry pick from E3
[2f2ce6a](2f2ce6a)

-----

#### Issue:
At line 129 in `logsfilter.go`, we had the following line of code:
```go
_, addrOk := filter.addrs[gointerfaces.ConvertH160toAddress(eventLog.Address)]
```


This line caused a panic due to a fatal error:
```logs
fatal error: concurrent map read and map write

goroutine 106 [running]:
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog.func1({0xc009701db8?, 0x8?}, 0xc135d26050)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:129 +0xe7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*SyncMap[...]).Range(0xc009701eb0?, 0xc009701e70?)
github.com/ledgerwatch/erigon/turbo/rpchelper/subscription.go:97 +0x11a
github.com/ledgerwatch/erigon/turbo/rpchelper.(*LogsFilterAggregator).distributeLog(0x25f4600?, 0xc0000ce090?)
github.com/ledgerwatch/erigon/turbo/rpchelper/logsfilter.go:131 +0xc7
github.com/ledgerwatch/erigon/turbo/rpchelper.(*Filters).OnNewLogs(...)
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:547
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices.(*RemoteBackend).SubscribeLogs(0xc0019c2f50, {0x32f0040, 0xc001b4a280}, 0xc001c0c0e0, 0x0?)
github.com/ledgerwatch/erigon/cmd/rpcdaemon/rpcservices/eth_backend.go:227 +0x1d1
github.com/ledgerwatch/erigon/turbo/rpchelper.New.func2()
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:102 +0xec
created by github.com/ledgerwatch/erigon/turbo/rpchelper.New
github.com/ledgerwatch/erigon/turbo/rpchelper/filters.go:92 +0x652
```

This error indicates that there were simultaneous read and write
operations on the `filter.addrs` map, leading to a race condition.


#### Solution:
To resolve this issue, I implemented the following changes:

- Moved SyncMap to erigon-lib common library: This allows us to utilize
a thread-safe map across different packages that require synchronized
map access.
- Refactored logsFilter to use SyncMap: By replacing the standard map
with SyncMap, we ensured that all map operations are thread-safe, thus
preventing concurrent read and write errors.
- Added documentation for SyncMap usage: Detailed documentation was
provided to guide the usage of SyncMap and related refactored
components, ensuring clarity and proper utilization.

Co-authored-by: Bret <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants