Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Data race in cluster setup using Olric v0.5.5 #247

Closed
zhp007 opened this issue May 2, 2024 · 1 comment
Closed

Data race in cluster setup using Olric v0.5.5 #247

zhp007 opened this issue May 2, 2024 · 1 comment
Assignees
Labels
bug Something isn't working

Comments

@zhp007
Copy link

zhp007 commented May 2, 2024

In the olric_test.go, add the following sample test:

func TestOlricCluster(t *testing.T) {
	cluster := newTestOlricCluster(t)
	cluster.addMember(t)
	cluster.addMember(t)
	cluster.addMember(t)
	cluster.addMember(t)
	cluster.addMember(t)
	require.Len(t, cluster.members, 5)
}

Then run the command:

for i in {1..10}; do go test -count=1 -race -run 'TestOlricCluster' ; done

Some rounds will fail with the following error message:

WARNING: DATA RACE
Write at 0x00c0001f0348 by goroutine 253:
  github.com/buraksezer/olric/internal/cluster/routingtable.(*RoutingTable).Start()
      /Users/zhengrenpan/open-source/olric/internal/cluster/routingtable/routingtable.go:370 +0x3d0
  github.com/buraksezer/olric.(*Olric).Start()
      /Users/zhengrenpan/open-source/olric/olric.go:345 +0x32c
  github.com/buraksezer/olric.newTestOlricWithConfig.func2()
      /Users/zhengrenpan/open-source/olric/olric_test.go:61 +0x2c

Previous read at 0x00c0001f0348 by goroutine 278:
  github.com/buraksezer/olric/internal/cluster/routingtable.(*RoutingTable).setOwnedPartitionCount()
      /Users/zhengrenpan/open-source/olric/internal/cluster/routingtable/routingtable.go:159 +0xcc
  github.com/buraksezer/olric/internal/cluster/routingtable.(*RoutingTable).updateRoutingCommandHandler()
      /Users/zhengrenpan/open-source/olric/internal/cluster/routingtable/operations.go:107 +0x608
  github.com/buraksezer/olric/internal/cluster/routingtable.(*RoutingTable).updateRoutingCommandHandler-fm()
      <autogenerated>:1 +0x90
  github.com/buraksezer/olric/internal/server.Handler.ServeRESP()
      /Users/zhengrenpan/open-source/olric/internal/server/handler.go:57 +0x3ec
  github.com/buraksezer/olric/internal/server.(*Handler).ServeRESP()
      <autogenerated>:1 +0xa4
  github.com/buraksezer/olric/internal/server.(*ServeMux).ServeRESP()
      /Users/zhengrenpan/open-source/olric/internal/server/mux.go:73 +0x3ec
  github.com/buraksezer/olric/internal/server.(*ServeMux).ServeRESP-fm()
      <autogenerated>:1 +0x90
  github.com/tidwall/redcon.handle.func2()
      /Users/zhengrenpan/go/pkg/mod/github.com/tidwall/[email protected]/redcon.go:437 +0x2b8
  github.com/tidwall/redcon.handle()
      /Users/zhengrenpan/go/pkg/mod/github.com/tidwall/[email protected]/redcon.go:450 +0x9c
  github.com/tidwall/redcon.serve.func2()
      /Users/zhengrenpan/go/pkg/mod/github.com/tidwall/[email protected]/redcon.go:386 +0x40

Goroutine 253 (running) created at:
  github.com/buraksezer/olric.newTestOlricWithConfig()
      /Users/zhengrenpan/open-source/olric/olric_test.go:60 +0x2d4
  github.com/buraksezer/olric.(*testOlricCluster).addMemberWithConfig()
      /Users/zhengrenpan/open-source/olric/olric_test.go:108 +0x298
  github.com/buraksezer/olric.(*testOlricCluster).addMember()
      /Users/zhengrenpan/open-source/olric/olric_test.go:115 +0x78
  github.com/buraksezer/olric.TestOlricCluster()
      /Users/zhengrenpan/open-source/olric/olric_test.go:146 +0x68
  testing.tRunner()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1595 +0x1b0
  testing.(*T).Run.func1()
      /opt/homebrew/Cellar/go/1.21.6/libexec/src/testing/testing.go:1648 +0x40

Goroutine 278 (running) created at:
  github.com/tidwall/redcon.serve()
      /Users/zhengrenpan/go/pkg/mod/github.com/tidwall/[email protected]/redcon.go:386 +0x688
  github.com/tidwall/redcon.(*Server).Serve()
      /Users/zhengrenpan/go/pkg/mod/github.com/tidwall/[email protected]/redcon.go:317 +0x158
  github.com/buraksezer/olric/internal/server.(*Server).ListenAndServe()
      /Users/zhengrenpan/open-source/olric/internal/server/server.go:189 +0x598
  github.com/buraksezer/olric.(*Olric).Start.func1()
      /Users/zhengrenpan/open-source/olric/olric.go:324 +0x40
  golang.org/x/sync/errgroup.(*Group).Go.func1()
      /Users/zhengrenpan/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x7c
@buraksezer buraksezer self-assigned this May 2, 2024
@buraksezer buraksezer added the bug Something isn't working label May 2, 2024
buraksezer added a commit that referenced this issue May 10, 2024
The code has been updated to include a join process before starting the routing table in internal/cluster/routingtable/ and internal/testcluster/testcluster.go. This ensures that the command handlers of the routing table service wait for the cluster join event. Additionally, any join process failure in RoutingTable is now handled properly. The changes also include the introduction of 'ErrNotJoinedYet' error in the discovery.go file to handle 'not joined yet' situations.
buraksezer added a commit that referenced this issue May 10, 2024
The code has been updated to include a join process before starting the routing table in internal/cluster/routingtable/ and internal/testcluster/testcluster.go. This ensures that the command handlers of the routing table service wait for the cluster join event. Additionally, any join process failure in RoutingTable is now handled properly. The changes also include the introduction of 'ErrNotJoinedYet' error in the discovery.go file to handle 'not joined yet' situations.
@buraksezer
Copy link
Owner

Hey @zhp007,

Thank you for reporting this. The problem has been fixed, and Olric v0.5.6 has been released. https://github.com/buraksezer/olric/releases/tag/v0.5.6

The current solution fixes the problem, but we need more complicated refactoring to improve the code quality. This can be done in v0.6.0.

buraksezer added a commit that referenced this issue May 10, 2024
The code has been updated to include a join process before starting the routing table in internal/cluster/routingtable/ and internal/testcluster/testcluster.go. This ensures that the command handlers of the routing table service wait for the cluster join event. Additionally, any join process failure in RoutingTable is now handled properly. The changes also include the introduction of 'ErrNotJoinedYet' error in the discovery.go file to handle 'not joined yet' situations.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

No branches or pull requests

2 participants