Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[3.5] server/auth: protect rangePermCache with a RW lock #14227

Merged
merged 1 commit into from
Jul 20, 2022

Conversation

mitake
Copy link
Contributor

@mitake mitake commented Jul 17, 2022

This PR backports #13954 to release-3.5.

Updating CHANGELOG-3.5.md in #14228

cc @ahrtr @ptabor

@@ -118,21 +121,42 @@ func (as *authStore) isRangeOpPermitted(tx backend.ReadTx, userName string, key,
return false
}
as.rangePermCache[userName] = perms
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems it follows different logic as the PR on main branch. The PR (13954) on main branch doesn't update the rangePermCache at all in isRangeOpPermitted

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for checking, I missed it... updated this logic in the latest version of this PR, could you check again?

@mitake mitake force-pushed the perm-cache-lock-3.5 branch from 380a5d6 to cb4cb90 Compare July 18, 2022 13:33
@ahrtr
Copy link
Member

ahrtr commented Jul 18, 2022

Overall looks good to me, the only minor comment is it would be better to use UnsafeForEach to implement getAllUsers, in the same way as https://github.com/etcd-io/etcd/pull/13954/files#diff-1cfe6f8b848882d840f3edd250afaf8ece52a0410e69391ff0c0593f1e43a48a .

@mitake
Copy link
Contributor Author

mitake commented Jul 19, 2022

@ahrtr ah thanks for pointing out, I'll update getAllUsers() to use UnsafeForEach().

@mitake mitake force-pushed the perm-cache-lock-3.5 branch from cb4cb90 to e15c005 Compare July 19, 2022 06:56
@mitake
Copy link
Contributor Author

mitake commented Jul 19, 2022

@ahrtr Updated getAllUsers(), could you check?

Copy link
Member

@ahrtr ahrtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thank you @mitake

@serathius serathius merged commit 9d7e108 into etcd-io:release-3.5 Jul 20, 2022
@vivekpatani
Copy link
Contributor

vivekpatani commented Aug 11, 2022

Not sure if this is the same problem (let me know if not).

We're running 9d7e108 and still seeing the follow errors when going from 3.4.18 -> 3.5.4. Thanks.

{"level":"error","ts":"2022-08-11T20:41:15.546Z","caller":"auth/range_perm_cache.go:114","msg":"user doesn't exist","user-name":"xyz","stacktrace":"go.etcd.io/etcd/server/v3/auth.(*authStore).isRangeOpPermitted
/workspace/server/auth/range_perm_cache.go:114
go.etcd.io/etcd/server/v3/auth.(*authStore).isOpPermitted
/workspace/server/auth/store.go:870
go.etcd.io/etcd/server/v3/auth.(*authStore).IsRangePermitted
/workspace/server/auth/store.go:882

@mitake
Copy link
Contributor Author

mitake commented Aug 15, 2022

@vivekpatani Could you share a way to reproduce? Also, does the issue happen deterministically on your env? This PR is related to a very rare non deterministic bug. If you can easily reproduce, I think it's a different problem.

@vivekpatani
Copy link
Contributor

vivekpatani commented Aug 16, 2022

@mitake @ahrtr here's a simple way to reproduce the issue, I think it's related to watches

Setup
> git clone https://github.com/etcd-io/etcd.git
> cd etcd
> git checkout release-3.4

Docker etcd server
> docker run --platform linux/amd64 -P -it --rm -v /path/to/etcd:/go/src/go.etcd.io/etcd docker.io/golang:1.16.15
> cd src/go.etcd.io/etcd
> ./build
> ./bin/etcd --logger zap --log-level debug

Docker etcd client
> docker ps -a
> docker exec -it <container_id> bash ## container ID of the server
> cd src/go.etcd.io/etcd
> Follow instructions given here: https://etcd.io/docs/v3.5/op-guide/authentication/authentication/
> password: badboi (just for reference)

Shutdown the Docker etcd server by pressing Ctrl + C and typing exit and return

Now, switch the branch to release-3.5
> git checkout release-3.5

Docker etcd server #2
> docker run --platform linux/amd64 -P -it --rm -v /path/to/etcd:/go/src/go.etcd.io/etcd docker.io/golang:1.16.15
> ./build.sh
> ./bin/etcd --log-level debug

Docker etcd client #2
> docker ps -a
> docker exec -it <container_id> bash ## container ID of the server
> cd src/go.etcd.io/etcd
> ./bin/etcdctl watch foo --user=user0:badboi

> Client side log
watch was canceled (rpc error: code = PermissionDenied desc = etcdserver: permission denied)
Error: watch is canceled by the server

> Server side log
.
.
{"level":"error","ts":"2022-08-16T04:08:30.961Z","caller":"auth/range_perm_cache.go:114","msg":"user doesn't exist","user-name":"user0","stacktrace":"go.etcd.io/etcd/server/v3/auth.(*authStore).isRangeOpPermitted\n\t/go/src/go.etcd.io/etcd/server/auth/range_perm_cache.go:114\ngo.etcd.io/etcd/server/v3/auth.(*authStore).isOpPermitted\n\t/go/src/go.etcd.io/etcd/server/auth/store.go:870\ngo.etcd.io/etcd/server/v3/auth.(*authStore).IsRangePermitted\n\t/go/src/go.etcd.io/etcd/server/auth/store.go:882\ngo.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*serverWatchStream).isWatchPermitted\n\t/go/src/go.etcd.io/etcd/server/etcdserver/api/v3rpc/watch.go:235\ngo.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*serverWatchStream).recvLoop\n\t/go/src/go.etcd.io/etcd/server/etcdserver/api/v3rpc/watch.go:269\ngo.etcd.io/etcd/server/v3/etcdserver/api/v3rpc.(*watchServer).Watch.func2\n\t/go/src/go.etcd.io/etcd/server/etcdserver/api/v3rpc/watch.go:189"}
{"level":"debug","ts":"2022-08-16T04:08:30.967Z","caller":"v3rpc/watch.go:191","msg":"failed to receive watch request from gRPC stream","error":"rpc error: code = Canceled desc = context canceled"}
.
.

Please let me know if you need more data/help.

@ahrtr ahrtr changed the title server/auth: protect rangePermCache with a RW lock [3.5] server/auth: protect rangePermCache with a RW lock Oct 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants