Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Problem upgrading from 1.6.2 to 1.7.1 #7395

Closed
jorgemarey opened this issue Mar 5, 2020 · 4 comments · Fixed by #7398
Closed

Problem upgrading from 1.6.2 to 1.7.1 #7395

jorgemarey opened this issue Mar 5, 2020 · 4 comments · Fixed by #7398
Assignees
Milestone

Comments

@jorgemarey
Copy link
Contributor

jorgemarey commented Mar 5, 2020

Overview of the Issue

We have a cluster of consul with the 1.6.2 version. We tried to upgrade to 1.7.1 but the following happen after upgrading the servers (we use consul-esm and vault in the cluster):

[WARN]  agent.server.kvs: Rejecting lock of key due to lock-delay: key=vault/core/lock expire_time="2020-03-05 08:39:24.131778589 +0000 UTC m=+2650.631195787"
[WARN]  agent.server.kvs: Rejecting lock of key due to lock-delay: key=vault/core/lock expire_time="2020-03-05 08:39:24.131778589 +0000 UTC m=+2650.631195787"
[WARN]  agent.server.kvs: Rejecting lock of key due to lock-delay: key=consul-esm/leader expire_time="2020-03-05 08:39:18.470971083 +0000 UTC m=+2644.970388250"
[WARN]  agent.server.kvs: Rejecting lock of key due to lock-delay: key=consul-esm/leader expire_time="2020-03-05 08:39:18.470971083 +0000 UTC m=+2644.970388250"

Seems like they couldn't perform the lock.

Also, in the nodes we see the following:

consul: 2020/03/05 09:20:48 [ERR] consul: "Session.Renew" RPC failed to server 10.172.0.6:8300: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters
consul: 2020/03/05 09:20:48 [ERR] http: Request PUT /v1/session/renew/5f4ec6ec-11bd-0ce0-7769-27872ffc172c?token=<hidden>, error: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters from=10.171.0.18:58368
consul: 2020/03/05 09:20:48 [ERR] consul: "Session.Renew" RPC failed to server 10.172.0.21:8300: rpc error making call: failed session lookup: index error: UUID must be 36 characters
consul: 2020/03/05 09:20:48 [ERR] http: Request PUT /v1/session/renew/5f4ec6ec-11bd-0ce0-7769-27872ffc172c?token=<hidden>, error: rpc error making call: failed session lookup: index error: UUID must be 36 characters from=10.171.0.18:58368
consul: 2020/03/05 09:20:48 [ERR] consul: "Session.Renew" RPC failed to server 10.172.0.2:8300: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters
consul: 2020/03/05 09:20:48 [ERR] http: Request PUT /v1/session/renew/5f4ec6ec-11bd-0ce0-7769-27872ffc172c?token=<hidden>, error: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters from=10.171.0.18:58368
consul: 2020/03/05 09:20:48 [ERR] consul: "Session.Renew" RPC failed to server 10.172.0.6:8300: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters
consul: 2020/03/05 09:20:48 [ERR] http: Request PUT /v1/session/renew/5f4ec6ec-11bd-0ce0-7769-27872ffc172c?token=<hidden>, error: rpc error making call: rpc error making call: failed session lookup: index error: UUID must be 36 characters from=10.171.0.18:58368

After upgrading the node to 1.7.1 the problems seems to go away.

I tested this in a development environment, but when we release this to producción where we have hundreds of nodes we can't have this failling until we upgrade all nodes.

Any thoughts?

Thanks!

@jorgemarey jorgemarey changed the title Problem upgrading from 1.6.1 to 1.7.1 Problem upgrading from 1.6.2 to 1.7.1 Mar 5, 2020
@hanshasselberg hanshasselberg self-assigned this Mar 5, 2020
@jorgemarey
Copy link
Contributor Author

Hi @i0rek just fount out what happened.

In the commit 7b471f6 a field of the struct SessionSpecificRequest was renamed so when talking between nodes in differents versions this fails. I fixed it by doing the following:

diff --git a/agent/consul/session_endpoint.go b/agent/consul/session_endpoint.go
index 05e2b5c43..debe2eae8 100644
--- a/agent/consul/session_endpoint.go
+++ b/agent/consul/session_endpoint.go
@@ -275,6 +275,10 @@ func (s *Session) Renew(args *structs.SessionSpecificRequest,
                return err
        }
 
+       if args.SessionID == "" {
+               args.SessionID = args.Session
+       }
+
        // Get the session, from local state.
        state := s.srv.fsm.State()
        index, session, err := state.SessionGet(nil, args.SessionID, &args.EnterpriseMeta)
diff --git a/agent/session_endpoint.go b/agent/session_endpoint.go
index 46a512f48..5a07b66aa 100644
--- a/agent/session_endpoint.go
+++ b/agent/session_endpoint.go
@@ -98,6 +98,7 @@ func (s *HTTPServer) SessionRenew(resp http.ResponseWriter, req *http.Request) (
 
        // Pull out the session id
        args.SessionID = strings.TrimPrefix(req.URL.Path, "/v1/session/renew/")
+       args.Session = args.SessionID
        if args.SessionID == "" {
                resp.WriteHeader(http.StatusBadRequest)
                fmt.Fprint(resp, "Missing session")
diff --git a/agent/structs/structs.go b/agent/structs/structs.go
index b386cc4b6..a2b241fff 100644
--- a/agent/structs/structs.go
+++ b/agent/structs/structs.go
@@ -2036,6 +2036,8 @@ func (r *SessionRequest) RequestDatacenter() string {
 type SessionSpecificRequest struct {
        Datacenter string
        SessionID  string
+       // Deprecated v1.7.0.
+       Session string
        EnterpriseMeta
        QueryOptions
 }

@mkeeler
Copy link
Member

mkeeler commented Mar 5, 2020

@jorgemarey I am looking into this. The code you linked does look correct but we must have missed another place where the deprecated field is used. The intention with that commit was to fixup the RPC struct when it makes it to the server. So that users don't run into this exact issue.

@mkeeler mkeeler added this to the 1.7.x milestone Mar 5, 2020
@mkeeler
Copy link
Member

mkeeler commented Mar 5, 2020

@jorgemarey I totally misread your post before. I thought you were linking to a commit with those changes rather than showing the necessary changes to fix the issue.

Look for this to be fixed in v1.7.2

@jorgemarey
Copy link
Contributor Author

Great. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants