-
Notifications
You must be signed in to change notification settings - Fork 2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make server side error messages from vault more clearer #3968
Conversation
@dadgar @schmichael open to better ways to make this even nicer. Even now there are multiple wrapping errors in the description in recent events, but the first thing in the message is that its a server error so that should make it clear that it needs operator intervention on nomad servers to fix. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to fix the IsRecoverable issue but otherwise looks good!
nomad/structs/structs.go
Outdated
|
||
return &RecoverableError{ | ||
Err: e.Error(), | ||
ServerSideErr: false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this be true?
nomad/structs/structs.go
Outdated
} | ||
|
||
// IsRecoverable returns true if error is a RecoverableError with | ||
// Recoverable=true. Otherwise false is returned. | ||
func IsRecoverable(e error) bool { | ||
if re, ok := e.(Recoverable); ok { | ||
if re, ok := e.(ErrorMeta); ok { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getter.GetError also implements IsRecoverable
so we shouldn't change this assertion. Maybe keep the Recoverable interface
as is and add:
type ErrorMeta interface {
Recoverable
IsServerSideError
}
...or we could implement IsServerSideError
on getter.GetError
... the meaning there seems a bit ambiguous though. Would a 400 fetching an artifact be a server-side error? I think no because Server
refers to a Nomad Server, but we'd still want to document the interface's exact behavior thoroughly.
nomad/vault.go
Outdated
@@ -429,7 +429,7 @@ OUTER: | |||
retryTimer.Reset(v.config.ConnectionRetryIntv) | |||
v.l.Lock() | |||
v.connEstablished = true | |||
v.connEstablishedErr = fmt.Errorf("Nomad Server failed to establish connections to Vault: %v", err) | |||
v.connEstablishedErr = structs.NewServerSideError(fmt.Errorf("Nomad Server failed to establish connections to Vault: %v", err)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be really nice if Vault didn't reformat API errors into one giant string in the api
package :(
nomad/vendor/github.com/hashicorp/vault/api/response.go
Lines 27 to 67 in 89845b3
func (r *Response) Error() error { | |
// 200 to 399 are okay status codes. 429 is the code for health status of | |
// standby nodes. | |
if (r.StatusCode >= 200 && r.StatusCode < 400) || r.StatusCode == 429 { | |
return nil | |
} | |
// We have an error. Let's copy the body into our own buffer first, | |
// so that if we can't decode JSON, we can at least copy it raw. | |
var bodyBuf bytes.Buffer | |
if _, err := io.Copy(&bodyBuf, r.Body); err != nil { | |
return err | |
} | |
// Decode the error response if we can. Note that we wrap the bodyBuf | |
// in a bytes.Reader here so that the JSON decoder doesn't move the | |
// read pointer for the original buffer. | |
var resp ErrorResponse | |
if err := jsonutil.DecodeJSON(bodyBuf.Bytes(), &resp); err != nil { | |
// Ignore the decoding error and just drop the raw response | |
return fmt.Errorf( | |
"Error making API request.\n\n"+ | |
"URL: %s %s\n"+ | |
"Code: %d. Raw Message:\n\n%s", | |
r.Request.Method, r.Request.URL.String(), | |
r.StatusCode, bodyBuf.String()) | |
} | |
var errBody bytes.Buffer | |
errBody.WriteString(fmt.Sprintf( | |
"Error making API request.\n\n"+ | |
"URL: %s %s\n"+ | |
"Code: %d. Errors:\n\n", | |
r.Request.Method, r.Request.URL.String(), | |
r.StatusCode)) | |
for _, err := range resp.Errors { | |
errBody.WriteString(fmt.Sprintf("* %s", err)) | |
} | |
return fmt.Errorf(errBody.String()) | |
} |
If that just returned fmt.Errorf(bodyBuf.String())
our error messages would be much nicer! No embedded newlines and useless HTTP metadata (url+status).
Ideally Vault would return a custom error type that allowed us to pluck out the bits of metadata we cared about.
No need to block this PR on that, but do you think we should open an issue with them?
37172c7
to
71e4061
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving with one small change
nomad/node_endpoint.go
Outdated
@@ -1143,7 +1143,14 @@ func (n *Node) DeriveVaultToken(args *structs.DeriveVaultTokenRequest, | |||
if e == nil { | |||
return | |||
} | |||
reply.Error = structs.NewRecoverableError(e, recoverable).(*structs.RecoverableError) | |||
re, ok := e.(structs.Recoverable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assert to *structs.RecoverableError
here and skip the second assertion below. One less assertion and there are multiple types that implement structs.Recoverable
(even if right now only one of them is used here).
nomad/structs/structs.go
Outdated
@@ -6118,22 +6133,34 @@ func (r *RecoverableError) IsRecoverable() bool { | |||
return r.Recoverable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix comment
client/client.go
Outdated
@@ -1754,11 +1754,11 @@ func (c *Client) deriveToken(alloc *structs.Allocation, taskNames []string, vcli | |||
var resp structs.DeriveVaultTokenResponse | |||
if err := c.RPC("Node.DeriveVaultToken", &req, &resp); err != nil { | |||
c.logger.Printf("[ERR] client.vault: DeriveVaultToken RPC failed: %v", err) | |||
return nil, fmt.Errorf("DeriveVaultToken RPC failed: %v", err) | |||
return nil, structs.NewWrappedServerError(resp.Error) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should return the err not the resp error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also should it be marked as a server error? It is likely there is no valid server which is where the error is coming from since all errors other than network are returned via the resp.Error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, this should not be marked as such, will fix
client/task_runner.go
Outdated
if structs.IsServerSide(err) { | ||
r.logger.Printf("[ERR] client: failed to derive Vault token for task %v on alloc %q: %v", | ||
r.task.Name, r.alloc.ID, err) | ||
r.Kill("vault", fmt.Sprintf("server error in deriving vault token: %v", err), true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove in
I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions. |
new code
master