-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for Google Cloud Spanner #3977
Conversation
physical/spanner/spanner.go
Outdated
if haEnabledStr == "" { | ||
haEnabledStr = c["ha_enabled"] | ||
} | ||
haEnabled := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please default to false. We don't test this and we've had a lot of people with other HA backends that end up having some kind of problem. This isn't to say that this backend will, only that we would prefer people need to opt-in, at least in the near term.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
physical/spanner/spanner.go
Outdated
var err error | ||
haEnabled, err = strconv.ParseBool(haEnabledStr) | ||
if err != nil { | ||
return nil, errors.Wrap(err, "failed to parse HA enabled") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason not to use errwrap instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it more verbose than necessary, and I routinely forget to put the {{err}}
, which makes future debugging annoying. Nonetheless, I'll conform and update. I've just personally gravitated to pkg/errors more 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Understandable; at this point we're just trying to get the whole Vault codebase to use something and since errwrap is the HC way...
I might look into pkg/errors at some point as a replacement, but first I want to get everything sane with one type of wrapping.
// indicating if we are stopped - it exists to prevent double closing the | ||
// channel. stopLock is a mutex around the locks. | ||
stopCh chan struct{} | ||
stopped bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than a bool and lock you may want to just use atomic.CompareAndSwap
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm - with channels? Sorry - I've never used CompareAndSwap, so I'm not sure what you're suggesting here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In place of using stopped
/stopLock
. You can have a uint32 that defaults to zero; when stopping you atomic.CompareAndSwap and if it was swapped, you continue on, and if not, you're already running stop logic. Then in Lock you do the opposite (compare to 1 and swap to 0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a must btw, just a suggestion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. I would prefer to leave it as-is, if that's okay with you.
} | ||
}() | ||
|
||
_, err := l.backend.client.ReadWriteTransaction(ctx, func(ctx context.Context, txn *spanner.ReadWriteTransaction) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is pretty cool; I'm assuming that the entire block will happen atomically so you can't get two clients elevating themselves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. The function within the function may be called multiple times, but the transaction is retained during retries. It's a global lock basically.
- **High Availability** – the Google Cloud Spanner storage backend supports high | ||
availability. Because the Google Cloud Spanner storage backend uses the system | ||
time on the Vault node to acquire sessions, clock skew across Vault servers | ||
can cause lock contention. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is a user supposed to do if there is lock contention?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cry
Just kidding. Basically they need to resolve the skew using something like ntp and restarting services. None of the other backends documented how to fix skew, and, since it's not specific to Spanner, I don't know if it belongs here. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most (all?) other HA backends aren't time-specific, they're session-specific, so skew isn't really an issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DynamoDB is 😄 , and they did not document the fix
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think there is a way to do an exclusive lock-for-writing on a document. I don't know what the failure conditions are. It's possible they just don't account for clock skew at all.
|
||
- `ha_enabled` `(string: "true")` - Specifies if high availability mode is | ||
enabled. This is a boolean value, but it is specified as a string like "true" | ||
or "false". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add what the default is (I think in another review it looks like the default will be "false")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That "true" there is actually the default, although it's now wrong since I asked for it to be flipped :-)
This PR adds support and documentation for a Google Cloud Spanner storage (physical) backend. The backend supports both HA and Transactional interfaces.
As requested, HA is not enabled by default. Uses can opt-in to HA by setting
ha_enabled = "true"
in configuration.