Add grpc proxy backend #680
Conversation
Force-pushed from 27aebfc to 1bcd37a
Thanks for starting on this feature - it has been on my wishlist for a while :)
I will try to find time for a thorough review tomorrow, but I have left some initial comments in the meantime (I think your PR is missing utils/backendproxy/ btw).
utils/flags/flags.go
Outdated
```go
&cli.StringFlag{
	Name:  "grpc_proxy.key_file",
	Value: "",
	Usage: "Path to the key used to authenticate with the proxy backend. Enables mTLS.",
},
```
The `key_file` / `cert_file` help texts should each mention that using one requires the other.
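For example, the two help texts could cross-reference each other. A minimal sketch, assuming urfave/cli as the quoted snippet suggests; the exact wording is an assumption, not the text that landed:

```go
package flags

import "github.com/urfave/cli/v2"

// Hypothetical help texts that cross-reference each other.
var grpcProxyFlags = []cli.Flag{
	&cli.StringFlag{
		Name: "grpc_proxy.cert_file",
		Usage: "Path to the certificate used to authenticate with the proxy " +
			"backend (enables mTLS). Requires grpc_proxy.key_file to also be set.",
	},
	&cli.StringFlag{
		Name: "grpc_proxy.key_file",
		Usage: "Path to the key used to authenticate with the proxy backend " +
			"(enables mTLS). Requires grpc_proxy.cert_file to also be set.",
	},
}
```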
Force-pushed from 6481feb to 98b2066
Force-pushed from 98b2066 to 8e000f6
Sorry it has taken me a while to get back to this.
I think the most important thing to do next is to see my comment about investigating more direct gRPC->gRPC backend proxying.
README.md
Outdated
```
@@ -243,6 +243,19 @@ OPTIONS:
          Goroutines to process parallel uploads to backend. (default: 100)
          [$BAZEL_REMOTE_NUM_UPLOADERS]

   --grpc_proxy.url value  The base URL to use for a grpc proxy backend.
```
I think we need an example URL here, so the form is obvious.
Sure, added it
cache/grpcproxy/grpcproxy.go
Outdated
```go
func New(clients *GrpcClients,
	accessLogger cache.Logger, errorLogger cache.Logger,
	numUploaders, maxQueuedUploads int) (cache.Proxy, error) {
```
This function only seems to return a nil error, do we need that return value?
On the other hand, maybe this should check the capabilities of the backend server, and possibly return an error from that?
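Something like the following could work for the capabilities check. A minimal sketch assuming the generated REAPI packages from github.com/bazelbuild/remote-apis; `checkBackendCapabilities` is a hypothetical helper, not the code that landed:

```go
package grpcproxy

import (
	"context"
	"errors"

	pb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
	"google.golang.org/grpc"
)

// checkBackendCapabilities queries the backend's REAPI capabilities and
// fails if it does not expose a remote cache.
func checkBackendCapabilities(ctx context.Context, conn *grpc.ClientConn) error {
	caps, err := pb.NewCapabilitiesClient(conn).GetCapabilities(ctx,
		&pb.GetCapabilitiesRequest{})
	if err != nil {
		return err
	}
	if caps.GetCacheCapabilities() == nil {
		return errors.New("proxy backend does not expose remote cache capabilities")
	}
	return nil
}
```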
Addressed both!
```
        with the proxy backend using mTLS. If this flag is provided, then
        grpc_proxy.key_file must also be specified.
        [BAZEL_REMOTE_GRPC_PROXY_CERT_FILE]
```
How difficult would it be to add support for basic/password authentication? I think we should consider adding this, mostly to avoid documenting that it doesn't work, but also because it's probably easier to set up a system test.
Password authentication is currently only used by clients to authenticate with bazel-remote, not by bazel-remote to authenticate with the proxy backend. As far as I can tell, none of the existing proxies support this right now.
The bazel-remote <-> proxy backend authentication is completely separate from the client <-> bazel-remote authentication though. If bazel-remote is the main intended use case for the grpc proxy backend, then I think we should support that. But this would be OK to land in a followup (I can help if needed).
I did not realise that the http proxy actually supports basic auth, you just need to pass the credentials in the proxy URL. I took a stab at doing the same for the grpc proxy.
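For reference, extracting the credentials from the URL is straightforward with the standard library. A minimal sketch, with a made-up host and credentials:

```go
package main

import (
	"fmt"
	"net/url"
)

func main() {
	// Hypothetical proxy URL with embedded basic-auth credentials.
	u, err := url.Parse("grpc://user:secret@cache.example.com:9092")
	if err != nil {
		panic(err)
	}
	user := u.User.Username()
	pass, _ := u.User.Password()
	fmt.Println(user, pass, u.Host) // user secret cache.example.com:9092
}
```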
config/config.go
Outdated
```
@@ -348,6 +361,17 @@ func validateConfig(c *Config) error {
 	}
 }

 if c.GRPCBackend != nil {
 	if c.StorageMode != "uncompressed" {
 		return errors.New("--grpc_proxy does not yet support compressed storage mode")
```
Lacking support for this mode is unfortunate: it could make the gRPC proxy backend less efficient than HTTP.
Yes, although it depends on whether clients use compression or not. I need to spend some time figuring out how to efficiently get the raw blob (including the header) from a proxied bazel-remote instance instead of having to recompress it on the fly, and didn't want to block this feature in the meantime. Is compression support a prerequisite for this to merge?
> Is compression support a prerequisite for this to merge?

It's not a prerequisite, but if we don't support it then I think we would need some documentation and/or example setups that explain the issue, in particular comparing it to using an HTTP bazel-remote backend. (And maybe it's better to just land this feature in a form that works optimally, and skip that documentation?)
Ideally I would like things to work nicely when clients use compression, because that seems to be the best tradeoff for performance (CPU time vs download time on the client side, and minimal CPU usage + effectively larger cache size on the server side).
The best case scenario IMO is when the client, frontend bazel-remote and optional proxy backend server all support compression. The frontend bazel-remote instance needs to verify the sha256 of blobs from the client, but it can then store a compressed result from its disk cache (increasing its effective disk cache size) and serve blobs without re-verifying their hashes.
The presence of a proxy backend increases complexity: should the frontend trust results from the backend? If the answer is yes, then the network transfers from the backend to the client are most efficient (a compressed result can be sent from the backend, via the frontend, to the client without recompression). If the answer is no, then CPU load on the frontend increases, because it needs to verify results from the backend.
Circling back to your question of whether this PR should land without compression support: would this gRPC backend be better than using a proxy backend bazel-remote instance over HTTP with compression support? (Or do you have a different setup than I am imagining, is the proxy backend something other than bazel-remote? Could you describe your use case for this feature?)
@mostynb Sorry it took me this long to get back on this one. The setup I have in mind is using another bazel-remote as the backend, but without compression. From my experience, enabling compression on both client and server (and proxy?) was never better: the overhead from compressing and decompressing each blob on the client side was too high, and the network transfers fast enough, that we effectively saw slower builds with compression enabled. Enabling it only on the server side is also a possibility, which might reduce used storage and increase the effective cache size, but is something that I haven't tried, as I suspect it would put too much additional load on the server for too little gain. I guess it all depends on the available resources (Is the network fast? Are there a lot of CPUs?) and the build profile (how "compressible" are the artifacts?).
Regardless, I've added support for zstd compression, but with a caveat: it does not work with a bazel-remote backend. The reason is that bazel-remote itself expects backends to return blobs exactly as they were pushed by the server (it is actually a bit more subtle: it expects all blobs to start with the casblob header, whose content can technically change, e.g. use different frame sizes). However, bazel-remote always serves a blob after stripping away the header, so it does not behave as a valid proxy backend. This is true regardless of the protocol used; as of now bazel-remote cannot be used as an http backend for another bazel-remote instance with compression enabled (http additionally suffers from #524).
I thought of a few ways to solve this, but none of them seemed great:
- Have the grpc (and http?) proxy implementation decompress and recompress all blobs coming from the backend and add the header, which would probably require writing to a file and reading it back. Overall this seems like too much overhead to be an acceptable solution.
- Instead of decompressing and recompressing, attempt to recreate the header directly from the compressed stream. The header is mostly used to quickly find an uncompressed offset in the compressed stream, which could be done by decoding each frame and accumulating the Frame_Content_Size field of each frame header (provided it is present). This would still require consuming the entire stream and reading it back from temporary storage, but would save the recompression, which is usually the most expensive part.
- Somewhat inspired by the above, drop the header altogether and find offsets on request by parsing the frame headers. Not sure if there is anything currently done that wouldn't be possible this way.
- Have bazel-remote stream back the header as well, since it consists of valid zstd (skippable) frames (a sketch follows below). For requests where the offset is != 0, the header would have to be updated to match the offset of the truncated stream, but that might be fast enough and a rare enough case. The downside is that each request will stream additional bytes, which might be a significant relative increase for small blobs. Maybe this should be put behind a flag so that it can be enabled conditionally for bazel-remote instances that are meant to be used as backends.
Curious to know if the above makes sense to you, or whether I have completely misread how this works, or you have a better solution in mind. Unless I am missing something obvious, this needs some thinking, and is maybe out of scope for this feature request since it would still have feature parity with http?
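To illustrate the last option: a zstd skippable frame is just a magic number, a little-endian length, and an opaque payload that decoders ignore, so a casblob-style header could travel inside one. A minimal sketch of the framing only (the helper is hypothetical, and the payload layout is not the actual casblob format):

```go
package casblob

import (
	"bytes"
	"encoding/binary"
)

// Skippable frame magic numbers occupy 0x184D2A50..0x184D2A5F; decoders
// skip the payload entirely (RFC 8878).
const skippableMagic = 0x184D2A50

// writeSkippableFrame wraps payload in a zstd skippable frame:
// 4-byte LE magic, 4-byte LE payload length, then the payload itself.
func writeSkippableFrame(w *bytes.Buffer, payload []byte) {
	var hdr [8]byte
	binary.LittleEndian.PutUint32(hdr[0:4], skippableMagic)
	binary.LittleEndian.PutUint32(hdr[4:8], uint32(len(payload)))
	w.Write(hdr[:])
	w.Write(payload)
}
```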
cache/grpcproxy/grpcproxy.go
Outdated
```go
case cache.CAS:
	// We don't know the size, so send a FetchBlob request first to get the digest
	// TODO: consider passing the expected blob size to the proxy?
	decoded, err := hex.DecodeString(hash)
```
Why decode the string, just to encode it again below?
First we decode it into a byte array, then we encode it into a base64 string, just doing the opposite of what is done in https://github.com/buchgr/bazel-remote/blob/master/server/grpc_asset.go#L74-L81
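In other words, something like this small self-contained illustration (`sriFromHex` is a hypothetical name):

```go
package grpcproxy

import (
	"encoding/base64"
	"encoding/hex"
	"fmt"
)

// sriFromHex converts a lowercase hex sha256 hash into the
// subresource-integrity form expected by the checksum.sri qualifier.
func sriFromHex(hash string) (string, error) {
	decoded, err := hex.DecodeString(hash)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("sha256-%s", base64.StdEncoding.EncodeToString(decoded)), nil
}
```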
cache/grpcproxy/grpcproxy.go
Outdated
```go
	}
	return true, size
case cache.CAS:
	decoded, err := hex.DecodeString(hash)
```
Why decode the string, just to encode it again below?
Same as above
cache/grpcproxy/grpcproxy.go
Outdated
Name: "checksum.sri", | ||
Value: fmt.Sprintf("sha256-%s", base64.StdEncoding.EncodeToString(decoded)), | ||
} | ||
freq := asset.FetchBlobRequest{ |
I find this a bit confusing, why is this using the remote asset API instead of BatchReadBlobs or the bytestream API?
We need to know the size of the blob first. The alternative is passing it as a parameter to this function.
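The shape of that lookup, as a minimal sketch assuming the generated remote-apis packages (`lookupDigest` is a hypothetical helper):

```go
package grpcproxy

import (
	"context"

	asset "github.com/bazelbuild/remote-apis/build/bazel/remote/asset/v1"
	pb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
)

// lookupDigest asks the remote asset API for the digest (and thus the size)
// of a blob identified by its checksum.sri qualifier.
func lookupDigest(ctx context.Context, fc asset.FetchClient, sri string) (*pb.Digest, error) {
	res, err := fc.FetchBlob(ctx, &asset.FetchBlobRequest{
		Qualifiers: []*asset.Qualifier{{Name: "checksum.sri", Value: sri}},
	})
	if err != nil {
		return nil, err
	}
	return res.GetBlobDigest(), nil
}
```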
cache/grpcproxy/grpcproxy.go
Outdated
```go
	Qualifiers: []*asset.Qualifier{&q},
}

res, err := r.clients.asset.FetchBlob(ctx, &freq)
```
Why use the remote asset API instead of FindMissingBlobs?
Same as above
Force-pushed from 49c6921 to 21fb101
Force-pushed from 21fb101 to 22150be
Force-pushed from 22150be to 6498cbe
README.md
Outdated
```
   --grpc_proxy.url value  The base URL to use for a grpc proxy backend, e.g.
                           localhost:9090 or example.com:7070.
                           [$BAZEL_REMOTE_GRPC_PROXY_URL]
```
I think we should probably provide a way to use TLS when talking to the proxy backend without using mTLS. It looks like this PR uses TLS when authenticating with the proxy backend via mTLS (which makes sense), but otherwise TLS is not used at all when talking to the proxy backend?
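A minimal sketch of what that could look like, dialing with server-side TLS only (no client certificate); the function name and flag wiring are assumptions:

```go
package grpcproxy

import (
	"crypto/tls"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

// dialBackend connects to the proxy backend, optionally with server-side
// TLS (i.e. TLS without mTLS, since no client certificate is presented).
func dialBackend(addr string, useTLS bool) (*grpc.ClientConn, error) {
	if !useTLS {
		return grpc.Dial(addr, grpc.WithInsecure())
	}
	creds := credentials.NewTLS(&tls.Config{MinVersion: tls.VersionTLS12})
	return grpc.Dial(addr, grpc.WithTransportCredentials(creds))
}
```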
Force-pushed from 69e7284 to 054cfca
Force-pushed from fdffcd5 to cf04071
```
@@ -123,6 +124,23 @@ func getLogin(ctx context.Context) (username, password string, err error) {

 	return username, password, nil
 }

 if k == "authorization" && len(v) > 0 && strings.HasPrefix(v[0], "Basic ") {
```
I tried to get it working with the `:authority` header like the above, but couldn't:

- Passing `<user>:<pass>@address` directly to `grpc.Dial` fails with `too many colons in address`
- Setting the header explicitly as metadata is simply overridden
- Using `grpc.WithAuthority` seems to work, but fails with `certificate is valid for localhost, not <user>:<pass>@localhost` when TLS is enabled.

Supporting `authorization` as a header wouldn't be too bad anyway; it's the recommended way to do basic auth.
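For the header approach, gRPC's per-RPC credentials are the natural hook. A minimal sketch (the `basicAuth` type is hypothetical, not the code in this PR):

```go
package grpcproxy

import (
	"context"
	"encoding/base64"

	"google.golang.org/grpc/credentials"
)

// basicAuth sends an RFC 7617 authorization header with every RPC.
type basicAuth struct {
	user, pass string
}

func (b basicAuth) GetRequestMetadata(ctx context.Context, uri ...string) (map[string]string, error) {
	token := base64.StdEncoding.EncodeToString([]byte(b.user + ":" + b.pass))
	return map[string]string{"authorization": "Basic " + token}, nil
}

// RequireTransportSecurity is false only for this sketch; real deployments
// should require TLS before sending credentials.
func (b basicAuth) RequireTransportSecurity() bool { return false }

var _ credentials.PerRPCCredentials = basicAuth{}
```

It would then be attached to the connection with `grpc.WithPerRPCCredentials(basicAuth{user, pass})` at dial time.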
Force-pushed from aa6bb9d to 303a252
Hi @mostynb, any blocker on merging this?

Hi, sorry this is taking me a while. I wanted to make a release before landing this, because I will probably want to make a few changes afterwards. That is done now (after a small hiccup in 2.4.2 and a rushed-out 2.4.3 release). I will try to find time to look at this again tomorrow, thanks for persevering.

@mostynb any update on this one?

I finally got back to reviewing this, sorry for the delay. I will continue tomorrow.

I landed a squashed version of this; I think it will be easier to iterate on it on master with a few smaller PRs than to drag this one on any longer. Thanks again for being so patient.
This adds a proxy backend that uses the grpc protocol. When the size of a blob is unknown (i.e. when bazel-remote is used with the REST API), the proxy relies on the asset API to get the blob size, so in order to use another bazel-remote instance as a backend, `--experimental_remote_asset_api` must be provided to the proxy.

Tested mostly with the added unit test and some local testing building bazel-remote itself. These are the steps used to test locally (each command executed in a different terminal):
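The original description ends here; the following is a hypothetical reconstruction of such a setup (ports, directories, sizes, and the build target are made up, not the commands from the PR):

```sh
# Terminal 1: backend bazel-remote, serving the asset API so the proxy
# can look up blob sizes.
./bazel-remote --dir /tmp/backend --max_size 10 --port 8081 \
    --grpc_port 9091 --experimental_remote_asset_api

# Terminal 2: frontend bazel-remote, proxying to the backend over gRPC.
./bazel-remote --dir /tmp/frontend --max_size 10 --port 8080 \
    --grpc_port 9090 --grpc_proxy.url localhost:9091

# Terminal 3: build bazel-remote itself against the frontend cache.
bazel build //:bazel-remote --remote_cache=grpc://localhost:9090
```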