-
Notifications
You must be signed in to change notification settings - Fork 239
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
G1: Add proposal for true binary metadata #19
Conversation
|
||
When transmitting metadata on a connection where the peer has specified | ||
GRPC_ALLOW_TRUE_BINARY_METADATA, instead of encoding using base64, an | ||
implementation MAY instead prefix a NUL byte to the metadata and transmit the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/metadata/metadata value/. Or maybe "header field value".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
G1-true-binary-metadata.md
Outdated
|
||
Since this is a HTTP2 extension and other extensions might alias this extension | ||
id, it's possible that this becomes misconfigured. In that case, peers are | ||
required to RST_STREAM with http error PROTOCOL_ERROR. If a binary encoding was |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe call out that this is the normal behavior as specified in RFC7540§10.3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
G1-true-binary-metadata.md
Outdated
Suppose we wanted to send metadata element 'foo-bin: 0x01' (ie a single byte | ||
containing '1'). | ||
|
||
Under base64, we'd send a http header 'foo-bin: AQ' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add period at end of sentence, or an additional newline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
G1-true-binary-metadata.md
Outdated
@@ -0,0 +1,89 @@ | |||
Title |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put in the real title.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
G1-true-binary-metadata.md
Outdated
GRPC_ALLOW_TRUE_BINARY_METADATA = 0xfe03. | ||
|
||
This setting is randomly chosen (to avoid conflicts with other extensions), and | ||
within the experimental range of HTTP extensions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest adding a link to https://tools.ietf.org/html/rfc7540#section-11.3 here to refer readers to information about the experimental range.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
### New setting | ||
|
||
Expose a custom setting in our HTTP2 settings exchange: | ||
GRPC_ALLOW_TRUE_BINARY_METADATA = 0xfe03. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This doesn't seem gRPC-specific enough to justify GRPC_
prefix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That it only applies to -bin headers (which are a gRPC thing) makes it gRPC-specific enough (IMO)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only -bin
headers? If we're using 0x00
magic prefix, then I don't see a reason for this limitation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So gRPC uses -bin to specify that a header is binary, and says that other metadata must be ascii (per http2). That -bin has semantic meaning (foo-bin is a different header than foo).
We use that as a signal to base64 encode things right now. Under this proposal we'd just switch to an alternate encoding.
Since we can't universally make all of the HTTP2 ecosystem take binary metadata, a mechanism to allow transmitting anything and dropping the -bin isn't possible for us (we need to keep interoperating - so proxies that support this would potentially need to do the base64 conversion if one side didn't support this mechanism).
If there were an extension that allowed sending binary values in metadata, we'd still end up needing a disambiguating byte for backwards compatibility (for the gRPC -bin suffixed headers), and we'd still maintain the requirement that non -bin headers must be ascii.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you aware of any HTTP/2 proxies that would reject requests because of "true binary" encoding of the header field? Unless it's a well-known header that must be processed by the proxy, most (all?) will just pass-through headers without ever looking at the content of the header fields.
Don't get me wrong, I'm all in for allowing binary header fields, but in order get this accepted into HTTP/2 ecosystem, this proposal should be a little more than gRPC implementation detail, that's why -bin
suffix feels like an artificial limitation.
I believe that standardizing on "0x00
prefix of the header field indicates binary/raw representation" is much more reasonable, and doesn't prevent gRPC from only using it in headers with -bin
suffix.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To answer this myself, RFC7540, sec. 10.3. Intermediary Encapsulation Attacks says:
Similarly, HTTP/2 allows header field values that are not valid.
While most of the values that can be encoded will not alter header
field parsing, carriage return (CR, ASCII 0xd), line feed (LF, ASCII
0xa), and the zero character (NUL, ASCII 0x0) might be exploited by
an attacker if they are translated verbatim. Any request or response
that contains a character not permitted in a header field value MUST
be treated as malformed (Section 8.1.2.6). Valid characters are
defined by the "field-content" ABNF rule in Section 3.2 of [RFC7230].
I also did some digging on the http-wg mailing list and it seems that sending binary blobs in header fields was discussed during HTTP/2 standardization, but they eventually decided against it, because apparently base64 in HPACK is "good enough".
containing '1'). | ||
|
||
Under base64, we'd send a http header 'foo-bin: AQ'. | ||
Under binary, we'd send 'foo-bin: 0x00 0x01' (ie prefixing a NUL byte and then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need NUL
-byte here? Why can't we send binary header as foo-bin: 0x01
?
I actually spent some time on Wednesday looking into this (for different reasons), and I couldn't find anything in the spec that forbids binary header values.
This was a limitation in SPDY, which used NUL
-byte as a separator between multiple header values, but it seems that this "feature" was removed from HPACK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
https://tools.ietf.org/html/rfc7540#section-8.1 says that https://tools.ietf.org/html/rfc7230#section-3.2 is normative, which does not allow binary headers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...which means that foo-bin: 0x00 0x01
is also illegal.
My point is that nothing in HPACK prevents it from transmitting binary data, so we could add generic SETTINGS
parameter for it (for example: SETTING_ALLOW_BINARY_HEADER_FIELD
) and carry it over HPACK as any other value.
But I guess 0x00
prefix is here to differentiate binary from text? If so, do we really need -bin
prefix in header name?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-bin is not going away so long as there are clients, servers, and intermediaries that do not support binary headers.
containing '1'). | ||
|
||
Under base64, we'd send a http header 'foo-bin: AQ'. | ||
Under binary, we'd send 'foo-bin: 0x00 0x01' (ie prefixing a NUL byte and then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
...which means that foo-bin: 0x00 0x01
is also illegal.
My point is that nothing in HPACK prevents it from transmitting binary data, so we could add generic SETTINGS
parameter for it (for example: SETTING_ALLOW_BINARY_HEADER_FIELD
) and carry it over HPACK as any other value.
But I guess 0x00
prefix is here to differentiate binary from text? If so, do we really need -bin
prefix in header name?
### New setting | ||
|
||
Expose a custom setting in our HTTP2 settings exchange: | ||
GRPC_ALLOW_TRUE_BINARY_METADATA = 0xfe03. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why only -bin
headers? If we're using 0x00
magic prefix, then I don't see a reason for this limitation.
peers MAY use a 'true binary' encoding (described below), instead of the current | ||
base64 encoding for -bin metadata. | ||
|
||
Implementations SHOULD transmit this setting only once, and as part of the first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since SETTINGS
apply only to a connection and not end-to-end, what about intermediaries? Are proxies supposed to re-encode header field to base64 if they received binary header field but are talking to a backend that doesn't support it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... or they can just not advertise support for this feature.
Per PROTOCOL-HTTP2.md:
However, this proposal doesn't cover this case, and there is no way for implementations to detect whether |
On receipt, we transparently reverse the transformation. | ||
|
||
This transformation is costly in terms of CPU, and there exist use cases where | ||
this transformation can become the CPU bottleneck for gRPC. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you add some data here (i.e. metadata to payload ratio, metadata size, how much CPU does base64 encoding, base64 decoding, Huffman encoding and Huffman decoding each take)? It makes a lot of difference whether we're talking about 2% or 50% of total CPU processing time
@ctiller, let's prohibits concatenating raw-binary-encoded headers that is normally possible per RFC7230§3.2.2? |
That SGTM. |
Should we merge this? It's already implemented in C-core, and there's been more than enough time for discussion. |
Please direct discussion here: https://groups.google.com/forum/#!topic/grpc-io/Ww8waYz_Nes