Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update to oghttp2=true causes cookie-related failures #32611

Closed
ravenblackx opened this issue Feb 27, 2024 · 17 comments
Closed

Update to oghttp2=true causes cookie-related failures #32611

ravenblackx opened this issue Feb 27, 2024 · 17 comments
Assignees
Labels
area/codec_compatibility Known differences between codecs area/http bug stale stalebot believes this issue/PR has not been touched recently

Comments

@ravenblackx
Copy link
Contributor

With updating from v1.28.0 to v1.29.1 came runtime flag envoy.reloadable_features.http2_use_oghttp2 defaulting to true.

This caused failures in some of our integration tests, which, after several days of trying to track down what the problem was, I finally isolated to the fact that while envoy was still reporting in its debug log router decoding headers message "cookie: a; b; c", the headers at the upstream were getting interpreted as "cookie: c".

I think that what's probably happening is the codec is restructuring the headers as

cookie: a
cookie: b
cookie: c

and some library in the upstream is (in violation of http spec) interpreting this as three "set" operations that each overwrite the previous cookie value in a dictionary/map type structure.

Flipping the flag back to false fixes the immediate problem; I'm not sure if this is an issue that should be addressed in envoy, or if it's appropriate in these circumstances for envoy to say "the spec says it's okay to restructure that kind of header so the problem is at the upstream."

Note: I'm also not certain this is what's happening, but it's an interpretation that fits the data I have; that the upstream sees only the last cookie, and that envoy's router filter debug logs all the cookies, and that flipping this flag changes that. From this data it could be that the codec is where the lost-cookie flattening is happening, in which case that's a much bigger bug. :)

@ravenblackx ravenblackx added bug area/http triage Issue requires triage area/codec_compatibility Known differences between codecs labels Feb 27, 2024
@ravenblackx
Copy link
Contributor Author

@birenroy @diannahu

@ravenblackx ravenblackx removed the triage Issue requires triage label Feb 27, 2024
@ravenblackx
Copy link
Contributor Author

Another possibility that occurs to me that would also fit the data is that maybe the downstream is sending

cookie: a
cookie: b
cookie: c

and the other codec is restructuring that to a single semicolon-separated header.

@yanavlasov
Copy link
Contributor

@ravenblackx do you have repro case?

@birenroy
Copy link
Contributor

I've confirmed that oghttp2 will crumble Cookie headers and deliver them to the application as individual (field name, field value) pairs. From rereading RFC 6265, I can see how this could be unexpected by applications:

https://www.rfc-editor.org/rfc/rfc6265#section-5.4

When the user agent generates an HTTP request, the user agent MUST
   NOT attach more than one Cookie header field.

I'll get a fix in shortly.

@birenroy
Copy link
Contributor

/assign @birenroy

@birenroy
Copy link
Contributor

Also relevant: RFC 9113 Section 8.2.3.

To allow for better compression efficiency, the Cookie header field MAY be split into separate header
fields, each with one or more cookie-pairs. If there are multiple Cookie header fields after
decompression, these MUST be concatenated into a single octet string using the two-octet delimiter
of 0x3b, 0x20 (the ASCII string "; ") before being passed into a non-HTTP/2 context, such as an
HTTP/1.1 connection, or a generic HTTP server application.

@birenroy
Copy link
Contributor

Oh, @ravenblackx , could you clarify what protocols are being spoken on the various hops?

If HTTP/2 is being used between the Envoy and the upstream, then the crumbled version of the cookie should be a valid representation. If it's HTTP/1, then multiple Cookie headers would not be valid.

It's relatively straightforward to consolidate Cookie values received by the codec. It will be a more invasive change to remove the crumbling behavior when serializing HTTP/2 HEADERS frames.

@ravenblackx
Copy link
Contributor Author

I'm not 100% sure, it's a pretty tangled integration test, but I think we typically use HTTP/1.1 for all upstreams and only support HTTP/2 at the edge.

It looks like in this case the whole thing end to end is probably HTTP/1.1, because the access log entry includes POST /account/get_connected_devices HTTP/1.1

@birenroy
Copy link
Contributor

HTTP/2 has to be in the mix somehow, or flipping the oghttp2 reloadable feature would not have any effect. Let me see what I can do.

@ravenblackx
Copy link
Contributor Author

Moderate chance then that it's a misconfiguration in the integration test making the initial request http/1.1 and "upgrading" it, the opposite of what we do in production.

@sundarms
Copy link

sundarms commented Mar 1, 2024

I am also seeing the same behavior
Client send single cookie header with space (http1.1) -> Envoy Lua filter makes call to another service (http2) -> Other service receives multiple cookie header.

copybara-service bot pushed a commit to google/quiche that referenced this issue Mar 6, 2024
This is preparation for fixing the issue described in envoyproxy/envoy#32611.

Protected by new code path, not yet enabled in the GFE binary.

PiperOrigin-RevId: 613330788
copybara-service bot pushed a commit to google/quiche that referenced this issue Mar 7, 2024
@birenroy
Copy link
Contributor

birenroy commented Mar 7, 2024

Once Envoy updates to a version of QUICHE that includes google/quiche@82ff95e, I believe this issue will be resolved.

RyanTheOptimist added a commit that referenced this issue Mar 7, 2024
…2 to false (#32751)

http2: Switch the value of envoy.reloadable_features.http2_use_oghttp2 to false
A number of users have reported issues with oghttp2 including #32611 and #32401
so reverting this flag seems wise.

Signed-off-by: Ryan Hamilton <[email protected]>
phlax pushed a commit to phlax/envoy that referenced this issue Mar 7, 2024
…2 to false (envoyproxy#32751)

http2: Switch the value of envoy.reloadable_features.http2_use_oghttp2 to false
A number of users have reported issues with oghttp2 including envoyproxy#32611 and envoyproxy#32401
so reverting this flag seems wise.

Signed-off-by: Ryan Hamilton <[email protected]>
phlax pushed a commit that referenced this issue Mar 7, 2024
…2 to false (#32751)

http2: Switch the value of envoy.reloadable_features.http2_use_oghttp2 to false
A number of users have reported issues with oghttp2 including #32611 and #32401
so reverting this flag seems wise.

Signed-off-by: Ryan Hamilton <[email protected]>
phlax added a commit to phlax/envoy that referenced this issue Mar 7, 2024
**Summary of changes**:

- Revert default codec to nghttp2 as a number of users have reported issues with
  oghttp2 including envoyproxy#32611 and envoyproxy#32401
- Assorted minor fixes

**Docker images**:
    https://hub.docker.com/r/envoyproxy/envoy/tags?page=1&name=v1.29.2
**Docs**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/
**Release notes**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/version_history/v1.29/v1.29.2
**Full changelog**:
    envoyproxy/envoy@v1.29.1...v1.29.2

Signed-off-by: Ryan Northey <[email protected]>
phlax added a commit that referenced this issue Mar 7, 2024
**Summary of changes**:

- Revert default codec to nghttp2 as a number of users have reported issues with
  oghttp2 including #32611 and #32401
- Assorted minor fixes

**Docker images**:
    https://hub.docker.com/r/envoyproxy/envoy/tags?page=1&name=v1.29.2
**Docs**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/
**Release notes**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/version_history/v1.29/v1.29.2
**Full changelog**:
    v1.29.1...v1.29.2

Signed-off-by: Ryan Northey <[email protected]>
@juanmolle
Copy link
Contributor

I'm sure if could be related. but I'm seeing incorrect behaviour in external processing filter too in 1.29.1 and 1.29.2
while in 1.25.11 in the external processing server using go sdk

&{headers:{headers:{key:":authority" value:"localhost:8000"} headers:{key:":path" value:"/headers"} headers:{key:":method" value:"GET"} headers:{key:":scheme" value:"http"} headers:{key:"user-agent" value:"curl/8.2.1"} headers:{key:"accept" value:"/"} headers:{key:"x-forwarded-proto" value:"http"} headers:{key:"x-request-id" value:"1a5ffd66-3dc3-4186-9e3e-2072dbf2d558"}}

in 1.29.1 I'm getting

pb.ProcessingRequest_RequestHeaders &{headers:{headers:{key:":authority" 3:"localhost:8000"} headers:{key:":path" 3:"/headers"} headers:{key:":method" 3:"GET"} headers:{key:":scheme" 3:"http"} headers:{key:"user-agent" 3:"curl/8.2.1"} headers:{key:"accept" 3:"/"} headers:{key:"x-forwarded-proto" 3:"http"} headers:{key:"x-request-id" 3:"dd5ef52f-249a-4191-a6b6-e3f54b8ac8cd"}}

could it be related to http2? or grpc library changed?

jbohanon added a commit to solo-io/envoy-fork that referenced this issue Mar 8, 2024
* build(deps): bump distroless/base-nossl-debian12 from `51ab103` to `49edf70` in /ci (envoyproxy#32348)

build(deps): bump distroless/base-nossl-debian12 in /ci

Bumps distroless/base-nossl-debian12 from `51ab103` to `49edf70`.

---
updated-dependencies:
- dependency-name: distroless/base-nossl-debian12
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Ryan Northey <[email protected]>

* build(deps): bump distroless/base-nossl-debian12 from `49edf70` to `0e777c6` in /ci (envoyproxy#32576)

build(deps): bump distroless/base-nossl-debian12 in /ci

Bumps distroless/base-nossl-debian12 from `49edf70` to `0e777c6`.

---
updated-dependencies:
- dependency-name: distroless/base-nossl-debian12
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Ryan Northey <[email protected]>

* populate histogram summary sample sum

Signed-off-by: Sebastian Schepens <[email protected]>

* http2: Switch the value of envoy.reloadable_features.http2_use_oghttp2 to false (envoyproxy#32751)

http2: Switch the value of envoy.reloadable_features.http2_use_oghttp2 to false
A number of users have reported issues with oghttp2 including envoyproxy#32611 and envoyproxy#32401
so reverting this flag seems wise.

Signed-off-by: Ryan Hamilton <[email protected]>

* repo: Release v1.29.2

**Summary of changes**:

- Revert default codec to nghttp2 as a number of users have reported issues with
  oghttp2 including envoyproxy#32611 and envoyproxy#32401
- Assorted minor fixes

**Docker images**:
    https://hub.docker.com/r/envoyproxy/envoy/tags?page=1&name=v1.29.2
**Docs**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/
**Release notes**:
    https://www.envoyproxy.io/docs/envoy/v1.29.2/version_history/v1.29/v1.29.2
**Full changelog**:
    envoyproxy/envoy@v1.29.1...v1.29.2

Signed-off-by: Ryan Northey <[email protected]>

---------

Signed-off-by: dependabot[bot] <[email protected]>
Signed-off-by: Ryan Northey <[email protected]>
Signed-off-by: Sebastian Schepens <[email protected]>
Signed-off-by: Ryan Hamilton <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Sebastian Schepens <[email protected]>
Co-authored-by: Ryan Hamilton <[email protected]>
Co-authored-by: Ryan Northey <[email protected]>
@birenroy
Copy link
Contributor

I'm sure if could be related. but I'm seeing incorrect behaviour in external processing filter too in 1.29.1 and 1.29.2 while in 1.25.11 in the external processing server using go sdk

Can you clarify what incorrect behavior you're seeing, @juanmolle ? It's not clear to me from the snippets you included.

@birenroy
Copy link
Contributor

birenroy commented Apr 1, 2024

FWIW, I believe this issue was fixed as of #32874, which included google/quiche@82ff95e.

Copy link

github-actions bot commented May 1, 2024

This issue has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in the next 7 days unless it is tagged "help wanted" or "no stalebot" or other activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale stalebot believes this issue/PR has not been touched recently label May 1, 2024
Copy link

github-actions bot commented May 8, 2024

This issue has been automatically closed because it has not had activity in the last 37 days. If this issue is still valid, please ping a maintainer and ask them to label it as "help wanted" or "no stalebot". Thank you for your contributions.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/codec_compatibility Known differences between codecs area/http bug stale stalebot believes this issue/PR has not been touched recently
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants