Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hyper Curl test 580 and 580 Root cause #2780

Closed
liamwarfield opened this issue Mar 17, 2022 · 8 comments · Fixed by #2798
Closed

Hyper Curl test 580 and 580 Root cause #2780

liamwarfield opened this issue Mar 17, 2022 · 8 comments · Fixed by #2798
Labels
A-ffi Area: ffi (C API) C-bug Category: bug. Something is wrong. This is bad!

Comments

@liamwarfield
Copy link
Contributor

liamwarfield commented Mar 17, 2022

Version
hyper::ffi current version

Platform
All?

Description
Found the RC of failing curl tests 580 and 581!

[short summary of the bug]
hyper_headers_foreach iterates through headers in by header name, not by order recieved! The smoking gun for for curl tests 580 and 581 is this line.

For test 581:
Raw response sent by server:

HTTP/1.1 200 OK
Content-Length: 18

WE ROOLZ: 118369
HTTP/1.1 200 all good!
Date: Tue, 09 Nov 2010 14:49:00 GMT
Server: test-server/fake
Content-Type: text/html
Content-Length: 0
Connection: close
Content-Type: changed/my/mind

Curl is expecting an iterator that returns headers in the order that they were received:

HTTP/1.1 200 all good![CR][LF]
Date: Tue, 09 Nov 2010 14:49:00 GMT[CR][LF]
Server: test-server/fake[CR][LF]
Content-Type: text/html[CR][LF]
Content-Length: 0[CR][LF]
Connection: close[CR][LF]
Content-Type: changed/my/mind[CR][LF]
[CR][LF]

but hyper is returning ALL of the Content-Type headers before moving onto the Connection header. Results in:

HTTP/1.1 200 all good![CR][LF]
Date: Tue, 09 Nov 2010 14:49:00 GMT[CR][LF]
Server: test-server/fake[CR][LF]
Content-Type: text/html[CR][LF]
Content-Type: changed/my/mind[CR][LF]  <------ comes to early!
Content-Length: 0[CR][LF]
Connection: close[CR][LF]
[CR][LF]

Hyper Code Responcible

hyper_headers_foreach_callback, userdata: *mut c_void) {
        let headers = non_null!(&*headers ?= ());
        for name in headers.headers.keys() {
            let mut names = headers.orig_casing.get_all(name);

            /// HERE is the bug, this iterates through all headers with the
            /// same name before moving on to the next header name.
            for value in headers.headers.get_all(name) {
                let (name_ptr, name_len) = if let Some(orig_name) = names.next() {
                    (orig_name.as_ref().as_ptr(), orig_name.as_ref().len())
                } else {
                    (
                        name.as_str().as_bytes().as_ptr(),
                        name.as_str().as_bytes().len(),
                    )
                };

                let val_ptr = value.as_bytes().as_ptr();
                let val_len = value.as_bytes().len();

                if HYPER_ITER_CONTINUE != func(userdata, name_ptr, name_len, val_ptr, val_len) {
                    return;
                }
            }
        }
    }
}

@seanmonstar

@liamwarfield liamwarfield added the C-bug Category: bug. Something is wrong. This is bad! label Mar 17, 2022
@seanmonstar seanmonstar added the A-ffi Area: ffi (C API) label Mar 17, 2022
@seanmonstar
Copy link
Member

Ah yes, I remember this possibly being a thing. Thanks for the detailed report!

@liamwarfield
Copy link
Contributor Author

Any ideas for where the fix should live:
hyper?
change the tests?

@seanmonstar
Copy link
Member

I found that we did discuss this some in #2572. I don't think we'd update http::HeaderMap to include info on insertion order of duplicate values, since it'd be increasing the cost for a lot of users who don't need it. I see two other solutions and I'm not sure which is preferable:

  • Include the extra data specifically in hyper_headers, which would probably mean extending the existing HeaderCaseMap to include a memory of the order, or to make a separate extra vec/map that has the order.
  • Change the test in curl from temporarily DISABLED if hyper, to be always disabled (so, moved out of the DISABLED file and have the condition in the data/test580 instead) and tell users that is one of the differences. I'd like to believe that 99.9% of users won't care.

@liamwarfield
Copy link
Contributor Author

Adding an extra vec to HeaderCaseMap feels better than just disabling the test for hyper. I'll scrape together a PR for that.

@liamwarfield
Copy link
Contributor Author

liamwarfield commented Mar 21, 2022

Update from the curl side, the maintainers of libcurl would prefer a fix on the hyper side over changing the tests for this one.

@liamwarfield
Copy link
Contributor Author

I've been working changing HeaderCaseMap used in the ffi hyper_headers to a Vec<(Bytes, Bytes) to just store the original version of the headers. This change is giving out headers in the right order, but the header names are not correctly capitalized! @seanmonstar do you have any idea where that capitalization happens?

On another note, do you know if the robinhood hashing in http::headers::HeaderMap preserves ordering within a group of headers with the same name? If that's the case, then I could store order info a Vec<HeaderName, Idx>.

@seanmonstar
Copy link
Member

Sorry for the slow response, I've been thinking about this and not been able to think of a good solution (and so haven't had something to say).

  • The http::HeaderMap is optimized for faster lookup, assuming that most in the Rust API want to get a specific name. We do that by making use of "case doesn't matter" to just always treat things lowercase (removes case-insensitive searching which is slower), and extra values don't store an additional full entry. The order of values for a specific name are preserved, since the spec points out that is required, but otherwise hashing can alter the order of different names.
  • We originally made the HeaderCaseMap as something that curl could use to preserve the casing of the names. There have been desires for such a thing in the Rust API, and in fact some tiny portions of it are already exposed and depended on. So, we can't really change its behavior...
  • It feels bad to have a 3rd collection that just stores the full order of the values.

I still don't know what the best answer here is.

@liamwarfield
Copy link
Contributor Author

Thanks for the update ("The order of values for a specific name are preserved" was very helpful) . I've got a branch with 580 and 581 passing now, I'll post a PR.

liamwarfield added a commit to liamwarfield/hyper that referenced this issue Mar 30, 2022
Libcurl expects that headers are iterated in the same order that they
are recieved. Previously this caused curl tests 580 and 581 to fail.

SUMMARY OF CHANGES: Add a new data structure to HeaderCaseMap that
represents the order in which headers originally appear in a HTTP
message. This datastructure is `Vec<(Headername, multimap-index)>`.
This vector is ordered by the order which headers were recieved.
This new datastucture is used in hyper_headers_foreach to iterate
through headers in the correct order.

CLOSES ISSUE: hyperium#2780
BEAKING CHANGE: HeaderCaseMap::append now requires that names passed
into it impl Into<HeaderName> + Clone. Currently all types that impl
IntoHeaderName (in hyper and http) also impl these traits already.
liamwarfield added a commit to liamwarfield/hyper that referenced this issue Apr 11, 2022
Libcurl expects that headers are iterated in the same order that they
are recieved. Previously this caused curl tests 580 and 581 to fail.
This necessitated exposing a way to preserve the original ordering of
http headers.

SUMMARY OF CHANGES: Add a new data structure called OriginalHeaderOrder that
represents the order in which headers originally appear in a HTTP
message. This datastructure is `Vec<(Headername, multimap-index)>`.
This vector is ordered by the order which headers were recieved.
Add the following ffi functions:
- ffi::client::hyper_clientconn_options_set_preserve_header_order : An
     ffi interface to configure a connection to preserve header order.
- ffi::client::hyper_clientconn_options_set_preserve_header_case : An
     ffi interface to configure a connection to preserve header case.
- ffi::http_types::hyper_headers_foreach_ordered : Iterates the headers in
     the order the were recieved, passing each name and value pair to the callback.
Add a new option to ParseContext, and Conn::State called `preserve_header_order`.
This option, and all the code paths it creates are behind the `ffi`
feature flag. This should not change performance of response parsing for
non-ffi users.

CLOSES ISSUE: hyperium#2780
@seanmonstar seanmonstar moved this to Todo in hyper in curl Apr 28, 2022
@seanmonstar seanmonstar moved this from Todo to Done in hyper in curl Apr 28, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-ffi Area: ffi (C API) C-bug Category: bug. Something is wrong. This is bad!
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

2 participants