Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul fetch implementation and host API #52

Merged
merged 31 commits into from
Jun 21, 2024
Merged

Conversation

tschneidereit
Copy link
Member

This PR contains a substantial overhaul of the fetch implementation, along with meaningful changes to how the host API works

On the latter, the most important change is that the representation of handles is now completely opaque to consumers: instead of using a typedef visible to users of the host_api.h header, the header just forward-declares a class, which hence can't be created or interacted with from outside the host API implementation.

@tschneidereit
Copy link
Member Author

@guybedford while this isn't completely done, and in dire need of cleanup, it is most of the way there, I think. I'm sharing it now because the host API changes are bound to impact you quite a bit. I hope they're for the better, and I at least found them to be necessary to get back to a point where I'm able to reason through the management and lifecycles of resource handles.

tschneidereit and others added 14 commits May 29, 2024 13:32
SpiderMonkey uses the `encoding_c` Rust crate and comes bundled with it. This lead to either duplication or linking errors when using our own crate re-exporting `encoding_c`.

Instead, with this PR we just bundle a C header file for interacting with the crate's functionality, and rely on SpiderMonkey's version.
`HostString` converts to `string_view`, so this makes `decode` easy to use from more places.
…ring wizening

The idea is that with this, builtins handling runtime events can dynamically load and execute a top-level script as needed.

This functionality isn't actually used in this commit, but will be in another commit changing how incoming HTTP requests are handled.
Gets back to passing all current tests, and most of the WPT suite that Fastly's JS Compute runtime passes.
This changes the implementation of the handle abstraction to make it more robust, and cleans it up at least somewhat.
@tschneidereit tschneidereit marked this pull request as ready for review June 2, 2024 20:19
@tschneidereit tschneidereit requested a review from guybedford June 2, 2024 20:19
@tschneidereit
Copy link
Member Author

@guybedford I think this is now ready for review. The resource handling is much cleaned up, and I addressed all the regressions of WPT tests that make sense to do here, relative to the state of the JS Compute Runtime I started with.

(The only things missing that that passes are around some convenience constructors such as Response.redirect(), which I'm not tackling here.)

@nicoburniske
Copy link

Hey @tschneidereit, I tried out your branch with some e2e tests that I created in my branch. There looks to be a problem with attaching a body to a fetch request

The error I encountered was

stderr [0] :: event loop error - both task and job queues are empty, but expected operations did not resolve
stderr [0] :: Internal error.

Here's the full e2e test code. I can put up a branch if you'd like

async function main(event) {
    let resolve, reject;
    
    try {
        let responsePromise = new Promise(async (res, rej) => {
            resolve = res;
            reject = rej;
        });
        event.respondWith(responsePromise);

        let body = `{ "hello": "world" }`
        
        // Error occurs here
        let response = await fetch("https://echo.free.beeceptor.com", {
            method: "POST",
            headers: {
                "Content-Type": "application/json"
            },
            body
        })

        // This never logs
        console.log("Successfully sent request to echo server")

        let responseJson = await response.json();
        
        let result = {
            method: responseJson.method,
            parsedBody: responseJson.parsedBody,
        };

        console.log("Successfully received response json body");
        resolve(new Response(JSON.stringify(result)));
    } catch (e) {
        console.log(`Error: ${e}. Stack: ${e.stack}`);
    }
}

addEventListener('fetch', main);

@tschneidereit
Copy link
Member Author

Thank you for letting me know, @nicoburniske! Fascinating, given that I ran a whole bunch of WPT tests around fetch that seemed to say it works. I'll investigate.

@tschneidereit
Copy link
Member Author

@nicoburniske thank you again for the report! Turns out I had read some of the WPT outputs wrong, and there indeed was a bunch more stuff to fix 😃

@nicoburniske
Copy link

@tschneidereit Happy to help! Excited for these fixes to ship 🚀

Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I briefly tested this out in ComponentizeJS and can confirm fetch seems to behave correctly, down to some possible ComponentizeJS cases to work through.

@guybedford
Copy link
Contributor

guybedford commented Jun 4, 2024

One issue I did hit with this example:

const res = await fetch('https://www.google.com');
const text = await res.text();
console.log(text);

is that it throws (new TypeError("malformed UTF-8 character sequence at offset 1", ""))

It could be worth verifying decompression is properly supported, otherwise this may be a ComponentizeJS / Jco bug.

@tschneidereit
Copy link
Member Author

@guybedford ah, interesting. The fetch implementation indeed doesn't do transparent decompression. But usually that shouldn't matter either, because we're not setting the Accept-Encoding header. Are you setting that, by any chance?

@guybedford
Copy link
Contributor

@tschneidereit because we virtualize on top of Node.js's fetch implementation it might be overriding that I expect, posted bytecodealliance/jco#447 to investigate further.

@guybedford
Copy link
Contributor

Actually I spoke too soon. Running against a test server I'm getting mangled response body data. Headers are just {"host":"localhost:8080","connection":"keep-alive"}. But the body the test server is sending back doesn't seem to reflect at all. Again, this could be ComponentizeJS / Jco toolchain specific, but I'd have to dig into it more. We did pass all the WASI HTTP tests though, so I haven't seen any mangled response issues before, unless there are gaps in the test suite.

…bodies

This special-casing used to make sense in the JS-Compute runtime, where certain TransformStreams were more optimizable. It doesn't anymore, and in any case would have to come with actually reading from the stream, which the code as written before this patch didn't 🤷
@tschneidereit
Copy link
Member Author

@guybedford I fixed a whole bunch of additional issues now. Would be great if you could take a look at the additional commits I pushed since your review. Then we should land this, and move everything else into additional PRs. (Of which I'll open a few in a moment, stacked on top of this one 😄)

@tschneidereit tschneidereit requested a review from guybedford June 18, 2024 13:15
Copy link
Contributor

@guybedford guybedford left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still getting those weird issues with streaming when testing this in ComponentizeJS. I don't know if they are ComponentizeJS or Jco bugs though. I'm assuming streaming res.text() is properly captured in the WPT tests though? If so, what might be necessary to properly debug these cases is to try running the WPT using the Jco host instead of Wasmtime, to confirm the scenarios, before investigating if this is a ComponentizeJS issue, raising any upstream issue as necessary.

I have also tested this on Fastly's StarlingMonkey port and can confirm all still works correctly.

Would be great to land this now though and continue iterating with follow-ups as necessary.

runtime/engine.cpp Show resolved Hide resolved
ENGINE = engine;
FETCH_HANDLERS = new JS::PersistentRootedObjectVector(engine->cx());
bool handle_incoming_request(host_api::HttpIncomingRequest * request) {
if (!ENGINE->toplevel_evaluated()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What case is this branch for?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted this out into a function with a short doc comment explaining the purpose.

builtins/web/fetch/request-response.cpp Outdated Show resolved Hide resolved
@tschneidereit
Copy link
Member Author

Thank you for the review! I pushed a commit addressing your comments. While there's a bunch of code motion in that commit, it's all trivial, so I'll go ahead and land this.

@tschneidereit tschneidereit merged commit 63a0464 into main Jun 21, 2024
1 check passed
@tschneidereit tschneidereit deleted the fetch-rework branch June 21, 2024 11:49
This was referenced Jul 27, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants