-
-
Notifications
You must be signed in to change notification settings - Fork 133
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prevent server hangs on unconsumed streams. #81
Prevent server hangs on unconsumed streams. #81
Conversation
# Conflicts: # changelog.md
The purpose of the `build:prettier` script is cleanup the un-pretty Bablel output in `lib`, so changing the glob to `src` doesn’t make sense. The `lint:prettier` glob includes `**/` so that it searches nested directories.
- Improved comment. - More efficient noop function.
OK, I've added tests for the koa middleware. The problem with doing this in express is that we're going to need to export 2 different middlewares:
|
OK, so the failing tests appear to happen on master using the node v10 and v8. This is good to know, as I was planning to upgrade from v9 shortly! I'm going to look into this as well. |
OK, so it looks like #80 fixes master on node v10, so I'm going to go ahead and pull that one into this branch as well. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm understanding it correctly, this delays resolvers until all of the uploads have been received by the server? I don't think this is the right approach for a few reasons.
It removes the primary benefit of the latest streaming API of apollo-upload-server
, shorter response times due to a reduced waterfall uploading to cloud:
If a user is uploading a lot, with the async process a disconnect partway though would result in a partial success rather than a total failure.
Waiting for the uploads to complete is a DoS risk.
If your service takes large uploads (MB, even GB) then would the RAM get smashed, since it is buffering all of the files?
One of the user stories we would like to ultimately achieve, is to allow validation of arguments in the resolvers for an early error response before all the files have uploaded. Imagine a user fills in a form for an invoice on their mobile, and attaches a 3mb photo of the invoice as one of the form fields. It turns out they had a validation typo in an email address field. Ideally they would get a near instant error payload back, so they don't have to wait 3 minutes for the upload to needlessly complete so they can try again.
The fact that some browsers don't respect the HTTP spec and have a client-side fetch error when there is an early response is not really a major concern for us. Our main concern is that the server has rock solid stability and doesn't crash. Reading through the browser bugs, it appears to be resolved with HTTP/2.
A lot of people have the JWT/auth middleware at the start of the middleware chain, so I don't think we can reasonably prevent other middleware from causing early responses. Best we can do is not crash. Besides, we need to handle the client aborting half way through an upload.
src/middleware.mjs
Outdated
return new Promise(resolve => { | ||
upload.file.stream.on('end', resolve) | ||
upload.file.stream.on('error', resolve) | ||
if (!upload.file.stream.readableFlowing) upload.file.stream.resume() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks tricky, what is it for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to interfere with any streams that are being consumed, but we do want to run through the rest. Using readableFlowing let's us know what state the stream is in.
src/test.mjs
Outdated
@@ -101,6 +102,66 @@ t.test('Single file.', async t => { | |||
}) | |||
}) | |||
|
|||
t.test('Early response.', async t => { | |||
t.jobs = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
t.jobs
is only necessary when there are multiple subtests. Parallelism is default for top level tests, but child tests a sync unless you set t.jobs
> 1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to know! (I'm new to tap.)
src/test.mjs
Outdated
|
||
const data = fs.readFileSync(TEST_FILE_PATH) | ||
|
||
var requestHasFinished = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to use let
. You probably don't even need the = false
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good; not to get side-tracked, but is there a benefit to let
over var
for top-level function-scoped variables, other than stylistic consistency?
src/test.mjs
Outdated
var requestHasFinished = false | ||
const stream = new Readable() | ||
stream.path = TEST_FILE_PATH | ||
stream._read = () => {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is this about? The _
appears to indicate a private method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the old, but still-supported way of implementing streams in node (for Streams 2, which is still the current implementation).
src/test.mjs
Outdated
stream.push(null) | ||
}, 10) | ||
|
||
return promise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer await
to return
here, as it makes it clear the value isn't used back up the chain somehow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
src/test.mjs
Outdated
) | ||
}, | ||
err => { | ||
throw err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this promise block would be more elegant as an await
with no surrounding try/catch, since we throw on error anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally!
package.json
Outdated
@@ -65,9 +65,9 @@ | |||
"build:clean": "rm -r lib; mkdir lib", | |||
"build:mjs": "BABEL_ESM=1 babel src -d lib --keep-file-extension", | |||
"build:js": "babel src -d lib", | |||
"build:prettier": "prettier 'lib/**/*.{mjs,js}' --write", | |||
"build:prettier": "prettier 'src/**/*.{mjs,js}' --write && prettier '*.{json,md}' --write", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please undo all these script changes; for an explanation see 8928ce0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies! Will fix.
src/middleware.mjs
Outdated
@@ -94,7 +94,7 @@ export const processRequest = ( | |||
operationsPath.set(path, map.get(fieldName).promise) | |||
} | |||
|
|||
resolve(operations) | |||
resolve({ operations, map }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a breaking change to the API. A fair few people such as Apollo use processRequest
directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point; I will change how this works.
package.json
Outdated
"lint:eslint": "eslint . --ext mjs,js", | ||
"lint:prettier": "prettier '**/*.{json,md}' -l", | ||
"lint:prettier": "prettier '*.{json,md}' -l", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto, undo.
I've just fixed a bug with the tests in master branch (528216f), might want to rebase. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks SO much for the super thorough (and incredibly prompt) response! I've gone through your review and left comments where appropriate. I'm going to go ahead and make all the suggested changes shortly.
I did want to correct one thing that I wasn't entirely clear about: this suggested change does not delay resolvers. Rather, it makes sure that the incoming request is complete before sending the response. It doesn't interfere with streams that are currently being consumed, but does resume
those that aren't. Nothing is buffered in memory differently than it was, so this doesn't add that kind of DOS vector, although it would potentially allow an attacker to occupy the network for longer.
I completely agree with you that this is something that should be fixed in browsers... but I tested the exact same behavior in Chrome, Firefox, and Safari on mac, and Safari on iOS. At the end of the day, I just needed to be able to send responses back to the client :-/
Your comment about HTTP2 is very encouraging, and it's the next thing I'll try. If that behaves correctly in modern browsers, then I'll gladly replace this PR with a doc change suggesting users use HTTP2 if possible :)
src/middleware.mjs
Outdated
return new Promise(resolve => { | ||
upload.file.stream.on('end', resolve) | ||
upload.file.stream.on('error', resolve) | ||
if (!upload.file.stream.readableFlowing) upload.file.stream.resume() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't want to interfere with any streams that are being consumed, but we do want to run through the rest. Using readableFlowing let's us know what state the stream is in.
src/middleware.mjs
Outdated
@@ -94,7 +94,7 @@ export const processRequest = ( | |||
operationsPath.set(path, map.get(fieldName).promise) | |||
} | |||
|
|||
resolve(operations) | |||
resolve({ operations, map }) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point; I will change how this works.
package.json
Outdated
@@ -65,9 +65,9 @@ | |||
"build:clean": "rm -r lib; mkdir lib", | |||
"build:mjs": "BABEL_ESM=1 babel src -d lib --keep-file-extension", | |||
"build:js": "babel src -d lib", | |||
"build:prettier": "prettier 'lib/**/*.{mjs,js}' --write", | |||
"build:prettier": "prettier 'src/**/*.{mjs,js}' --write && prettier '*.{json,md}' --write", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies! Will fix.
src/test.mjs
Outdated
@@ -101,6 +102,66 @@ t.test('Single file.', async t => { | |||
}) | |||
}) | |||
|
|||
t.test('Early response.', async t => { | |||
t.jobs = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great to know! (I'm new to tap.)
src/test.mjs
Outdated
|
||
const data = fs.readFileSync(TEST_FILE_PATH) | ||
|
||
var requestHasFinished = false |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good; not to get side-tracked, but is there a benefit to let
over var
for top-level function-scoped variables, other than stylistic consistency?
src/test.mjs
Outdated
var requestHasFinished = false | ||
const stream = new Readable() | ||
stream.path = TEST_FILE_PATH | ||
stream._read = () => {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the old, but still-supported way of implementing streams in node (for Streams 2, which is still the current implementation).
src/test.mjs
Outdated
) | ||
}, | ||
err => { | ||
throw err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally!
src/test.mjs
Outdated
stream.push(null) | ||
}, 10) | ||
|
||
return promise |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good!
@mike-marcacci really appreciate your help 👏 I'm on the Apollo slack (and I can see you're on there too), so feel free to chat about any of this stuff, either publicly in the |
OK, well, as an update, I wasn't able to get any success with http2, but my real world use case is complex (reverse proxies, etc) so I thought I should go back to first principles. I created a test to try against the browsers on my laptop: https://gist.github.com/mike-marcacci/5c7f435d51240feb8a871ab348a00dfd And I can't seem to reproduce the issue on http, https or http2, so I've got to keep digging. I'll post back when I find something. |
src/middleware.mjs
Outdated
@@ -22,25 +25,30 @@ class Upload { | |||
this.done = true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is not used anywhere anymore?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's correct! Removed.
I need this 👀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🙌
@@ -38,6 +38,7 @@ | |||
}, | |||
"dependencies": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more @babel/runtime
! Yeah! This will remove 3MB from my production docker containers. Thanks!
HURRAY! @jaydenseric when to expect this to be tagged on npm? |
@terion-name I have a few minor things to sneak in, then publish hopefully within the next day. The new version will likely be a semver major, due to the raised min Node.js version. If you use a recent Node.js version it should be a painless upgrade. |
|
||
const finished = new Promise(resolve => request.on('end', resolve)) | ||
const { send } = response | ||
response.send = (...args) => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this monkey patching the send
method for all following express middleware? This seems pretty cheeky.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ya, this a really ugly thing to do, but it’s the only way I know to “wrap” child middleware in express/connect without requiring them to be aware of the parent middleware... I’ve reluctantly used this pattern in the past and haven’t really run into issues, but if you know of any other way to do this, I’d love to learn!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of the top of my head it might break res.write()
: https://stackoverflow.com/questions/44692048/what-is-the-difference-between-res-send-and-res-write-in-express
But I don't know any existing code bases (out of dozens I overview or maintain) which use res.write()
with express.js.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@koresar you're definitely correct that calling res.write()
could break some assumptions, and also that doing so is a kind of escape hatch from express. In fact, calling ctx.req.write()
would have unintended results in koa as well.
@mike-marcacci do you know what might be causing these errors? https://travis-ci.org/jaydenseric/apollo-upload-server/jobs/402915414#L2165 |
I’m going to assume you’re referring to the “Parse Error” in the logs (Travis line links are apparently messed up on mobile). I don’t know off the top of my head, but I suspect those are a case where we’re testing an aborted request, and we propagate our custom events while busboy emits a parse error. I’m not sure how those would get in the logs, but I assume there’s something special going on with Tap. I’ll look at these tomorrow. |
when are you gonna release this? |
@ForsakenHarmony A few unanticipated issues have popped up. Once we know what is causing the errors mentioned above, and once #89 (comment) is fixed and merged we can publish v6.0.0-alpha.1. Help resolving these blockers is the fastest path to a release, otherwise it is mostly up to @mike-marcacci's availability. If there is no movement this week I will try to take a day or more off from project work to see what more I can do. Earlier versions had a simpler (although less efficient) approach using formidable and temp files. I would love to have more Node.js stream experts come on to the project as collaborators. |
@ForsakenHarmony @jaydenseric you're in luck, I've got some time here today :) we didn't have as great cell coverage in our last locations as we had expected. Stay tuned! |
Published in v6.0.0-alpha.1 🎉 |
seems like you published it as a normal release on npm not alpha/beta, just as a heads up |
Thanks for the heads-up @ForsakenHarmony. You're right: it looks like @jaydenseric will need to update npm's "latest" tag to point back at 5.0.0. |
Thats deliberate actually; v6.0.0-alpha.1 is more stable than v5.0.0 and everyone who can (given the few breaking changes) is advised to update. |
Aah I see. Good strategy!
… On Jul 23, 2018, at 19:42, Jayden Seric ***@***.***> wrote:
Thats deliberate actually; v6.0.0-alpha.1 is more stable than v5.0.0 and everyone who can (given the few breaking changes) is advised to update.
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or mute the thread.
|
While the HTTP spec does support sending a response before the request is finished and this behavior is followed by node http and curl, every browser and at least one popular load balancer (GCP) don't support this.
This means that if the server responds before the request stream has finished, the response body will be dropped:
If, on the other hand, the server waits before for the request to finish (by dumping the incoming data), the client is able to receive the response:
The reason this effects
apollo-upload-client
differently than other upload solutions is twofold:apollo-upload-server
, most other drop-in file upload middleware waits for all uploads to finish before handing off the request to downstream middleware. This makes it simple, as downstream middleware doesn't have to worry about consuming the entire stream.graphql
server, and resolvers are unaware ofUpload
s from other graphql branches (you could hack around this, but it would undo all the ergonomic benefits ofapollo-upload-server
).Here's my (very typical) use case:
Unfortunately, this is a bit challenging to test, as this is a race: if the upload is fully buffered by node http before the response is sent (small file, local network) then everything works correctly; if it isn't (most real-world cases) then it will fail.
So far I've implemented this in the koa middleware (since that's what I'm using), and will add this to express when I get a chance. I'll also add some tests, but I wanted to drop this off here before somebody else descends the rabbit hole trying to diagnose this.