Canceled GCS Multipart upload leaks HTTP2 streams and leads to REFUSED_STREAM errors #1211

Closed
KJTsanaktsidis opened this issue Nov 6, 2018 · 7 comments
Labels: api: storage (Issues related to the Cloud Storage API), status: investigating (The issue is under investigation, which is determined to be non-trivial)

Comments

@KJTsanaktsidis

Client

Google Cloud Storage

Describe Your Environment

The SDK is used in HashiCorp Vault running on Ubuntu 18.04 on GCE.

Expected Behavior

When a context passed to ObjectHandle.NewWriter(ctx) is canceled, all resources associated with the request should be cleaned up, and the cancellation should not affect subsequent operations.

Actual Behavior

I have a minimal reproduction test case here: https://github.com/KJTsanaktsidis/refused_stream_repro

The reproduction tries to upload lots of files to GCS with ObjectHandle.NewWriter(ctx), and cancels some of the contexts at random times. Eventually, after running for a few minutes, all uploads start returning the following error:

Post https://www.googleapis.com/upload/storage/v1/b/kjs_cool_vault_bucket/o?alt=json&prettyPrint=false&projection=full&uploadType=multipart: stream error: stream ID 45457; REFUSED_STREAM

I used a packet capture to debug the communication between Vault and GCS, and I found that the SDK would often:

  • Create a new HTTP2 stream to perform the multipart upload
  • Send a HEADERS frame for POST /upload/storage/v1/b/vaultgcs_backend_us1_staging/o?alt=json&prettyPrint=false&projection=full&uploadType=multipart
  • Never send a subsequent DATA frame for the request body, and
  • Never send a subsequent RST_STREAM frame to clean up the stream

So eventually the GCS server starts sending REFUSED_STREAM errors back to the client on every new upload attempt, because the number of abandoned upload streams has exceeded the server's HTTP2 SETTINGS_MAX_CONCURRENT_STREAMS value of 100.
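
The accounting behind this failure can be illustrated with a toy model. This is a sketch under stated assumptions, not GCS's or Go's actual implementation: streamTable, openStream, closeStream, and simulate are hypothetical names, and the limit of 100 is the value observed above. A stream occupies a slot from its HEADERS frame until it is ended or reset; if an abandoned stream never receives RST_STREAM, its slot is never freed.

```go
package main

import (
	"errors"
	"fmt"
)

var errRefusedStream = errors.New("REFUSED_STREAM")

// streamTable is a toy model of a server's per-connection stream accounting.
type streamTable struct {
	maxConcurrent int
	open          map[uint32]bool
	nextID        uint32
}

func newStreamTable(maxConcurrent int) *streamTable {
	return &streamTable{maxConcurrent: maxConcurrent, open: map[uint32]bool{}, nextID: 1}
}

// openStream models receipt of a HEADERS frame for a new stream.
func (t *streamTable) openStream() (uint32, error) {
	id := t.nextID
	t.nextID += 2 // client-initiated streams use odd IDs
	if len(t.open) >= t.maxConcurrent {
		return id, errRefusedStream
	}
	t.open[id] = true
	return id, nil
}

// closeStream models a stream ending normally or via RST_STREAM.
func (t *streamTable) closeStream(id uint32) {
	delete(t.open, id)
}

// simulate runs sequential uploads in which every leakEvery-th upload is
// abandoned. With sendReset false, abandoned streams are never closed.
// It returns the 1-based index of the first refused upload, or -1 if none.
func simulate(uploads, leakEvery int, sendReset bool) int {
	t := newStreamTable(100)
	for i := 1; i <= uploads; i++ {
		id, err := t.openStream()
		if err != nil {
			return i
		}
		abandoned := i%leakEvery == 0
		if abandoned && !sendReset {
			continue // leaked: the slot stays occupied forever
		}
		t.closeStream(id)
	}
	return -1
}

func main() {
	fmt.Println("first refusal without RST_STREAM:", simulate(1000, 5, false))
	fmt.Println("first refusal with RST_STREAM:", simulate(1000, 5, true))
}
```

In this model the connection keeps working indefinitely as long as every abandoned stream is reset, and fails permanently once 100 slots have leaked.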

This issue seems to be the cause of hashicorp/vault#5419

@enocom enocom added api: storage Issues related to the Cloud Storage API. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Nov 6, 2018
@enocom
Member

enocom commented Nov 6, 2018

This is likely related to #753, as well.

@enocom enocom added status: investigating The issue is under investigation, which is determined to be non-trivial. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Nov 6, 2018
@enocom enocom self-assigned this Nov 6, 2018
@enocom
Member

enocom commented Nov 6, 2018

Another related issue: golang/go#20985.

@enocom
Member

enocom commented Nov 7, 2018

Thanks for the reproduction @KJTsanaktsidis. It's been very helpful.

Meanwhile, here's another related issue: golang/go#27208.

@KJTsanaktsidis
Author

Yup - golang/go#27208 is exactly it!

I tried my reproduction using golang.org/x/net built from the PR linked in that issue, golang/net#18 (see the working_version branch in my repo). This seems to have fixed the issue: I no longer got REFUSED_STREAM errors!

So I think this isn't a bug in google-cloud-go at all, and it should go away with Go 1.12?

@enocom
Member

enocom commented Nov 7, 2018

Well done!

Provided the patch lands in Go 1.12, yes, the problem should go away. I'll ping the Go issue.

@KJTsanaktsidis
Author

Awesome - thanks heaps for your help in joining these dots!

@enocom
Member

enocom commented Nov 7, 2018

Since this is an issue outside this repo, I'm going to close it.

Thanks again for all your help getting to the bottom of this.


2 participants