Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Streaming of request and response payloads #695

Closed
sbeckmann opened this issue Nov 23, 2018 · 23 comments · Fixed by #1824
Closed

Streaming of request and response payloads #695

sbeckmann opened this issue Nov 23, 2018 · 23 comments · Fixed by #1824
Assignees
Labels
feature A new feature high High priority merged Issue has been merged to dev and is waiting for the next release MVP Minimum Viable Product Nov'23 November 2023 release

Comments

@sbeckmann
Copy link

Hi,
we tested ocelot in front of a dotnet core API for uploading files. The setup is WebApp (Razor Page) -> Ocelot -> API, each part running in its own linux docker container on Windows. Apparently, uploaded files are buffered entirely in memory in Ocelot before they are passed on to the backend. I recon responses are treated the same way, although we didn't test that yet. I couldn't find any configuration setting to enable request or response streaming.

Is there a setting i missed or might we be using Ocelot in a wrong way? If not, are there any plans to add this feature?

Thanks for any advice or help :)

@philproctor philproctor added feature A new feature help wanted Not actively being worked on. If you plan to contribute, please drop a note. needs validation Issue has not been replicated or verified yet labels Jan 10, 2019
@Lybr4
Copy link

Lybr4 commented Apr 30, 2019

We are having the same issue with our API providing access to both uploading and downloading files. The request/response is buffered entirely into memory, causing heavy issues in our cloud environment with limited resources.

@ducle83
Copy link

ducle83 commented Jul 14, 2019

Our team having the same issue too, when trying to upload large files (e.g.: video).

@dreamfalcon
Copy link

Same problem here with content_type="text/event-stream".
The gateway only returns the full response at the end.

@TomPallister
Copy link
Member

Yeh I agree this needs to be changed and has been annoying me ever since I made Ocelot!

The problem is that Ocelot expects data to be there to do transformations etc anyone got any ideas how we get around this e.g. certain use cases use a stream, and only read it if you need to mess with the data?

@kvskranthikumar
Copy link

Today we hit this issue, any update on how will this be handled.

Uploading a huge file (2GB +) to an API through Gateway throws an exception as
exception: System.IO.IOException: Stream was too long.

@TomPallister
Copy link
Member

Yes we should have a way of doing a stream if no transforms are required. If someone submits a PR for this we can accept it.

@zaheershk
Copy link

@TomPallister when you say "if no transforms are required", could you guide me as to where the "transforms" are being handled within Ocelot? I'd like to submit a PR for this fix.

@TomPallister
Copy link
Member

@zaheershk

Ocelot does a bunch of processing on the incoming HTTP request and HTTP response from the server such as working with headers, changing the path & content. All of this is optional based on configuration. Ocelot just assumes this functionality is being used so always reads the HTTP request / response stream into memory even if it isn't doing anything interesting with it. If the user doesn't want these features than it should be possible to not read the stream into memory.

I think this might actually be a much more complicated refactor than I initially thought after you have asked me that question. I would need to look into this more to understand if it is possible with the current architecture and how you would do it with dotnet core.

I guess its possible to stream the incoming HTTP request straight out to the downstream server and stream that response straight back to the client. However I don't know what the code needs to look like. Once this is worked out I think there are many places in the code that would need to be refactored you might even say it needs an architecture refactor :)

@abriening
Copy link

I could only find two places that read the request body.

In the RequestMapper:

var content = new ByteArrayContent(await ToByteArray(request.Body));

Which looks like a fix for #464. I have a fix for that in this unrelated Issue/MR #1432

And in the CacheKeyGenerator:

string requestContentString = Task.Run(async () => await downstreamRequest.Content.ReadAsStringAsync()).Result;

The caching would only be enabled with UseOutputCacheMiddleware. That code has a blocking call that could be refactored to async/await, and the MD5 code could be refactored to use an output buffer instead of reading the stream all at once. I probably wouldn't cache anything other than GET and HEAD anyway, but that's another issue.

Looks like response handling uses CopyToAsync which should be performant.

@meliky
Copy link

meliky commented Apr 27, 2021

Is there any progress on this issue? It has been a long time. Streaming data is crucial part of our project. Is there at least a work around for this problem?

@sarvasana
Copy link

sarvasana commented Apr 30, 2021

Is there any progress on this issue? It has been a long time. Streaming data is crucial part of our project. Is there at least a work around for this problem?

I think you have to build it yourself and submit a PR.
The issue was raised in 2018.
Tagged with (help wanted) in 2019.
We are now in 2021.

@meliky
Copy link

meliky commented May 10, 2021

I didn't pay attention to dates and tags. I am sorry. I have made two changes to solve our problem. I don't think it is a generic solution so I haven't submited PR.
Changes are:
1.
#

// Never change this to StreamContent again, I forgot it doesnt work in #464.
var content = new ByteArrayContent(await ToByteArray(request.Body));

changes to

            HttpContent content = request.ContentType == "application/octet-stream"
                ? new StreamContent(request.Body)
                : new ByteArrayContent(await ToByteArray(request.Body));

I know that StreamContent has a problem with #464 therefore, I used only when content type is "application/octet-stream".

public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken = default)
{
return Client.SendAsync(request, cancellationToken);
}

changes to

        public Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken = default)
        {
            return Client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead, cancellationToken);
        }

With HttpCompletionOption.ResponseHeadersRead option , HttpClient returns response without reading response body.

@abriening
Copy link

I've found that even with updates to remove full in-memory reads of the upstream response, with very large requests (like file downloads) the memory balloons up to 5x in some quick testing. So a 20Mb response adds 100Mb at runtime. I've had much better luck with https://github.com/microsoft/reverse-proxy. It does not use the HttpClient directly, uses HttpMessageInvoker with an auto-flushing stream copy that keeps memory low, even for very large responses.

@mahgo
Copy link

mahgo commented May 13, 2022

Is changing the Content in RequestMapper.cs from var content = new ByteArrayContent(await ToByteArray(request.Body)); to var content = = new StreamContent(request.Body); still an issue as per #490?

I've tried this to my route after changing the above but it still works without issue:

"AuthenticationOptions": {
  "AuthenticationProviderKey": "Bearer",
  "AllowedScopes": []
}

@CobusKruger
Copy link

@TomPallister do you see any obvious issues with the solution proposed by @meliky? I thought of just making a copy of RequestMapper, making his change, and registering it with AddSingleton, but the HttpClientWrapper change can't be done that way.

@CobusKruger
Copy link

@meliky Have you had to do anything else to get it working on your end? For me, the connection seems to close automatically a few seconds after, and the streaming stops.

@raman-m raman-m added large effort Likely over a week of development effort medium effort Likely a few days of development effort MVP Minimum Viable Product labels Aug 9, 2023
@ThreeMammals ThreeMammals deleted a comment from kvskranthikumar Nov 9, 2023
@ThreeMammals ThreeMammals deleted a comment from evgslyusar Nov 9, 2023
@raman-m raman-m added the high High priority label Nov 9, 2023
@akinihsan
Copy link

akinihsan commented Jan 2, 2024

@raman-m @CobusKruger
any update on this issue?
We are facing memory problems while uploading large files through ocelot.

@raman-m
Copy link
Member

raman-m commented Jan 2, 2024

@abriening commented on Feb 8, 2021
@meliky commented on May 10, 2021

Hi @ggnaegi !
Seems we 've fixed content streaming issues by #1724, right?
Could you check the user scenarios and suggestions above from the community plz? I hope we've fixed and included these user cases in our current release by #1724... If Not, we need to brainstorm possible solutions right in this release, because streaming issue must work for all content types. Please, double check content types from the contributor's user cases! Thank you!

After your design review of this issue, we will decide what to do next: attach this to #1724 and close this issue, or keep it open and prepare some quick PR with a fix...

@raman-m raman-m self-assigned this Jan 2, 2024
@raman-m raman-m added this to the Nov-December'23 milestone Jan 2, 2024
@raman-m raman-m added the Nov'23 November 2023 release label Jan 2, 2024
@raman-m
Copy link
Member

raman-m commented Jan 2, 2024

Sorry, guys! I've totally lost the thing... As a team, we will double check this issue in current release once again.
But I hope the issue has been already fixed by PR #1724 but it's not released.
I recommend you to checkout develop branch and compile the solution making the DLLs manually, and then attach to your projects as a DLL reference removing Ocelot NuGet pack reference.
The official release will take place in a week...


@CobusKruger commented on Oct 6, 2022

Tom lives in a "far-far away" Kingdom! 🤣 I'm Tom's representative. 😉


@akinihsan commented on Jan 2

Updates are above! ☝️ Please make Ocelot DLLs temporarily compiling develop branch code, and let us know your test results please!

@raman-m
Copy link
Member

raman-m commented Jan 2, 2024

@sbeckmann commented on Nov 23, 2018

Dear Simon,
As an author, Are you still with Ocelot?

@raman-m
Copy link
Member

raman-m commented Jan 4, 2024

@RaynaldM Welcome to official testing stage!
I would love if you attach logs and K8s graphs of memory consuming from your production env when pumping large files.
I believe you need just deploy current develop version...
Thank you!

P.S.

Technically we could develop manual tests in Ocelot.ManualTest project, run console session and pump large files. But you know I'm overloaded with current release activilities & tasks... I will think more, should we update manual tests or not...

@raman-m raman-m linked a pull request Jan 4, 2024 that will close this issue
@raman-m raman-m added 2023 Annual 2023 release and removed Nov'23 November 2023 release labels Jan 4, 2024
@raman-m raman-m modified the milestones: Nov-December'23, Annual 2023 Jan 4, 2024
@raman-m raman-m modified the milestones: Annual 2023, Nov-December'23 Jan 17, 2024
@raman-m raman-m added Nov'23 November 2023 release and removed 2023 Annual 2023 release help wanted Not actively being worked on. If you plan to contribute, please drop a note. large effort Likely over a week of development effort medium effort Likely a few days of development effort needs validation Issue has not been replicated or verified yet labels Jan 17, 2024
raman-m added a commit that referenced this issue Jan 18, 2024
* first version

* first working version of the new http client pool

* Some cleanup, removing classes that aren't used

* some more cleanup

* forgot passing PooledConnectionLifetime

* adding todo for connection pool and request timeouts

* some code cleanup

* ongoing process refactoring tests

* a little mistake with big effects

* Several refactorings, disposing http response message to ensure that the connection is freed. Moving errors from Polly provider to Errors\QoS.

* providing some comments

* adding response body benchmark

* some minor changes in MessageInvokerPool.

* using context.RequestAborted in responder middleware (copying the response body from downstream service to the http context)

* Fix style warnings

* code review

* moving response.Content.ReadAsStreamAsync nearer to CopyToAsync with using.
Making sure, that the content stream is disposed

* HttpResponse.Content never returns null (from .net 5 onwards)

* adding more unit tests (validating, log warning if passthrough certificate, cookies are returned, message invoker timeout)

* adding a tolerance margin

* adding new acceptance test, checking memory usage. Needs to be compared with current implementation first.

* Review tests by @raman-m

* Update src/Ocelot/Configuration/HttpHandlerOptions.cs

* Update src/Ocelot/Middleware/DownstreamResponse.cs

* Update src/Ocelot/Requester/MessageInvokerPool.cs

* Update src/Ocelot/Requester/HttpExceptionToErrorMapper.cs

* Update src/Ocelot/Responder/HttpContextResponder.cs

* Update src/Ocelot/Middleware/DownstreamResponse.cs

* Use null-coalescing operator for `Nullable<int>` obj

* some modifications in the acceptance test, adding a tolerance of 1 Mb

* finalizing content tests. Making sure, that the downstream response body is not copied by the api gateway.

* adapting tolerances

* Disable StyleCop rule SA1010 which is in conflict with collection initialization block vs whitespace.
More: https://github.com/DotNetAnalyzers/StyleCopAnalyzers/blob/master/documentation/SA1010.md

* Refactor Content tests

---------

Co-authored-by: Raman Maksimchuk <[email protected]>
@raman-m raman-m added the merged Issue has been merged to dev and is waiting for the next release label Jan 18, 2024
@IamMartinPeek
Copy link

@raman-m
I integrated the new release today. It works like a charm. Thanks so much for your work! Finally low memory in gateways!

@raman-m
Copy link
Member

raman-m commented Feb 16, 2024

@IamMartinPeek Actually say Thanks to @ggnaegi who is the author of new design of system core. 😉 He is Top 1 Contributor of the current 23rd release.

But what's your project? I didn't find something related to Ocelot in your repos... Private project?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature A new feature high High priority merged Issue has been merged to dev and is waiting for the next release MVP Minimum Viable Product Nov'23 November 2023 release
Projects
None yet
Development

Successfully merging a pull request may close this issue.