Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

http-conduit: Store the cleanup actions of responses in ResourceT #539

Merged
merged 1 commit into from
Aug 26, 2024

Conversation

JackKelly-Bellroy
Copy link
Contributor

Because ResourceT needs to hold onto the Response so it can perform cleanup when leaving runResourceT, move the cleanup action from the Response itself and register it with ResourceT. Then all code paths which trigger cleanup (leaving ResourceT, consuming the body, calling responseClose) will do so by releasing the Response from the ReleaseMap, preventing a space leak.

Fixes #537

@JackKelly-Bellroy
Copy link
Contributor Author

JackKelly-Bellroy commented Jul 5, 2024

We're testing this for now, but will mark it as ready once we're sure it works. Opinions and comments welcome.

Because `ResourceT` needs to hold onto the `Response` so it can
perform cleanup when leaving `runResourceT`, move the cleanup action
from the `Response` itself and register it with `ResourceT`. Then all
code paths which trigger cleanup (leaving `ResourceT`, consuming the
body, calling `responseClose`) will do so by releasing the `Response`
from the `ReleaseMap`, preventing a space leak.
@JackKelly-Bellroy JackKelly-Bellroy force-pushed the http-cleanup-action-space-leak branch from d7e429d to dc2051a Compare July 5, 2024 03:09
@kokobd
Copy link

kokobd commented Jul 9, 2024

Here is the test program.

{-# LANGUAGE OverloadedLabels #-}
{-# LANGUAGE OverloadedStrings #-}

module Main where

import Amazonka qualified
import Amazonka.S3 qualified as S3
import Control.Monad (void)
import Control.Monad.Random.Class
import Control.Monad.Trans.Class (lift)
import Data.Generics.Labels ()
import Data.Text qualified as T
import Data.Traversable (for)

main :: IO ()
main = do
  env <- Amazonka.newEnv Amazonka.discover
  void . Amazonka.runResourceT $
    for (replicate 1000 ()) $ \_ -> do
      randomStr <- fmap (T.pack . take 10000000) . lift $ getRandomRs ('a', 'z')
      lift $ putStrLn "Sending 10MB of data"
      void . Amazonka.send env $
        S3.newPutObject "bellroy-eventlog-test" "test" (Amazonka.toBody randomStr)

Test command: AWS_PROFILE=xxx AWS_REGION=us-east-2 cabal run --enable-profiling exe:amazonka-profiling -- +RTS -l -hT

Below are eventlog htmls before and after this PR:
image
image

@JackKelly-Bellroy JackKelly-Bellroy marked this pull request as ready for review August 26, 2024 03:57
@JackKelly-Bellroy
Copy link
Contributor Author

This definitely improves things, so I'm un-marking it as draft.

@snoyberg
Copy link
Owner

I'm not following the change here. The existing allocate call should be doing exactly the same registration with ResourceT. Is it possible that this is a subtle laziness bug that the rewrite is fixing? I'm hesitant to merge this as-is given that I don't understand why it helps.

@JackKelly-Bellroy
Copy link
Contributor Author

JackKelly-Bellroy commented Aug 26, 2024

In http-conduit as it currently stands, calling responseClose on a Response will close the connection, but you leak space because ResourceT needs to hold onto it until execution leaves the enclosing runResourseT. This comes up in e.g., a bunch of amazonka code, where it optimistically closes connections that don't return streaming responses, but you get space leaks until execution leaves the ResourceT.

This PR changes that by replacing the responseClose' field on the connection, so if someone calls responseClose, it asks ResourceT to release the connection. This means closed Responses can be GC'd before execution has left the enclosing ResourceT.

We can't use allocate (Client.responseOpen req man) Client.responseClose because we don't want Client.responseClose to be the registered cleanup action — that would cause an infinite loop. We want to register the cleanup action that's initially in the Response with ResourceT. Hence:

  1. Client.responseOpen to create the Response
  2. Extract the cleanup action from the Response and register it with ResourceT.
  3. Replace the cleanup action in the Response with one that releases the registered action.
  4. (As in current master) Release the registered action if the response conduit is completely consumed.

@snoyberg
Copy link
Owner

Ah, got it, thanks for the explanation. Makes sense, and this looks good! Merging.

@snoyberg snoyberg merged commit ec7abba into snoyberg:master Aug 26, 2024
snoyberg added a commit that referenced this pull request Aug 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

http-conduit: Space leaks when using wide ResourceT
3 participants