Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC for creating TCP Reset toxic #247

Closed
wants to merge 3 commits into from

Conversation

chaosbox
Copy link
Contributor

This PR is an RFC for creating a new toxic to simulate TCP RESET (Connection reset by peer) on the connections by closing the stub Input immediately or after a timeout.

Currently, this toxic sends RST when we call the close(), as a result the unacked data is discarded by the OS. The behaviour of Close is set to discard any unsent/unacknowledged data by setting SetLinger to 0, ~= sets TCP RST flag and resets the connection. The stub output stream is dropped, since if we start sending any data and then close the connection, TCP treats this as a graceful connection close by emitting FIN-ACK. Added tests to check for syscall.ECONNRESET on the TCP Read.

@jpittis @xthexder Let me know your thoughts on this.

./toxiproxy-cli create reset_example -l 0.0.0.0:8989 -u chaosbox.io:80 && ./toxiproxy-cli toxic add reset_example -n resetTCP -t reset_peer -d -a timeout=2000

➜  toxiproxy git:(RFC-reset-conn-toxic) ✗ curl -i localhost:8989
curl: (56) Recv failure: Connection reset by peer

rfc - tcp conn reset

PS: If we want to simulate TCP RST while sending payload, think its possible when we do the toxic in the IP layer by creating raw packets, which will give us more control on the TCP flags.

@ghost ghost added the cla-needed label Feb 24, 2019
@chaosbox
Copy link
Contributor Author

Tests seem to pass in local, (go version go1.11.4 darwin/amd64)

=== RUN   TestResetToxicNoTimeout
--- PASS: TestResetToxicNoTimeout (0.00s)
=== RUN   TestResetToxicWithTimeout
--- PASS: TestResetToxicWithTimeout (0.11s)
    latency_test.go:25: [Reset after timeout] Time was correct: 105.448377ms (expected 100ms)
=== RUN   TestResetToxicWithTimeoutDownstream
--- PASS: TestResetToxicWithTimeoutDownstream (0.11s)
    latency_test.go:25: [Reset after timeout] Time was correct: 104.560697ms (expected 100ms)
=== RUN   TestSlicerToxic

Looked at TravisCI, tests seem to have passed with go1.9.7 and failed with go1.11.5. Will look into it later on.

@chaosbox
Copy link
Contributor Author

Earlier test was flaky, when the add toxic method was invoked one out of 3 tests failed in a random fashion. Drilling down, found that whenever the test failed, the update toxic was registered after the link was created, which means that the "reset_peer" toxic wasnt added to the link. Since it wasnt part of the link the test which expects "connection reset by peer" was failing.

Refactored the tests and wrote a basic test to check for both upstream and downstream toxics returning "connection reset by peer".

Copy link
Contributor

@jpittis jpittis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simulating resets like this is really cool and I think it would be a great addition to Toxiproxy.

I have a few comments regarding implementation:

  • I don't like that we're referencing the ResetToxic from the Toxiproxy core code. (Referencing StatefulToxic is alright because it's meant to be a library toxic that other toxics can be built on top of.)

  • The ResetToxic is really not that different from the pre-existing TimeoutToxic.

I'm wondering if we could broaden the ToxicStub API to expose SetLinger to all toxics and then add a Reset bool field to the existing TimeoutToxic implementation. The real question is how to do this broadening...

Just brainstorming here:

  • Could stub.Close() take a parameter to trigger a reset?
  • Could toxics provide a callback which they can use to configure socket options?

@bbigras
Copy link

bbigras commented Jun 6, 2020

Any progress on this?

@StevenACoffman
Copy link

This is great work! I hope that you can get some traction from a reviewer soon!

@DogukanZengin
Copy link

Hello,

Any eta on this feature?

@StevenACoffman
Copy link

@sirupsen I noticed your recent maintenance of this repository, and would like to bring this PR to your attention, as we would really appreciate this addition.

@detinho
Copy link

detinho commented May 12, 2021

Is there anything blocking this to be merged? Anything we can do to help?

@dhartford
Copy link

dhartford commented Aug 30, 2021

This is great start! is there a way to enhance this kind of toxic to configure to occur after a certain amount of bytes are returned (response pathway)? that way, it would always trigger if the response is larger than the configured size.

Usecase 1: set reset connection after 10 bytes to reset it right away.
Usecase 2: set reset connection after 300 bytes (or similar example specific to your scenario) to only reset on 'larger' loads, or on loads that aren't just simple heartbeats.
(or usecase 3: set a size to simulate other proxies/loadbalancers that might also reset larger payloads)

Getting a good, repeatable connection reset toxic in place would be extremely helpful!!

@StevenACoffman
Copy link

Given that this PR (which is great as-is) has languished without maintainer review since 2019, it's unlikely that the original author will add extra bells and whistles without this getting upstreamed first.

@miry miry self-assigned this Sep 17, 2021
@miry miry mentioned this pull request Oct 8, 2021
@miry miry added this to the 2.2.0 milestone Oct 8, 2021
@miry
Copy link
Contributor

miry commented Oct 16, 2021

@chaosbox I would like merge this PR. Would be ok, if I modify your code and rebase against current master?

@miry miry mentioned this pull request Oct 16, 2021
@miry
Copy link
Contributor

miry commented Oct 16, 2021

@chaosbox I created a PR with your commits included in it. Let me know if it is ok?

@StevenACoffman
Copy link

@miry I'm sure it's fine to merge that as is. Please merge it without waiting and if @chaosbox wants any follow-up, that can be a new PR.

@miry
Copy link
Contributor

miry commented Oct 17, 2021

Released in https://github.com/Shopify/toxiproxy/releases/tag/v2.2.0

@miry miry closed this Oct 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants