nettest: Add helpers to avoid fixed host:ports in tests #1180

prashantv · 2017-07-04T10:33:07Z

Some tests cannot use port 0; in those tests, we should get a free
host:port and use that where possible.

Some tests cannot use port 0; in those tests, we should get a free host:port and use that where possible.

codecov · 2017-07-04T10:39:31Z

Codecov Report

Merging #1180 into dev will increase coverage by 0.01%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##              dev    #1180      +/-   ##
==========================================
+ Coverage   82.64%   82.66%   +0.01%     
==========================================
  Files         222      222              
  Lines       10422    10422              
==========================================
+ Hits         8613     8615       +2     
+ Misses       1460     1458       -2     
  Partials      349      349

Flag	Coverage Δ
#experimental	`76.76% <ø> (ø)`	⬆️
#main	`83.88% <ø> (+0.02%)`	⬆️

Impacted Files	Coverage Δ
transport/tchannel/peer.go	`95.38% <0%> (+3.07%)`	⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe9ffa5...af752d9. Read the comment docs.

codecov · 2017-07-04T10:39:35Z

Codecov Report

Merging #1180 into dev will decrease coverage by 0.02%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##              dev    #1180      +/-   ##
==========================================
- Coverage   82.64%   82.61%   -0.03%     
==========================================
  Files         222      222              
  Lines       10422    10422              
==========================================
- Hits         8613     8610       -3     
- Misses       1460     1462       +2     
- Partials      349      350       +1

Flag	Coverage Δ
#experimental	`76.76% <ø> (ø)`	⬆️
#main	`83.82% <ø> (-0.04%)`	⬇️

Impacted Files	Coverage Δ
transport/tchannel/peer.go	`89.23% <0%> (-3.08%)`	⬇️
transport/http/peer.go	`90.32% <0%> (-1.62%)`	⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe9ffa5...af752d9. Read the comment docs.

bufdev · 2017-07-04T10:40:02Z

internal/testutils/nettest/free_port.go

+
+// MustGetFreeHostPort returns a TCP host:port that is free for unit tests
+// that cannot use port 0.
+func MustGetFreeHostPort() string {


@abg and @breerly rejected me exposing this before, but Id want to do it again. However, this doesn't fully fix the problem - you need to retain the listener connection, because when you close, someone else can come in and use that port (i.e. there's a race condition here).

Going further, we need a way to do this across processes, i.e. lock down a port to use in two separate processes (for examples testing). I'd use a lock file but as I said this whole thing was shot down.

Yeah unfortunately there is always a small race between closing the port and it being reused, I don't think there's any way to avoid that race though. I've used something similar in TChannel without causing any flakiness.

On Linux, I found the OS recycles ports used less recently, avoiding the race condition. It only becomes an issue when almost all the ports are used and there's not many free ports to recycle between.

Either way, it's a strictly better option than a hardcoded host:port.

Across processes is much trickier -- we almost want the coordinator to assign ports at that point.

@abhinav @breerly: Given this would be an internal test-only utility, and it's better than the alternative, thoughts on keeping this?

Note previous work:

#866
#897

Prashant and I have been talking offline, I'd like to fix this problem, 866 has a partial solution to fix the race too I think.

bufdev · 2017-07-04T10:43:09Z

internal/testutils/nettest/free_port.go

+func MustGetFreeHostPort() string {
+	addr, err := getClosedTCPAddr()
+	if err != nil {
+		panic(err)


Also note I don't like using "Must" and panic here - there's no need for us to panic, we can handle the error in the caller, I don't want this style to be encouraged more in our codebase

Happy to do GetFreeHostPort(t *testing.T) instead and do require.NoError here.

I don't think we should return an error though, since it adds noise to tests where it's used with very little value.

I think GetFreePort isn't in theory just a test function, ie it's a util function that happens to be used in testing, i think it's normal and typical in to do:

port, err := GetFreePort() require.NoError(t, err)

There's a couple of reasons why I believe this is a test-only function:

In production, we should use port 0 instead of getting a free port

There is no race-free way to do this, hence why it definitely shouldn't be used in production.

Given that it's test only, I don't think this belongs in internal/net but in a nettest package. Similarly, since it's only intended for tests, I think taking in a *testing.T and using require.NoError will reduce noise in tests using this helper.

Ya I'm fine with this. The race condition is not OK IMO, #866 had a solution. The bigger solution is to have all inbounds take a net.Listener, which #897 was trying to address.

prashantv · 2017-07-04T10:55:46Z

Also, see previous discussion in #866

Some comments I'd like to add:

We should avoid using this method, tests should ideally use port 0, but not all tests can use that (e.g., the config tests in Update grpc config test ports to use nettest #1181). For those tests, this seems like a strictly better option than using fixed hostports.
There is a race here where the returned address may already be used. However, when we lose the race, we end up not being able to listen on the port and the test will fail, similar to what will happen right now if the port is already in use. The advantage is that restarting the test will usually fix the issue, whereas the fixed port test may require a code change to choose a different high port
The race happens very rarely in practice. TChannel tests have done this for a couple of years without it causing flakiness

cc @kriskowal @breerly

bufdev · 2017-07-05T16:08:33Z

I don't like the race at all, it's just proverbially kicking the can, the other problem is I wanted this to be able to produce "locked" ports across processes for our examples testing. I addressed this in a previous PR #866, but it was rejected, we should have a wider conversation about this. Also see #897.

prashantv · 2017-07-06T06:21:13Z

@peter-edge Do you have a solution for the race? If we don't have an alternate solution, then I think we don't really have any options, especially given that experience with tchannel has shown that the race is not an issue. Absolute worst case, the race will lead to returning a host:port that's closed, which ends up with the exact same behaviour as we have right now -- we'll return an error.

Ping @kriskowal

bufdev · 2017-07-06T11:12:02Z

Yes, see 866 and 897, lol. 866 solves it with a lock file so I could do this across processes (you could also do an in-memory map for just one process). 897 solves it by using listeners for all inbounds. If you're interested in solving this issue for yarpc-go, you might want to start with my work there since I got stuck on a tchannel bug (which you might be able to fix). This PR is a partial solution, those PRs are where to start.

…

On Thu, Jul 6, 2017 at 2:21 AM Prashant Varanasi ***@***.***> wrote: @peter-edge <https://github.com/peter-edge> Do you have a solution for the race? If we don't have an alternate solution, then I think we don't really have any options, especially given that experience with tchannel has shown that the race is not an issue. Absolute worst case, the race will lead to returning a host:port that's closed, which ends up with the *exact same behaviour* as we have right now -- we'll return an error. Ping @kriskowal <https://github.com/kriskowal> — You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#1180 (comment)>, or mute the thread <https://github.com/notifications/unsubscribe-auth/AECGvC_J1u6RhwCGxLEMsLzb5dOAvaA4ks5sLHzagaJpZM4ONLn1> .

bufdev · 2017-07-06T11:16:34Z

I think the tl;dr here is if we're open to merging this now, then merge 866, or a version of 866 that does an in memory map for locking - it's trivial to do that. If you want to solve this for yarpc (note this isn't even an issue causing flaky tests right now, but I also want to see it solved), then continue my work on 897. I also again register my distaste for adding additional Must* functions, I think it sets s bad example, and uses panic where it's not intended to be used.

…

On Thu, Jul 6, 2017 at 7:11 AM Peter Edge ***@***.***> wrote: Yes, see 866 and 897, lol. 866 solves it with a lock file so I could do this across processes (you could also do an in-memory map for just one process). 897 solves it by using listeners for all inbounds. If you're interested in solving this issue for yarpc-go, you might want to start with my work there since I got stuck on a tchannel bug (which you might be able to fix). This PR is a partial solution, those PRs are where to start. On Thu, Jul 6, 2017 at 2:21 AM Prashant Varanasi ***@***.***> wrote: > @peter-edge <https://github.com/peter-edge> Do you have a solution for > the race? If we don't have an alternate solution, then I think we don't > really have any options, especially given that experience with tchannel has > shown that the race is not an issue. Absolute worst case, the race will > lead to returning a host:port that's closed, which ends up with the *exact > same behaviour* as we have right now -- we'll return an error. > > Ping @kriskowal <https://github.com/kriskowal> > > — > You are receiving this because you were mentioned. > > > Reply to this email directly, view it on GitHub > <#1180 (comment)>, or mute > the thread > <https://github.com/notifications/unsubscribe-auth/AECGvC_J1u6RhwCGxLEMsLzb5dOAvaA4ks5sLHzagaJpZM4ONLn1> > . >

abhinav · 2017-07-06T18:17:19Z

Did you mean to check in debug.test?

bufdev · 2017-07-11T10:08:25Z

@prashantv I'd like to close this for now along with #1181 and #1097, I think we have a bit of a disagreement on the path forward but good news: 1. you're on vacation, act like it! 2. I'll be in SF in three weeks so we can hash it out then and get a solution in. I'm working towards getting us to inbox zero both both PRs and Issues - note we will solve this.

bufdev · 2017-09-14T15:43:58Z

Closing for now, feel free to reopen if we want to continue on this

nettest: Add helpers to avoid fixed host:ports in tests

06c8943

Some tests cannot use port 0; in those tests, we should get a free host:port and use that where possible.

prashantv requested review from abhinav and bufdev July 4, 2017 10:33

prashantv mentioned this pull request Jul 4, 2017

Update grpc config test ports to be more random #1177

Merged

Add license + package doc

af752d9

bufdev suggested changes Jul 4, 2017

View reviewed changes

bufdev reviewed Jul 4, 2017

View reviewed changes

bufdev closed this Sep 14, 2017

abhinav deleted the add_nettest branch January 15, 2019 17:41

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nettest: Add helpers to avoid fixed host:ports in tests #1180

nettest: Add helpers to avoid fixed host:ports in tests #1180

prashantv commented Jul 4, 2017

codecov bot commented Jul 4, 2017

codecov bot commented Jul 4, 2017 •

edited

Loading

bufdev Jul 4, 2017

prashantv Jul 4, 2017

bufdev Jul 4, 2017

bufdev Jul 4, 2017

prashantv Jul 4, 2017

bufdev Jul 4, 2017

prashantv Jul 4, 2017

bufdev Jul 5, 2017

prashantv commented Jul 4, 2017

bufdev commented Jul 5, 2017

prashantv commented Jul 6, 2017

bufdev commented Jul 6, 2017 via email

bufdev commented Jul 6, 2017 via email

abhinav commented Jul 6, 2017

bufdev commented Jul 11, 2017

bufdev commented Sep 14, 2017

nettest: Add helpers to avoid fixed host:ports in tests #1180

nettest: Add helpers to avoid fixed host:ports in tests #1180

Conversation

prashantv commented Jul 4, 2017

codecov bot commented Jul 4, 2017

Codecov Report

codecov bot commented Jul 4, 2017 • edited Loading

Codecov Report

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

prashantv commented Jul 4, 2017

bufdev commented Jul 5, 2017

prashantv commented Jul 6, 2017

bufdev commented Jul 6, 2017 via email

bufdev commented Jul 6, 2017 via email

abhinav commented Jul 6, 2017

bufdev commented Jul 11, 2017

bufdev commented Sep 14, 2017

codecov bot commented Jul 4, 2017 •

edited

Loading