Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nettest: Add helpers to avoid fixed host:ports in tests #1180

Closed
wants to merge 2 commits into from

Conversation

prashantv
Copy link
Contributor

Some tests cannot use port 0; in those tests, we should get a free
host:port and use that where possible.

Some tests cannot use port 0; in those tests, we should get a free
host:port and use that where possible.
@codecov
Copy link

codecov bot commented Jul 4, 2017

Codecov Report

Merging #1180 into dev will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #1180      +/-   ##
==========================================
+ Coverage   82.64%   82.66%   +0.01%     
==========================================
  Files         222      222              
  Lines       10422    10422              
==========================================
+ Hits         8613     8615       +2     
+ Misses       1460     1458       -2     
  Partials      349      349
Flag Coverage Δ
#experimental 76.76% <ø> (ø) ⬆️
#main 83.88% <ø> (+0.02%) ⬆️
Impacted Files Coverage Δ
transport/tchannel/peer.go 95.38% <0%> (+3.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe9ffa5...af752d9. Read the comment docs.

@codecov
Copy link

codecov bot commented Jul 4, 2017

Codecov Report

Merging #1180 into dev will decrease coverage by 0.02%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##              dev    #1180      +/-   ##
==========================================
- Coverage   82.64%   82.61%   -0.03%     
==========================================
  Files         222      222              
  Lines       10422    10422              
==========================================
- Hits         8613     8610       -3     
- Misses       1460     1462       +2     
- Partials      349      350       +1
Flag Coverage Δ
#experimental 76.76% <ø> (ø) ⬆️
#main 83.82% <ø> (-0.04%) ⬇️
Impacted Files Coverage Δ
transport/tchannel/peer.go 89.23% <0%> (-3.08%) ⬇️
transport/http/peer.go 90.32% <0%> (-1.62%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fe9ffa5...af752d9. Read the comment docs.


// MustGetFreeHostPort returns a TCP host:port that is free for unit tests
// that cannot use port 0.
func MustGetFreeHostPort() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@abg and @breerly rejected me exposing this before, but Id want to do it again. However, this doesn't fully fix the problem - you need to retain the listener connection, because when you close, someone else can come in and use that port (i.e. there's a race condition here).

Going further, we need a way to do this across processes, i.e. lock down a port to use in two separate processes (for examples testing). I'd use a lock file but as I said this whole thing was shot down.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah unfortunately there is always a small race between closing the port and it being reused, I don't think there's any way to avoid that race though. I've used something similar in TChannel without causing any flakiness.

On Linux, I found the OS recycles ports used less recently, avoiding the race condition. It only becomes an issue when almost all the ports are used and there's not many free ports to recycle between.

Either way, it's a strictly better option than a hardcoded host:port.

Across processes is much trickier -- we almost want the coordinator to assign ports at that point.

@abhinav @breerly: Given this would be an internal test-only utility, and it's better than the alternative, thoughts on keeping this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note previous work:

#866
#897

Prashant and I have been talking offline, I'd like to fix this problem, 866 has a partial solution to fix the race too I think.

func MustGetFreeHostPort() string {
addr, err := getClosedTCPAddr()
if err != nil {
panic(err)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also note I don't like using "Must" and panic here - there's no need for us to panic, we can handle the error in the caller, I don't want this style to be encouraged more in our codebase

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to do GetFreeHostPort(t *testing.T) instead and do require.NoError here.

I don't think we should return an error though, since it adds noise to tests where it's used with very little value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think GetFreePort isn't in theory just a test function, ie it's a util function that happens to be used in testing, i think it's normal and typical in to do:

port, err := GetFreePort()
require.NoError(t, err)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a couple of reasons why I believe this is a test-only function:

  • In production, we should use port 0 instead of getting a free port
  • There is no race-free way to do this, hence why it definitely shouldn't be used in production.

Given that it's test only, I don't think this belongs in internal/net but in a nettest package. Similarly, since it's only intended for tests, I think taking in a *testing.T and using require.NoError will reduce noise in tests using this helper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ya I'm fine with this. The race condition is not OK IMO, #866 had a solution. The bigger solution is to have all inbounds take a net.Listener, which #897 was trying to address.

@prashantv
Copy link
Contributor Author

Also, see previous discussion in #866

Some comments I'd like to add:

  • We should avoid using this method, tests should ideally use port 0, but not all tests can use that (e.g., the config tests in Update grpc config test ports to use nettest #1181). For those tests, this seems like a strictly better option than using fixed hostports.
  • There is a race here where the returned address may already be used. However, when we lose the race, we end up not being able to listen on the port and the test will fail, similar to what will happen right now if the port is already in use. The advantage is that restarting the test will usually fix the issue, whereas the fixed port test may require a code change to choose a different high port
  • The race happens very rarely in practice. TChannel tests have done this for a couple of years without it causing flakiness

cc @kriskowal @breerly

@bufdev
Copy link
Contributor

bufdev commented Jul 5, 2017

I don't like the race at all, it's just proverbially kicking the can, the other problem is I wanted this to be able to produce "locked" ports across processes for our examples testing. I addressed this in a previous PR #866, but it was rejected, we should have a wider conversation about this. Also see #897.

@prashantv
Copy link
Contributor Author

@peter-edge Do you have a solution for the race? If we don't have an alternate solution, then I think we don't really have any options, especially given that experience with tchannel has shown that the race is not an issue. Absolute worst case, the race will lead to returning a host:port that's closed, which ends up with the exact same behaviour as we have right now -- we'll return an error.

Ping @kriskowal

@bufdev
Copy link
Contributor

bufdev commented Jul 6, 2017 via email

@bufdev
Copy link
Contributor

bufdev commented Jul 6, 2017 via email

@abhinav
Copy link
Contributor

abhinav commented Jul 6, 2017

Did you mean to check in debug.test?

@bufdev
Copy link
Contributor

bufdev commented Jul 11, 2017

@prashantv I'd like to close this for now along with #1181 and #1097, I think we have a bit of a disagreement on the path forward but good news: 1. you're on vacation, act like it! 2. I'll be in SF in three weeks so we can hash it out then and get a solution in. I'm working towards getting us to inbox zero both both PRs and Issues - note we will solve this.

@bufdev
Copy link
Contributor

bufdev commented Sep 14, 2017

Closing for now, feel free to reopen if we want to continue on this

@bufdev bufdev closed this Sep 14, 2017
@abhinav abhinav deleted the add_nettest branch January 15, 2019 17:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants