net: Dial does not respond to quickly-broken IPv6 connections by falling back to IPv4 #68237

Open
oakad opened this issue Jun 28, 2024 · 17 comments
Labels
NeedsInvestigation Someone must examine and confirm this is a valid issue and not a duplicate of an existing one.

Comments

@oakad

oakad commented Jun 28, 2024

Go version

go version go1.22.4 darwin/arm64

Output of go env in your module/workspace:

GO111MODULE=''
GOARCH='arm64'
GOBIN=''
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='arm64'
GOHOSTOS='darwin'
GOINSECURE=''
GOOS='darwin'
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/opt/homebrew/Cellar/go/1.22.4/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/opt/homebrew/Cellar/go/1.22.4/libexec/pkg/tool/darwin_arm64'
GOVCS=''
GOVERSION='go1.22.4'
GCCGO='gccgo'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch arm64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/q9/qcgtwgsj0y72gr01_djqgmyw0000gq/T/go-build1826860007=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Tried to fetch a random module (they all fail the same way):

% go get nhooyr.io/websocket
go package net: confVal.netCgo = false netGo = false
go package net: using cgo DNS resolver
go package net: hostLookupOrder(proxy.golang.org) = cgo
go: module nhooyr.io/websocket: Get "https://proxy.golang.org/nhooyr.io/websocket/@v/list": write tcp [fe80::bed0:74ff:fe64:598e%utun4]:56330->[2a00:1450:4003:80c::2011]:443: write: socket is not connected

Machine has IPv6 disabled:

% dig proxy.golang.org
; <<>> DiG 9.10.6 <<>> proxy.golang.org
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 53713
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4000
;; QUESTION SECTION:
;proxy.golang.org. IN A

;; ANSWER SECTION:
proxy.golang.org. 46 IN A 142.250.184.177

;; Query time: 366 msec
;; SERVER: 10.20.141.5#53(10.20.141.5)
;; WHEN: Fri Jun 28 21:47:33 AEST 2024
;; MSG SIZE rcvd: 61

What did you see happen?

go get is unable to fetch a module because it is using the wrong proxy address.

What did you expect to see?

Go get should be able to fetch a module.

@seankhliao
Member

That's not proof that IPv6 is disabled, only that dig defaults to an A (IPv4) query.
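To see what the resolver hands to a Go program, rather than what dig shows for its default A query, something like the following may help. It is only a sketch: `familyOf` is a helper name made up for this example, and the default host `localhost` is a placeholder so the program runs without external DNS.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"net/netip"
	"os"
	"time"
)

// familyOf reports whether addr is an IPv4 or IPv6 literal.
// Zoned addresses such as fe80::1%utun4 count as IPv6.
func familyOf(addr string) string {
	ip, err := netip.ParseAddr(addr)
	if err != nil {
		return "invalid"
	}
	if ip.Unmap().Is4() {
		return "IPv4"
	}
	return "IPv6"
}

func main() {
	host := "localhost" // placeholder; pass e.g. proxy.golang.org as the first argument
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	// LookupHost returns every address the resolver knows, A and AAAA alike.
	addrs, err := net.DefaultResolver.LookupHost(ctx, host)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, a := range addrs {
		fmt.Printf("%-5s %s\n", familyOf(a), a)
	}
}
```

If any line is tagged IPv6, the resolver is answering AAAA queries for that name, whatever the state of the network interfaces.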

@oakad
Author

oakad commented Jun 28, 2024

It is, I assure you.
However, there's a caveat: we have a Cisco VPN which insists on advertising an additional resolver, and that resolver is able to resolve AAAA records ("Request A records, Request AAAA records"). Basically, I've got this config:

DNS configuration

resolver #1
search domain[0] : heh
nameserver[0] : heh
nameserver[1] : heh
flags : Request A records, Request AAAA records
reach : 0x00000002 (Reachable)
order : 1

DNS configuration (for scoped queries)
resolver #1
nameserver[0] : heh
nameserver[1] : heh
if_index : 15 (en0)
flags : Scoped, Request A records
reach : 0x00000002 (Reachable)

resolver #2
search domain[0] : heh
nameserver[0] : heh
nameserver[1] : heh
if_index : 23 (utun4)
flags : Scoped, Request A records, Request AAAA records
reach : 0x00000002 (Reachable)
order : 1

Still, go should not pick the AAAA address. Or, at least, it should not do so unconditionally, because I don't think our setup is uniquely broken. :-)

@mateusz834
Member

From the output it is clear that the cgo resolver is being used, so out of our scope.

@mateusz834 changed the title from "Go on MacOS incorrectly resolves IPv6 address even though IPv6 is not available" to "net: MacOS incorrectly resolves IPv6 address even though IPv6 is not available" on Jun 28, 2024
@oakad
Author

oakad commented Jun 28, 2024

https://danp.net/posts/macos-dns-change-in-go-1-20/

This started happening relatively recently, and I believe it is caused by the changes described in the post above.

@mateusz834
Member

Can you try forcing the go resolver and see if it helps in your case? GODEBUG=netdns=go

@oakad
Author

oakad commented Jun 28, 2024

How do I enable both this feature and dns debug so we can see it is used for real?

@mateusz834
Member

GODEBUG=netdns=go+2
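For programs you build yourself, the resolver choice can also be made in code instead of through the environment. A rough sketch follows; per the net package documentation, `Resolver.PreferGo` is equivalent to GODEBUG=netdns=go but scoped to that one resolver. The helper name `newGoResolver` and the default host `localhost` are assumptions for the example, and the "+2" debug output stays a GODEBUG-only feature.

```go
package main

import (
	"context"
	"fmt"
	"net"
	"os"
	"time"
)

// newGoResolver returns a resolver that prefers Go's built-in DNS client
// over the system (cgo) resolver, for lookups made through it only.
func newGoResolver() *net.Resolver {
	return &net.Resolver{PreferGo: true}
}

func main() {
	host := "localhost" // placeholder; pass a real host name as the first argument
	if len(os.Args) > 1 {
		host = os.Args[1]
	}
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()

	// "ip" asks for both IPv4 and IPv6 addresses.
	addrs, err := newGoResolver().LookupNetIP(ctx, "ip", host)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	for _, a := range addrs {
		fmt.Println(a)
	}
}
```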

@oakad
Author

oakad commented Jun 28, 2024

Tough luck:

% go get nhooyr.io/websocket
go package net: confVal.netCgo = false netGo = true
go package net: GODEBUG setting forcing use of Go's resolver
go package net: hostLookupOrder(proxy.golang.org) = files,dns
go: module nhooyr.io/websocket: Get "https://proxy.golang.org/nhooyr.io/websocket/@v/list": write tcp [fe80::bed0:74ff:fe64:598e%utun4]:57052->[2a00:1450:4003:80c::2011]:443: write: socket is not connected

@oakad
Author

oakad commented Jun 28, 2024

For reference, curl does this:

% curl -v https://proxy.golang.org/nhooyr.io/websocket/@v/list

* Host proxy.golang.org:443 was resolved.
* IPv6: 2a00:1450:4003:80c::2011
* IPv4: 142.250.184.177
*   Trying 142.250.184.177:443...
*   Trying [2a00:1450:4003:80c::2011]:443...
* Connected to proxy.golang.org (142.250.184.177) port 443
* ALPN: curl offers h2,http/1.1

@seankhliao
Member

What if you pass --ipv6 to curl?

In theory go's network stack should also be doing fast fallback / dual stack ipv4 and ipv6
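For readers unfamiliar with fast fallback, the idea can be sketched roughly as below. This is not the net package's implementation, just an illustration under assumptions: `dialFunc`, `raceDial`, `closeLate` and `simulateRace` are made-up names, and the dial step is faked so the example runs without a network. In real code the hook could wrap `(&net.Dialer{}).DialContext`.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net"
	"time"
)

// dialFunc is a hypothetical hook so the race can run without a real network.
type dialFunc func(ctx context.Context, addr string) (net.Conn, error)

type result struct {
	conn net.Conn
	addr string
	err  error
}

// raceDial dials primary at once and fallback after delay (or as soon as
// primary fails), returning the first connection that is established.
func raceDial(ctx context.Context, dial dialFunc, primary, fallback string, delay time.Duration) (net.Conn, string, error) {
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // abort whichever attempt is still in flight

	results := make(chan result, 2) // buffered so a late attempt never blocks
	pending, fallbackStarted := 0, false
	start := func(addr string) {
		pending++
		go func() {
			conn, err := dial(ctx, addr)
			results <- result{conn, addr, err}
		}()
	}

	start(primary)
	timer := time.NewTimer(delay)
	defer timer.Stop()

	var errs []error
	for {
		select {
		case <-timer.C:
			if !fallbackStarted {
				fallbackStarted = true
				start(fallback)
			}
		case r := <-results:
			pending--
			if r.err == nil {
				go closeLate(results, pending) // tidy up a loser that connects later
				return r.conn, r.addr, nil
			}
			errs = append(errs, r.err)
			if !fallbackStarted {
				fallbackStarted = true
				start(fallback)
			} else if pending == 0 {
				return nil, "", errors.Join(errs...)
			}
		}
	}
}

// closeLate drains n outstanding results and closes any connection they carry.
func closeLate(results <-chan result, n int) {
	for i := 0; i < n; i++ {
		if r := <-results; r.conn != nil {
			r.conn.Close()
		}
	}
}

// simulateRace fakes a hanging IPv6 attempt and a working IPv4 one.
func simulateRace() (string, error) {
	const v6, v4 = "[2a00:1450:4003:80c::2011]:443", "142.250.184.177:443"
	dial := func(ctx context.Context, addr string) (net.Conn, error) {
		if addr == v6 {
			<-ctx.Done() // never connects; gives up when cancelled
			return nil, ctx.Err()
		}
		client, _ := net.Pipe() // stand-in for an established connection
		return client, nil
	}
	conn, addr, err := raceDial(context.Background(), dial, v6, v4, 20*time.Millisecond)
	if err != nil {
		return "", err
	}
	conn.Close()
	return addr, nil
}

func main() {
	fmt.Println(simulateRace())
}
```

Note that a race like this only looks at whether the connect step succeeds. An attempt that connects and then breaks on the first write still wins.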

@mateusz834
Member

mateusz834 commented Jun 28, 2024

So the title is incorrect: it resolves correctly, but it fails to connect to the server when IPv6 is unavailable, right?

@joedian added the NeedsInvestigation label on Jun 28, 2024
@oakad
Author

oakad commented Jun 28, 2024

curl gets stuck when forced to use IPv6. It may be that, although the underlying adapter has IPv6 disabled, the Cisco VPN client pretends it has an IPv6 address on the utun interface. Yet it causes no issues anywhere else; everything works fine apart from Go.

% curl -v --ipv6 https://proxy.golang.org/nhooyr.io/websocket/@v/list

* Host proxy.golang.org:443 was resolved.
* IPv6: 2a00:1450:4003:80c::2011
* IPv4: (none)
*   Trying [2a00:1450:4003:80c::2011]:443...
    ... waits for timeout

@oakad
Author

oakad commented Jun 28, 2024

The address is of course correct; what is incorrect is resolving the AAAA record and sticking to it rather than falling back to the A record. :-)

@rsc
Contributor

rsc commented Jun 28, 2024

From the discussion so far, it sounds like:

  1. Your Mac is configured with IPv6 enabled (that is, IPv6 sockets can be created successfully).
  2. Your DNS resolver is responding to AAAA requests with IPv6 addresses.
  3. Go looks up proxy.golang.org and gets both IPv6 and IPv4 addresses.
  4. Go connects to one of the IPv6 addresses seemingly successfully. Specifically, it does the connect and then runs getsockopt(fd, SOL_SOCKET, SO_ERROR) in net/fd_unix.go and gets syscall.EISCONN, which makes it return from Dial.
  5. A future write on that connection gets syscall.ENOTCONN, as shown in the error messages.

Normally, when IPv6 addresses can't be used, the connect never succeeds (fails or times out). In your case, it appears that the connect is succeeding but then the connection breaks very quickly after that, perhaps on the first write.
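An application that controls its own dialing could work around this failure mode by not trusting a connection until something has actually been written on it. The sketch below is only an illustration under assumptions: `dialFunc`, `dialWithProbe`, `simulatedDial` and `runSimulation` are made-up names, the network is faked with net.Pipe, and the probe is a plain write. In a real client the probe would be the protocol's own first step (for example a TLS handshake), not arbitrary bytes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"io"
	"net"
)

// dialFunc has the same shape as (*net.Dialer).DialContext.
type dialFunc func(ctx context.Context, network, addr string) (net.Conn, error)

// dialWithProbe tries addrs in order. A connection counts as usable only
// once probe succeeds on it; otherwise it is closed and the next address
// is tried.
func dialWithProbe(ctx context.Context, dial dialFunc, addrs []string, probe func(net.Conn) error) (net.Conn, string, error) {
	if len(addrs) == 0 {
		return nil, "", errors.New("no addresses to dial")
	}
	var errs []error
	for _, addr := range addrs {
		conn, err := dial(ctx, "tcp", addr)
		if err != nil {
			errs = append(errs, fmt.Errorf("dial %s: %w", addr, err))
			continue
		}
		if err := probe(conn); err != nil {
			conn.Close()
			errs = append(errs, fmt.Errorf("probe %s: %w", addr, err))
			continue
		}
		return conn, addr, nil
	}
	return nil, "", errors.Join(errs...)
}

const (
	brokenV6  = "[2a00:1450:4003:80c::2011]:443"
	workingV4 = "142.250.184.177:443"
)

// simulatedDial stands in for a real network: the IPv6 address "connects"
// but its peer is already gone, so the first write fails.
func simulatedDial(_ context.Context, _, addr string) (net.Conn, error) {
	client, server := net.Pipe()
	if addr == brokenV6 {
		server.Close()
		return client, nil
	}
	go io.Copy(io.Discard, server) // healthy peer: consume whatever is written
	return client, nil
}

// runSimulation returns the address that survived the probe.
func runSimulation() (string, error) {
	probe := func(c net.Conn) error {
		_, err := c.Write([]byte("hello"))
		return err
	}
	conn, addr, err := dialWithProbe(context.Background(), simulatedDial, []string{brokenV6, workingV4}, probe)
	if err != nil {
		return "", err
	}
	conn.Close()
	return addr, nil
}

func main() {
	fmt.Println(runSimulation())
}
```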

Do you know of anything strange about your Mac's network or IPv6 configuration? Or some firewall that is actively breaking IPv6 connections?

For example on my Mac:

% host proxy.golang.org
proxy.golang.org has address 142.250.65.177
proxy.golang.org has IPv6 address 2607:f8b0:4006:80e::2011
proxy.golang.org mail is handled by 40 alt4.gmr-smtp-in.l.google.com.
proxy.golang.org mail is handled by 10 alt1.gmr-smtp-in.l.google.com.
proxy.golang.org mail is handled by 5 gmr-smtp-in.l.google.com.
proxy.golang.org mail is handled by 30 alt3.gmr-smtp-in.l.google.com.
proxy.golang.org mail is handled by 20 alt2.gmr-smtp-in.l.google.com.
% sudo route add -inet6 2607:f8b0:4006:80e::2011 ::1
add host 2607:f8b0:4006:80e::2011: gateway ::1
% go mod download -json rsc.io/markdown@latest
{
	"Path": "rsc.io/markdown",
	"Version": "v0.0.0-20240617154923-1f2ef1438fed",
	"Query": "latest",
	"Info": "/Users/rsc/pkg/mod/cache/download/rsc.io/markdown/@v/v0.0.0-20240617154923-1f2ef1438fed.info",
	"GoMod": "/Users/rsc/pkg/mod/cache/download/rsc.io/markdown/@v/v0.0.0-20240617154923-1f2ef1438fed.mod",
	"Zip": "/Users/rsc/pkg/mod/cache/download/rsc.io/markdown/@v/v0.0.0-20240617154923-1f2ef1438fed.zip",
	"Dir": "/Users/rsc/pkg/mod/rsc.io/markdown@v0.0.0-20240617154923-1f2ef1438fed",
	"Sum": "h1:savaUwUp0YCIxdaF9EFOMB3j+TQnoLop+cNp2KPC9jk=",
	"GoModSum": "h1:rzOcjAz36Xzvwf6iaJSYXkmNbvu5XHelis1egIN0Cys="
}
% curl -v --ipv6 https://proxy.golang.org
* Host proxy.golang.org:443 was resolved.
* IPv6: 2607:f8b0:4006:80e::2011
* IPv4: (none)
*   Trying [2607:f8b0:4006:80e::2011]:443...
^C
% sudo route delete -inet6 2607:f8b0:4006:80e::2011 
delete host 2607:f8b0:4006:80e::2011
% curl -v --ipv6 https://proxy.golang.org
* Host proxy.golang.org:443 was resolved.
* IPv6: 2607:f8b0:4006:80e::2011
* IPv4: (none)
*   Trying [2607:f8b0:4006:80e::2011]:443...
* Immediate connect fail for 2607:f8b0:4006:80e::2011: No route to host
* Failed to connect to proxy.golang.org port 443 after 3 ms: Couldn't connect to server
* Closing connection
curl: (7) Failed to connect to proxy.golang.org port 443 after 3 ms: Couldn't connect to server
% 

@rsc changed the title from "net: MacOS incorrectly resolves IPv6 address even though IPv6 is not available" to "net: Dial does not respond to quickly-broken IPv6 connections by falling back to IPv4" on Jun 28, 2024
@oakad
Author

oakad commented Jun 30, 2024

The problem only happens with the VPN enabled, as I mentioned before. The VPN in question is Cisco Secure Client, aka AnyConnect. I'm working with the people who manage the Cisco VPN for us to see if they can change anything on their side (AnyConnect is supposed to be controlled from the server side, so not much can be done on the client side).

  1. Only Go breaks on our current setup; all other applications seem to work just fine. Go used to work previously, it only started breaking relatively recently (may be caused by 1.20 changes or by some changes to AnyConnect setup).
  2. Go can be made to work by using ifconfig to erase IPv6 addresses from the utun device in use by AnyConnect. This, however, has to be done on any VPN reconnection (due to how AnyConnect works).
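To check whether the VPN has put IPv6 addresses back on a tunnel interface after a reconnect, a small program along these lines could be used. It is a sketch with assumptions: `isIPv6CIDR` is a made-up helper, the "utun" prefix is taken from the interface name shown earlier in this thread, and it assumes interface addresses print in CIDR form (as *net.IPNet does).

```go
package main

import (
	"fmt"
	"net"
	"strings"
)

// isIPv6CIDR reports whether s is an IPv6 address in CIDR notation,
// e.g. "fe80::1/64". Assumes interface addresses are printed this way.
func isIPv6CIDR(s string) bool {
	ip, _, err := net.ParseCIDR(s)
	return err == nil && ip.To4() == nil
}

func main() {
	const prefix = "utun" // tunnel interface naming seen in this thread
	ifaces, err := net.Interfaces()
	if err != nil {
		fmt.Println("cannot list interfaces:", err)
		return
	}
	found := false
	for _, ifc := range ifaces {
		if !strings.HasPrefix(ifc.Name, prefix) {
			continue
		}
		addrs, err := ifc.Addrs()
		if err != nil {
			continue // skip interfaces whose addresses cannot be read
		}
		for _, a := range addrs {
			if isIPv6CIDR(a.String()) {
				found = true
				fmt.Printf("%s has IPv6 address %s\n", ifc.Name, a)
			}
		}
	}
	if !found {
		fmt.Printf("no IPv6 addresses on %s* interfaces\n", prefix)
	}
}
```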

@rittneje

rittneje commented Aug 25, 2024

@rsc I get the same issue when trying to install things using 1.22.6 on my MacBook while on our corporate VPN (which is also Cisco AnyConnect).

My testing reveals there are two underlying issues:

  1. The IPv4 dial (which, contrary to the documentation, actually happens first; see #68795, "net: ipv4 chosen first when ipv6 is available") takes longer than 300 milliseconds to complete.
  2. Somehow Go believes the IPv6 dial succeeded even though it clearly did not. (Kernel bug?)

Increasing the dialer's FallbackDelay (or making it negative) is enough to resolve the issue, but I have no control over what go install is doing. Would it be possible to allow overriding the 300 ms default via some env var?
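For programs where you do control the dialer, the workaround looks roughly like the sketch below. `net.Dialer.FallbackDelay` is the real field: zero selects the 300 ms default and a negative value disables fast fallback. Everything else is an assumption for illustration: the environment variable name EXAMPLE_FALLBACK_DELAY and the helpers `fallbackDelayFrom` and `newClient` are hypothetical. This does nothing for go install itself, which is the point of the request above.

```go
package main

import (
	"fmt"
	"net"
	"net/http"
	"os"
	"time"
)

// fallbackDelayFrom parses a HYPOTHETICAL setting: "off" disables fast
// fallback, a positive duration such as "2s" is used as is, and anything
// else yields 0, which makes net.Dialer use its 300ms default.
func fallbackDelayFrom(value string) time.Duration {
	if value == "off" {
		return -1 // negative disables fast fallback
	}
	d, err := time.ParseDuration(value)
	if err != nil || d <= 0 {
		return 0 // zero selects the default delay
	}
	return d
}

// newClient builds an HTTP client whose dialer uses the given FallbackDelay.
func newClient(delay time.Duration) *http.Client {
	dialer := &net.Dialer{Timeout: 30 * time.Second, FallbackDelay: delay}
	transport := http.DefaultTransport.(*http.Transport).Clone()
	transport.DialContext = dialer.DialContext
	return &http.Client{Transport: transport}
}

func main() {
	// EXAMPLE_FALLBACK_DELAY is an invented name, not something Go reads.
	delay := fallbackDelayFrom(os.Getenv("EXAMPLE_FALLBACK_DELAY"))
	client := newClient(delay)
	fmt.Printf("client transport %T, FallbackDelay=%v\n", client.Transport, delay)
}
```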


7 participants