Unable to download a file with ADD because of Go-http-client/1.1 user agent #5334
Comments
Looks to be the same for both BuildKit and the classic builder, on Docker Desktop for Mac. In one terminal:
docker run -it --rm -p 8080:80 nginx:alpine
In another shell:
Which prints these in the logs of the nginx container:
|
I'm trying to write a PR for this feature. It is my understanding that the ADD instruction is first parsed from a Dockerfile here: buildkit/frontend/dockerfile/instructions/parse.go Lines 239 to 254 in 70deac1
The parsed data are stored in this struct: buildkit/frontend/dockerfile/instructions/commands.go Lines 183 to 193 in 70deac1
Then, it's handled by moby via: Coincidentally, I forked both moby/buildkit and moby/moby, since one parses the instruction and the other acts upon it. Does anybody know how I could test my implementation? I thought of modifying the imports during the tests, but it's not a realistic solution. |
I just tried on a build of Docker 27.3 (BuildKit 0.16), but it looks like this is still the case, and it's still using the default Go-http-client/1.1 user agent. There are some related tickets to make these headers configurable, but I think it would make sense to at least set some default that's not Go-http-client/1.1. For example, Docker's own website doesn't allow it:
curl -A 'Go-http-client/1.1' -sI https://www.docker.com/ | head -1
HTTP/2 403
curl -A 'buildkit/0.16' -sI https://www.docker.com/ | head -1
HTTP/2 200
So trying to download a file from the website with ADD fails:
echo -e 'FROM scratch\nADD https://www.docker.com/ /foo.html\n' | docker build -
[+] Building 0.4s (3/4) docker:default
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 89B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> ERROR [1/1] ADD https://www.docker.com/ /foo.html 0.3s
------
> [1/1] ADD https://www.docker.com/ /foo.html:
------
ERROR: failed to solve: failed to load cache key: invalid response status 403
Let me move this ticket to the |
Let me move this to the BuildKit repository |
I don't really see how a BuildKit-specific user-agent is any better when doing generic HTTP requests that are not BuildKit-specific. In registry requests it makes sense, as we are a known client for registries. Here we are just trying to work around someone's server configuration. If the server wants to block us, it's their choice. And if they want to block something else that makes requests with the Go HTTP client, that other tool can trivially work around the block with a random/fake user-agent. For any server that needs to have some special behavior for the Go HTTP client, sending the correct user-agent reflecting that this library is making the request seems the most correct option. |
Blocking the default Go user agent is becoming more common, similar to some Java user agents being blocked by default: https://community.cloudflare.com/t/cloudflare-blocks-java-10-user-agents-by-default/374648 Is there a reason we want to advertise buildkit (or the front end) as a generic Go application? |
There might be some privacy concerns but mainly it is just that this is a generic Go library request without anything buildkit specific in there. Go user-agent may be theoretically useful for a server if that client has some behavior specific to the implementation, but buildkit user-agent is meaningless to a random server answering to plain
That is fine and we shouldn't try to outsmart them. Some websites want to target only human users via browsers and that is their choice. That being said, I don't think this is the most important thing in BuildKit behavior. If you think it is important that a different user-agent should be used for this request, feel free to send a PR. |
IMO, I don't think we should switch the default, but it should be configurable by LLB (I think we've discussed this elsewhere, with the ability to set arbitrary headers). If there's a reason to switch the default, we should do so in a backwards compat way (using a capability, to avoid breakage, since even though unlikely, some applications/metrics gathering systems may be relying on the current behavior). |
Wonder if we could also have a buildkit conf for http client opts:
[source.http]
[source.http.headers]
"User-Agent" = "foo"
Also we have ProxyEnv that works with ExecOp, but I don't think this is extended to the http source. |
The configurable angle is nice but would take a bit more work. Let's start by changing the default with this issue and use a follow-up for the more complex/configurable route. |
Description
The ADD instruction uses the user agent go-http-client/1.1 when the source is a URL. If for some reason this user agent is blacklisted, downloading a file using ADD becomes impossible.
Context
I was trying to bust a cached git repository, cloned from my company's own repositories, using ADD. Unfortunately, my company has a list of banned user agents, including go-http-client/1.1, that prevents me from downloading a file with this instruction.
I am aware that several workarounds exist, hence this issue is not a priority, but for this use case, nothing is as simple as using ADD.
Describe the results you received:
The build fails with a message similar to failed to load cache key: Get $URL: EOF, where $URL is the one fed to the src argument of the ADD instruction.
Describe the results you expected:
The file to be downloaded by the ADD instruction.
Possible solution:
I believe that if there was an optional flag --user-agent, to set the user agent used by ADD, it would fix the issue. Since the flag would be optional, go-http-client/1.1 would still be the default user agent.
Output of docker version:
Output of docker info:
Additional environment details (AWS, VirtualBox, physical, etc.):
Docker images are mainly built inside WSL2.