Not all cargo requests contains User-Agent header #8979

Closed
lightsing opened this issue Dec 16, 2020 · 6 comments
Labels
C-bug Category: bug S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request.

Comments

@lightsing

Problem

Steps

  1. Set up mitmproxy.
  2. Set up proxychains4.
  3. Run proxychains4 cargo build.

Possible Solution(s)
Add missing fields.

Notes

Output of cargo version: cargo 1.48.0 (65cbdd2 2020-10-14)

rustup: 1.23.1 (3df2264a9 2020-11-30)
toolchain: stable-x86_64-unknown-linux-gnu
Screenshots: header-exists, header-missing

@lightsing lightsing added the C-bug Category: bug label Dec 16, 2020
@alexcrichton
Member

Could you perhaps share the configuration you have? (e.g. .cargo/config.toml)

I've double-checked and there's only one place we create an HTTP request handle and it should always have the user-agent field set. Additionally those URLs don't look like ones that Cargo is creating, so are you sure that this is something that Cargo is doing?

@lightsing
Author

We found this issue in crates-io-cn/crates-io-cn#14.
Our CDN used to be configured for User-Agent whitelisting (forbidding empty User-Agents and non-cargo User-Agents).
This caused some of the requests sent by cargo to encounter a 403 response, which was reproduced in several places. After examination by mitmproxy, it was found that some of the cargo requests were missing the User-Agent header.
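The whitelisting described above can be sketched roughly as follows. This is only an illustration of the kind of check involved, not the CDN's actual rule: the function names, the `cargo` prefix match, and the sample header values are assumptions made for the example.

```python
def is_allowed_user_agent(headers: dict[str, str], required_prefix: str = "cargo") -> bool:
    """Return True if the request has a non-empty User-Agent starting with the prefix.

    Hypothetical rule: the real CDN configuration is not shown in this thread.
    """
    # HTTP header names are case-insensitive, so normalize them before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "").strip()
    return bool(user_agent) and user_agent.lower().startswith(required_prefix)


def status_for(headers: dict[str, str]) -> int:
    """Map the whitelist decision to the status code a client would see."""
    return 200 if is_allowed_user_agent(headers) else 403


if __name__ == "__main__":
    # A request that carries a cargo-style User-Agent passes the check.
    print(status_for({"User-Agent": "cargo 1.48.0 (65cbdd2 2020-10-14)"}))
    # A request with no User-Agent at all is rejected.
    print(status_for({"accept": "*/*"}))
```

Under a rule like this, any request that arrives without a User-Agent is rejected regardless of which program on the path actually produced it.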

P.S. We have since turned off the CDN check for User-Agents, so you probably won't be able to reproduce the identical 403 response.

cargo configuration:

[source.crates-io]
replace-with = 'cn'

[source.cn]
registry = "https://crates-io.cn/crates.io-index"

@alexcrichton
Member

Can you try running the command with:

$ export CARGO_HTTP_DEBUG=true
$ export CARGO_LOG=cargo::ops::registry
$ cargo fetch

and gist the output? Cargo should print the headers it sends on every request which would be useful to figure out which one is missing the user-agent header.

@lightsing
Author

@alexcrichton
Member

Unfortunately I don't really know what's going on here. That log definitely shows Cargo not always sending a user-agent field. The HTTP handle is created here, which is configured here and unconditionally sets the user agent here.

Have you perhaps configured http.user-agent to an empty string? Maybe in the environment or .cargo/config.toml? Or maybe this is a bug in curl if it's not sending the header? Sorry I don't really know how to investigate this myself.

@ehuss
Contributor

ehuss commented Jan 7, 2021

One suspicious thing I noticed in your debug output is that the accept: */* header is using a lowercase a. The default header that libcurl adds uses an uppercase A. I don't think those requests are being generated by cargo (unless you have a build with a custom libcurl).

I'm not familiar with proxychains4, but looking at the description, I would highly suspect that (or the actual proxy) is the source of the issue.
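The two signals discussed in this thread (a missing User-Agent and lowercase header names such as accept) can be checked mechanically on captured requests. The sketch below is one possible way to do that; the function name and the sample requests are made up for illustration and are not taken from the actual debug output.

```python
def inspect_request(raw: str) -> dict:
    """Summarize the header block of a raw HTTP/1.x request captured as text."""
    lines = raw.strip().splitlines()
    names = []
    for line in lines[1:]:  # skip the request line, e.g. "GET /path HTTP/1.1"
        if not line.strip():
            break  # a blank line ends the header block
        name, sep, _value = line.partition(":")
        if sep:
            names.append(name.strip())
    lowered = [name.lower() for name in names]
    return {
        # Header names are case-insensitive, so compare in lowercase.
        "has_user_agent": "user-agent" in lowered,
        # Names written entirely in lowercase, as with the accept header noted above.
        "lowercase_names": [name for name in names if name == name.lower()],
    }


if __name__ == "__main__":
    # Hypothetical captures, written by hand for this example.
    with_agent = "GET /a HTTP/1.1\r\nHost: example.invalid\r\nUser-Agent: cargo 1.48.0\r\nAccept: */*\r\n\r\n"
    without_agent = "GET /b HTTP/1.1\nhost: example.invalid\naccept: */*\n\n"
    print(inspect_request(with_agent))
    print(inspect_request(without_agent))
```

Running this over each request in a capture would show whether the requests lacking a User-Agent are also the ones with lowercase header names, which would be consistent with the suspicion that they come from a different client or from something rewriting them in transit.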

@ehuss ehuss added the S-needs-info Status: Needs more info, such as a reproduction or more background for a feature request. label Feb 13, 2021
@Eh2406 Eh2406 closed this as completed Jun 21, 2021