rustup-init hangs in armv7 docker container running on an arm64 Linux with reqwest backend #3122

Closed
messense opened this issue Dec 9, 2022 · 31 comments
Labels
incomplete The bug report does not have enough information

@messense

messense commented Dec 9, 2022

Problem

Running curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh in an armv7 Docker container on an arm64 Linux host hangs.

Steps

Run a Docker container with --platform linux/arm/v7 on an arm64 Linux host, install curl, and run the sh.rustup.rs script:

$ docker run --rm -it --platform linux/arm/v7 ubuntu:22.04 bash
root@668fa2822782:/# apt update && apt install curl -y
root@668fa2822782:/# curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose
info: downloading installer
info: profile set to 'default'
info: default host triple is aarch64-unknown-linux-gnu
verbose: installing toolchain 'stable-aarch64-unknown-linux-gnu'
verbose: toolchain directory: '/root/.rustup/toolchains/stable-aarch64-unknown-linux-gnu'
info: syncing channel updates for 'stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/dqlfqns1_reba5ms_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with reqwest

It hangs forever at the "verbose: downloading with reqwest" step. Using RUSTUP_USE_CURL=1 works fine:

root@668fa2822782:/# RUSTUP_USE_CURL=1 curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | RUSTUP_USE_CURL=1 sh -s -- -y --verbose
info: downloading installer
info: profile set to 'default'
info: default host triple is aarch64-unknown-linux-gnu
verbose: installing toolchain 'stable-aarch64-unknown-linux-gnu'
verbose: toolchain directory: '/root/.rustup/toolchains/stable-aarch64-unknown-linux-gnu'
info: syncing channel updates for 'stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/0lfud7winasgeiob_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.sha256'
verbose: downloading with curl
verbose: deleted temp file: /root/.rustup/tmp/0lfud7winasgeiob_file
verbose: no update hash at: '/root/.rustup/update-hashes/stable-aarch64-unknown-linux-gnu'
verbose: creating temp file: /root/.rustup/tmp/zs3_t8gv_pyprmk6_file.toml
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml'
verbose: downloading with curl
verbose: checksum passed
verbose: creating temp file: /root/.rustup/tmp/zwsok2uoflekbroj_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-stable.toml.asc'
verbose: downloading with curl
verbose: deleted temp file: /root/.rustup/tmp/zwsok2uoflekbroj_file
verbose: Good signature from on https://static.rust-lang.org/dist/channel-rust-stable.toml from:
verbose: from builtin Rust release key
verbose:   RSAEncryptSign/85AB96E6-FA1BE5FE - Rust Language (Tag and Release Signing Key) <[email protected]>
verbose:   Fingerprint: 108F 6620 5EAE B0AA A8DD 5E1C 85AB 96E6 FA1B E5FE
verbose: deleted temp file: /root/.rustup/tmp/zs3_t8gv_pyprmk6_file.toml
info: latest update on 2022-11-03, rust version 1.65.0 (897e37553 2022-11-02)
info: downloading component 'cargo'
verbose: creating Download Directory directory: '/root/.rustup/downloads'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/cargo-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'clippy'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/clippy-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rust-docs'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rust-docs-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rust-std'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rust-std-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
verbose: checksum passed
info: downloading component 'rustc'
verbose: downloading file from: 'https://static.rust-lang.org/dist/2022-11-03/rustc-1.65.0-aarch64-unknown-linux-gnu.tar.xz'
verbose: downloading with curl
 55.9 MiB /  79.4 MiB ( 70 %)   0 B/s in  1s ETA: Unknown^C

Possible Solution(s)

No response

Notes

No response

Rustup version

rustup-init 1.25.1 (bb60b1e89 2022-07-12)

Installed toolchains

N/A
@rbtcollins
Contributor

Do they select the same sort of connection to the host, e.g. IPv4 or IPv6?

@rbtcollins
Contributor

I chatted with @kinnison and he suggests that the failure is due to reqwest, whose TLS implementation uses per-CPU features, getting the wrong CPU type from your /proc/cpuinfo. Then the actual CPU doesn't handle things and it all just burns up in fire.

A starting point would be to compare your cpuinfo to the expected one for the hardware it is running on (or for the CPU that qemu is emulating, if you have cross-arch docker stuff happening).

@messense
Author

messense commented Feb 24, 2023

Host

$ cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

processor	: 1
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1
$ lscpu
Architecture:            aarch64
  CPU op-mode(s):        32-bit, 64-bit
  Byte Order:            Little Endian
CPU(s):                  2
  On-line CPU(s) list:   0,1
Vendor ID:               ARM
  Model name:            Neoverse-N1
    Model:               1
    Thread(s) per core:  1
    Core(s) per cluster: 2
    Socket(s):           -
    Cluster(s):          1
    Stepping:            r3p1
    BogoMIPS:            50.00
    Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
NUMA:
  NUMA node(s):          1
  NUMA node0 CPU(s):     0,1
Vulnerabilities:
  Itlb multihit:         Not affected
  L1tf:                  Not affected
  Mds:                   Not affected
  Meltdown:              Not affected
  Mmio stale data:       Not affected
  Retbleed:              Not affected
  Spec store bypass:     Mitigation; Speculative Store Bypass disabled via prctl
  Spectre v1:            Mitigation; __user pointer sanitization
  Spectre v2:            Mitigation; CSV2, BHB
  Srbds:                 Not affected
  Tsx async abort:       Not affected

Docker armv7

$ docker run --rm -it --platform linux/arm/v7 ubuntu:22.04 cat /proc/cpuinfo
processor	: 0
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

processor	: 1
BogoMIPS	: 50.00
Features	: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs
CPU implementer	: 0x41
CPU architecture: 8
CPU variant	: 0x3
CPU part	: 0xd0c
CPU revision	: 1

@kinnison
Contributor

That certainly looks like docker is passing the host's cpuinfo through. This is the issue we had on reqwest (seanmonstar/reqwest#642), which eventually boiled down to lxc/lxcfs#553, and that is what fixed it in that case. I wonder if Docker needs equivalent help.
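The comparison suggested above can be sketched as a small shell check. This is only an illustration: the function names are made up, and it assumes (not verified in this thread) that "asimd" appears only in the AArch64 feature list, with 32-bit ARM kernels reporting "neon" instead.

```shell
#!/bin/sh
# Sketch: flag a container whose userland is 32-bit while the cpuinfo file
# advertises AArch64 features, i.e. the host's cpuinfo appears to leak through.
# Function names are illustrative; they are not part of rustup or Docker.

# Print the feature list of the first processor entry in a cpuinfo file.
cpu_features() {
    sed -n 's/^Features[[:space:]]*:[[:space:]]*//p' "$1" | head -n 1
}

# Exit 0 when the userland word size is 32 but the feature list contains
# "asimd" (assumed here to be an AArch64-only feature name).
cpuinfo_looks_leaked() {
    long_bit=$1      # e.g. the output of: getconf LONG_BIT
    cpuinfo_file=$2  # e.g. /proc/cpuinfo
    [ "$long_bit" = "32" ] || return 1
    case " $(cpu_features "$cpuinfo_file") " in
        *" asimd "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

Inside the container it could be called as cpuinfo_looks_leaked "$(getconf LONG_BIT)" /proc/cpuinfo, assuming getconf is available in the image.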

@rochdev

rochdev commented Mar 2, 2024

Using RUSTUP_USE_CURL=1 still hangs for me. Any other workarounds?

@rochdev

rochdev commented Mar 2, 2024

It does look like it's able to get a bit further when using RUSTUP_USE_CURL=1, but now it gets stuck at installing cargo:

info: installing component 'cargo'
verbose: creating temp directory: /root/.rustup/tmp/linv1qi09wy09f7v_dir

It stays there until the CI job eventually times out.
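Until the cause is known, one way to keep a hang like this from consuming the whole CI job is to bound the installer with coreutils timeout. A minimal sketch, assuming GNU timeout is installed in the image; run_with_deadline is a hypothetical helper name:

```shell
#!/bin/sh
# Sketch: run a command under a time limit and report when the limit is hit.
# GNU timeout exits with status 124 when it had to kill the command.
run_with_deadline() {
    secs=$1
    shift
    timeout "$secs" "$@"
    status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${secs}s: $*" >&2
    fi
    return "$status"
}

# Possible use (the 600 second limit is arbitrary):
#   curl --proto '=https' --tlsv1.2 -sSf -o rustup-init.sh https://sh.rustup.rs
#   run_with_deadline 600 sh rustup-init.sh -y --verbose
```

This does not fix anything; it only makes the failure fast and visible in the job log.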

@rami3l
Member

rami3l commented Mar 3, 2024

@rochdev Does our new 1.27 version (https://internals.rust-lang.org/t/seeking-beta-testers-for-rustup-1-27-0/20352) work for you?

@rochdev

rochdev commented Mar 4, 2024

@rami3l Not sure if I'm doing this right, but I added RUSTUP_UPDATE_ROOT=https://dev-static.rust-lang.org/rustup as an environment variable before running the command and I am still getting the same issue.

The command in question:

curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose --default-host armv7-unknown-linux-gnueabihf --default-toolchain nightly --component rust-src

Environment variables set before running the above command:

RUSTUP_USE_CURL = '1'
RUSTUP_TOOLCHAIN = 'nightly'
RUSTUP_UPDATE_ROOT = 'https://dev-static.rust-lang.org/rustup'
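One thing worth ruling out when such variables are set inline rather than in the CI environment: in a pipeline, a VAR=value prefix applies only to the command it precedes, so a variable meant for the installer has to be set on the sh side of the pipe (or exported beforehand). A small sketch with a made-up variable, DEMO_VAR:

```shell
#!/bin/sh
# DEMO_VAR is a made-up variable used only for this illustration.
unset DEMO_VAR

# Prefix on the left-hand command: the right-hand shell does not see it.
left_side() {
    DEMO_VAR=1 true | sh -c 'echo "${DEMO_VAR:-unset}"'
}

# Prefix on the right-hand command: the shell that runs the script sees it.
right_side() {
    true | DEMO_VAR=1 sh -c 'echo "${DEMO_VAR:-unset}"'
}

left_side   # prints "unset"
right_side  # prints "1"
```

Variables defined at the CI job level are inherited by both sides of the pipe, so this only matters for inline assignments.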

@rami3l
Member

rami3l commented Mar 4, 2024

@rochdev Thanks for your report!

The thing here is that we're currently considering the removal of the cURL backend (possibly as early as in 1.28), so it's a must that we minimize the number of functionalities being broken in that change.

I happen to have an ARM64 Mac so I'll probably be able to look into this issue more deeply.

@rochdev

rochdev commented Mar 4, 2024

> The thing here is that we're currently considering the removal of the cURL backend (possibly as early as in 1.28), so it's a must that we minimize the number of functionalities being broken in that change.

Without the cURL backend rustup hangs even sooner, basically before it starts downloading any dependencies. Using cURL allows it to at least get past the downloads, after which it hangs while trying to install cargo.

> I happen to have an ARM64 Mac so I'll probably be able to look into this issue more deeply.

I just tried locally on an M1 Mac and it works properly. The issue seems to be isolated to Linux aarch64 hosts.

@rami3l
Member

rami3l commented Mar 4, 2024

> I just tried locally on an M1 Mac and it works properly. The issue seems to be isolated to Linux aarch64 hosts.

I meant to say that docker machine on ARM64 Macs should also count as a Linux aarch64 host. I'll see if I can reproduce this issue over there.

Oops, ARMv7 support is not available on ARM64 Macs (https://news.ycombinator.com/item?id=27278019), my bad.

@rochdev

rochdev commented Mar 4, 2024

I tried to disable cURL and use the default reqwest backend instead with the 1.27 beta, and it also didn't change anything compared to before.

Here is the output:

info: downloading installer
info: profile set to 'default'
info: setting default host triple to armv7-unknown-linux-gnueabihf
verbose: creating update-hash directory: '/root/.rustup/update-hashes'
verbose: installing toolchain 'nightly-armv7-unknown-linux-gnueabihf'
verbose: toolchain directory: '/root/.rustup/toolchains/nightly-armv7-unknown-linux-gnueabihf'
info: syncing channel updates for 'nightly-armv7-unknown-linux-gnueabihf'
verbose: creating temp root: /root/.rustup/tmp
verbose: creating temp file: /root/.rustup/tmp/rmn_jarvvywfunwl_file
verbose: downloading file from: 'https://static.rust-lang.org/dist/channel-rust-nightly.toml.sha256'
verbose: downloading with reqwest

@rochdev

rochdev commented Mar 4, 2024

I tried to get the most minimal reproduction that I could, and I ended up with this which reproduces the issue:

FROM arm32v7/ubuntu:16.04

RUN apt-get update && apt-get -y install curl

RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y --verbose \
  --default-host armv7-unknown-linux-gnueabihf

@rochdev

rochdev commented Mar 4, 2024

For good measure I also tried the same Dockerfile but with Ubuntu 18.04, 20.04 and 22.04 and I see the exact same behaviour across all of them.

@djc
Contributor

djc commented Mar 6, 2024

Reported this against reqwest after checking with its maintainer that this is not a known issue:

seanmonstar/reqwest#2157

@rochdev

rochdev commented Mar 6, 2024

@djc I'm not sure it's actually an issue with reqwest though. While it hangs in reqwest when reqwest is used, it still hangs when cURL is used, only later, at the install step after the downloads have completed. So it seems to me like something that either rustup or Rust is doing doesn't work with that setup.

@djc
Contributor

djc commented Mar 6, 2024

@rochdev that's fair... would still like to figure out why reqwest fails to download here.

@rochdev

rochdev commented Mar 6, 2024

@djc For sure, the more eyes on this the better, but something tells me it might be the same thing that makes both hang 🤔 rustup hangs and reqwest hangs, both of which are written in Rust, yet cURL works perfectly every time. Maybe there is an issue with Rust itself, or with some library other than reqwest?

@djc
Contributor

djc commented Mar 7, 2024

@rochdev to be clear, it could definitely be that there is an issue in rustup still. If you're able to dig in more, that would be great. Maybe try enabling trace-level logging and see if you can pinpoint where the hang is happening?

@rochdev

rochdev commented Mar 7, 2024

@djc Can you provide more detailed steps on how to capture the additional information you're looking for? I tried using RUST_LOG=trace but the output is the same.

@djc
Contributor

djc commented Mar 7, 2024

Try using RUST_LOG=trace?

@rochdev

rochdev commented Mar 7, 2024

@djc Sorry yes that's what I tried, edited.

@djc
Contributor

djc commented Mar 8, 2024

@rami3l do you know if release builds are built with otel support built in?

@rami3l
Member

rami3l commented Mar 8, 2024

@djc No, I don't believe so. I'm afraid a custom build is required:

rustup/ci/run.bash

Lines 11 to 26 in b4b9a2e

FEATURES=('--no-default-features' '--features' 'curl-backend,reqwest-backend,reqwest-default-tls')
case "$(uname -s)" in
*NT* ) ;; # Windows NT
* ) FEATURES+=('--features' 'vendored-openssl') ;;
esac
case "$TARGET" in
# these platforms aren't supported by ring:
powerpc* ) ;;
mips* ) ;;
riscv* ) ;;
s390x* ) ;;
aarch64-pc-windows-msvc ) ;;
# default case, build with rustls enabled
* ) FEATURES+=('--features' 'reqwest-rustls-tls') ;;
esac
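To make the effect of that TARGET case statement concrete for this issue, here is a sketch that wraps the same logic in a function and evaluates it for an armv7 target. It is adapted from the excerpt above rather than copied from the repository: the function name is hypothetical, the uname-dependent vendored-openssl branch is left out, and a plain string replaces the bash array so it runs under POSIX sh.

```shell
#!/bin/sh
# Sketch adapted from the quoted ci/run.bash excerpt; features_for_target is
# a hypothetical name and the uname-dependent branch is omitted.
features_for_target() {
    features='--no-default-features --features curl-backend,reqwest-backend,reqwest-default-tls'
    case "$1" in
        # listed in the excerpt as platforms that ring does not support
        powerpc* | mips* | riscv* | s390x* | aarch64-pc-windows-msvc ) ;;
        # default case, build with rustls enabled
        * ) features="$features --features reqwest-rustls-tls" ;;
    esac
    printf '%s\n' "$features"
}

features_for_target armv7-unknown-linux-gnueabihf
features_for_target s390x-unknown-linux-gnu
```

The armv7 triple falls through to the default case, so by this logic a build for that target has the rustls feature enabled, while the s390x triple does not. Whether that is related to the hang was not established in this thread.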

@djc
Contributor

djc commented Mar 11, 2024

@rochdev would you be able to build your own with --features otel and try it with that?

@rochdev

rochdev commented Mar 11, 2024

> @rochdev would you be able to build your own with --features otel and try it with that?

@djc Where do I pass --features? Assume I know nothing about Rust, because I don't know all that much 😅

@djc
Contributor

djc commented Mar 11, 2024

You'd have to clone this repo, run cargo build --release --target <something-armv7> --features otel and somehow splice the resulting binary (from target/<something-armv7>/release, since an explicit --target adds the triple to the output path) into your Docker stuff.
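Spelled out, those steps might look like the sketch below. It only prints the commands instead of running them. The assumptions are the armv7 triple used earlier in this thread, that a cross-compilation setup for that triple already exists, and that the installer binary is named rustup-init. With an explicit --target, Cargo places artifacts under target/<triple>/release.

```shell
#!/bin/sh
# Sketch only: prints the build steps rather than executing them.
# Assumptions: the armv7 triple from earlier in this thread, and a binary
# named rustup-init. Helper names are illustrative.
TRIPLE=${TRIPLE:-armv7-unknown-linux-gnueabihf}

# With an explicit --target, Cargo writes artifacts to target/<triple>/release.
artifact_path() {
    printf 'target/%s/release/rustup-init\n' "$1"
}

print_build_steps() {
    echo "git clone https://github.com/rust-lang/rustup.git && cd rustup"
    echo "cargo build --release --target $1 --features otel"
    echo "# then copy $(artifact_path "$1") into the Docker image"
}

print_build_steps "$TRIPLE"
```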

@rochdev

rochdev commented Mar 11, 2024

@djc Was there any change recently in the dev version? I can't seem to be able to reproduce even re-running builds that were clearly failing 100% of the time.

@rami3l
Member

rami3l commented Mar 12, 2024

> @djc Was there any change recently in the dev version? I can't seem to be able to reproduce even re-running builds that were clearly failing 100% of the time.

@rochdev We didn't do anything explicit on our side regarding reqwest, at least not that I know of. It could probably be a direct/transitive dependency update though.

Maybe you can use rustup --version to pinpoint for us the exact commit you built Rustup from? For example, I have this output on my machine:

> rustup --version
rustup 1.27.0+1 (46327d7ff 2024-03-11) dirty 1 modification
...

... and according to your report, neither v1.26.0 nor v1.27.0 (beta) is working for you. Is that correct?
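For reports like this, the short commit hash can be pulled out of that version line with a small helper. A sketch, assuming the line keeps the "name version (hash date)" shape shown in this thread; rustup_commit is a hypothetical helper, not a rustup subcommand:

```shell
#!/bin/sh
# Sketch: extract the short commit hash from a `rustup --version` style line.
# Assumes the "name version (hash date) ..." layout seen in this thread.
rustup_commit() {
    printf '%s\n' "$1" | sed -n 's/^rustup[^(]*(\([0-9a-f]*\) .*/\1/p'
}

rustup_commit 'rustup 1.27.0+1 (46327d7ff 2024-03-11) dirty 1 modification'
# prints 46327d7ff
```

The same pattern also handles the rustup-init line from the original report, giving bb60b1e89.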

@rochdev

rochdev commented Apr 17, 2024

A month later I was never able to reproduce again, so I don't know what was causing the issue for us. It just started working properly one day.

@rami3l
Member

rami3l commented Apr 17, 2024

> A month later I was never able to reproduce again, so I don't know what was causing the issue for us. It just started working properly one day.

@rochdev In that case I'm closing this issue as incomplete. Please feel free to let us know if something goes wrong again on your end.

Have a nice day!

rami3l closed this as not planned on Apr 17, 2024
rami3l added the incomplete label and removed the bug label on Apr 17, 2024