RFC: Promote riscv64gc-unknown-linux-gnu to Tier-1 (without host tools) #3707

@robin-randhawa-sifive robin-randhawa-sifive commented Oct 3, 2024

This RFC outlines the case for promoting the Rust riscv64gc-unknown-linux-gnu target to Tier-1 (without host tools) status.

Shout out to @Hoverbear, @danielsilverstone-ct for their support.

Rendered


Not promoting the target could lead to a situation where the `riscv64gc-unknown-linux-gnu` tests are no longer passing, and this could impact users.

Anecdotally, not having the Tier 1 'badge' has been seen to become an obstacle to increasing mindshare in Rust for this target. Organisations tend to associate a Tier 1 categorisation with better quality, suitability for key projects, longevity etc. With a reasonably justified Tier 1 'badge' in place, the likelihood is that such organisations will tend to pick up and promote the use of Rust in production.
Member

This should also discuss the option of having a vendor other than the rust-lang org provide the equivalent of Tier 1 support, e.g. by running CI externally and communicating equivalence properly.

That way even host tools could be tested without the infra burden of maintaining custom github runners.

Member

That roughly happens with additional arches on Linux distros already.

@ehuss ehuss added T-compiler Relevant to the compiler team, which will review and decide on the RFC. T-infra Relevant to the infrastructure team, which will review and decide on the RFC. T-release Relevant to the release team, which will review and decide on the RFC. labels Oct 3, 2024
@programmerjake
Member

I think there should be an explicit note in the tier list of targets that host tools are provided but they are tier 2.

@slanterns
Contributor

@clarfonthey
Contributor

clarfonthey commented Oct 4, 2024

> I think there should be an explicit note in the tier list of targets that host tools are provided but they are tier 2.

Yeah, this greatly confused me until I saw this clarification. I was thinking that the host tools were being removed, but no, they're just not being promoted.

Honestly, I think that maybe, it might be worth just separately classifying targets and host tools on entirely different lists at least when it comes to tiers, to avoid this kind of confusion, since it appears that they can truly be independently supported. Although I guess that it wouldn't ever make sense to have host tools at a higher tier than a target because, why would you allow building from something and not building for it?

To clarify, I mean something along the lines of classifying `riscv64gc-unknown-linux-gnu` as "tier 1 target, tier 2 host tools," rather than "tier 1 without host tools." Or, perhaps just having two separate lists of supported targets and supported host tools.
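
A minimal sketch of what "two independent tiers" could look like as data, assuming a hypothetical `TargetSupport` record (the class and field names are illustrative and not part of any Rust tooling). The validation encodes the observation above that host tools should never sit at a higher tier than the target they run on.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetSupport:
    """Hypothetical record: one tier for the target, one for its host tools."""

    triple: str
    target_tier: int      # 1 is the highest tier, 3 the lowest
    host_tools_tier: int  # tier of rustc/cargo built to run on this target

    def __post_init__(self) -> None:
        for tier in (self.target_tier, self.host_tools_tier):
            if tier not in (1, 2, 3):
                raise ValueError(f"unknown tier: {tier}")
        # A smaller number means a higher tier, so the host tools must not
        # have a smaller number than the target itself.
        if self.host_tools_tier < self.target_tier:
            raise ValueError("host tools cannot outrank their target")

    def describe(self) -> str:
        return (f"{self.triple}: tier {self.target_tier} target, "
                f"tier {self.host_tools_tier} host tools")


riscv = TargetSupport("riscv64gc-unknown-linux-gnu", target_tier=1, host_tools_tier=2)
print(riscv.describe())
```

This sketch does not model targets that ship no host tools at all; that case would need an optional field or a separate list, as suggested above.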

Comment on lines +19 to +21
During discussions with users and partners, the [RISE project](https://riseproject.dev/) has received feedback from users that they would like to use Rust, but they are hesitant due to the Tier 2 status.

In the last 2 quarters, good progress has been made in understanding and filling the gaps that remain in the path to attaining [Tier 1 (without host tools)](https://doc.rust-lang.org/nightly/rustc/target-tier-policy.html#tier-1-target-policy) status for this target.
Member

If this is truly the cause, then how do you explain that aarch64-apple-darwin had been tier 2 for several years and only very recently moved to tier 1, yet people felt absolutely confident building on it (Zed comes to mind)? That may be because there were other tier 1 aarch64 and darwin targets, but that only makes this problem worse, doesn't it? A tier 1 riscv target will be the first.

I will have to defer to more experienced folks to explain that.


In general, it should be uncomplicated for contributors to build for and use a `riscv64gc-unknown-linux-gnu` emulator like `qemu`, `docker`, or `lima`. Additionally, the platform is a `*-unknown-linux-gnu` target which is generally quite well understood, contributors do not need to learn what could be an otherwise unfamiliar operating system.

This target does not place significant burdens on the project that would not be present on any other target.
Member

I am not so sure.

Everyone will be expected to be able to easily contribute to maintaining RISCV targets, because all PRs are blocked on a failure for a tier 1 target's tests. Having to set up QEMU, etc. at all in order to debug a contribution failing on RISCV will be a significant and novel additional burden for many contributors, above and beyond what many will expect for contributions, and may easily block PRs. The reason the tier 1 targets that exist now are acceptable (and there is argument that some are rapidly becoming not!) is because it is relatively easy to obtain access to these machines by sheer dint of their commonality, and they usually are okay at performance. The entire reason this proposal exists in its current form is because this target, however, does not fulfill either criterion.

Yes, it is "merely software", but in this case, running the test suite has to be made as turnkey as possible: none of the existing test infra that runs in CI should be taken as "good enough". No other tier 1 target will have to always be emulated in order to effectively run its test suite.

Fortunately, there has been work on making testing easier, e.g. for the wasm targets, by supporting a "runtool" in our test infrastructure. This makes it possible to simply use `x.py test --target wasm32-wasip1` if one sets `WASI_SDK_PATH`. So, it should be relatively easy to make testing this target from a different host machine about as simple as `./x.py test --target riscv64gc-unknown-linux-gnu`, and I would expect that before this is promoted to tier 1.
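
To make the comparison concrete, here is a small sketch that assembles both invocations without running them. The helper name `xpy_test_invocation` and the SDK path `/opt/wasi-sdk` are placeholders chosen for illustration; only the `x.py test --target ...` form and the `WASI_SDK_PATH` variable come from the comment above.

```python
import os
import shlex


def xpy_test_invocation(target, extra_env=None):
    """Return (argv, env) for `./x.py test --target <target>`; nothing is run."""
    argv = ["./x.py", "test", "--target", target]
    env = dict(os.environ)
    env.update(extra_env or {})
    return argv, env


# The wasm case described above; the SDK path is a placeholder.
wasm_argv, wasm_env = xpy_test_invocation(
    "wasm32-wasip1", {"WASI_SDK_PATH": "/opt/wasi-sdk"}
)

# The hoped-for RISC-V equivalent, with no extra environment setup.
riscv_argv, riscv_env = xpy_test_invocation("riscv64gc-unknown-linux-gnu")

print(shlex.join(wasm_argv))
print(shlex.join(riscv_argv))
```

From the root of a checkout, the pair could be handed to `subprocess.run(argv, env=env, check=True)`. Whether the RISC-V form works today depends on the emulator setup discussed in this thread.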

Member

> In general, it should be uncomplicated for contributors to build for and use a `riscv64gc-unknown-linux-gnu` emulator like `qemu`, `docker`, or `lima`.

+1 to what Jubilee said. If as a compiler contributor I have to build an emulator to run the test suite (especially to bless tests that fail in PR CI or full CI) for riscv-specific tests or revisions, then that is a very significant burden. For current Tier 1 targets I'm lucky enough to have access to both Windows and Linux via dev-desktop and I don't need to build an emulator for those targets. Even apple-specific failures can already be a pain. I don't want to have to litter `//@ ignore-riscv` into our test suites more than already exists if there are test failures that are blocking PRs but neither the PR author nor reviewer can bless the test easily.

I have to also note that many of our tests are `//@ ignore-cross-compile` and are not exercised on the cross-compiled target when running tests for various reasons. This means that if a Tier 1 target is emulated it will likely receive a lot less test coverage than you might expect for a Tier 1 target.
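
The coverage concern can be illustrated with a toy directive scanner. This is a rough sketch and not compiletest's actual parser: `ignore_directives`, `is_skipped`, and the three sample files are made up for illustration, and real directive matching (architecture names, trailing comments) is more involved.

```python
def ignore_directives(source):
    """Collect `//@ ignore-*` directives from the text of one test file."""
    found = set()
    for line in source.splitlines():
        stripped = line.strip()
        if stripped.startswith("//@"):
            directive = stripped[3:].strip()
            if directive.startswith("ignore-"):
                # Keep only the directive name, dropping any trailing words.
                found.add(directive.split()[0])
    return found


def is_skipped(source, *, cross_compiled, arch):
    """Decide whether this toy model would skip the test."""
    directives = ignore_directives(source)
    if cross_compiled and "ignore-cross-compile" in directives:
        return True
    return f"ignore-{arch}" in directives


# Hypothetical test files, keyed by name.
sample_tests = {
    "a.rs": "//@ run-pass\nfn main() {}\n",
    "b.rs": "//@ run-pass\n//@ ignore-cross-compile\nfn main() {}\n",
    "c.rs": "//@ ignore-riscv\nfn main() {}\n",
}

# Tests that still run on an emulated, cross-compiled riscv target.
emulated = sorted(
    name for name, src in sample_tests.items()
    if not is_skipped(src, cross_compiled=True, arch="riscv")
)
# Tests that run natively on an x86_64 host.
native = sorted(
    name for name, src in sample_tests.items()
    if not is_skipped(src, cross_compiled=False, arch="x86_64")
)

print(emulated)
print(native)
```

In this toy set only one of the three files survives on the emulated target, while all three run natively, which is the shape of the gap described above.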

Member

@the8472 the8472 Oct 10, 2024

> Having to set up QEMU, etc. at all in order to debug a contribution failing on RISCV will be a significant and novel additional burden for many contributors

I think of apple OSes as a far worse target than that. There isn't even emulation available. For windows microsoft offers VMs. For other CPUs there's QEMU at least. Apple is worse than all of that.

Despite that I don't have to care about apple because there are dedicated target maintainers. I'd expect a new tier 1 target to get the same level of care from its dedicated maintainers, taking the burden off everyone else.

That said, good documentation and anything that would streamline the emulation setup and testing would be welcome.

Member

@kennytm kennytm Oct 10, 2024

> I think of apple OSes as a far worse target than that. There isn't even emulation available. For windows microsoft offers VMs. For other CPUs there's QEMU at least. Apple is worse than all of that.

True, you can't (legally) emulate the Apple targets. On the other hand, the user base developing for Apple platforms is large enough that, even if you (the contributor) don't have access, it is easy to find a collaborator who knows how a Mac or iPhone works to help. Same story for Windows and Linux. The point is not how easy it is to emulate/simulate the target, but how easy it is for an average contributor to understand the target-specific issues.

(Personally, without checking the maintenance status, I find it surprising that RISC-V can be promoted before WASM32.)

Member

@workingjubilee workingjubilee Oct 10, 2024

Yes, what I was eliding here is that e.g. common RISCV hardware doesn't work with common Linux distros because of insufficient upstreaming of necessary kernel patches. Often, the manufacturer doesn't even ship a sufficiently updated kernel! So if a contributor DOES get a RISCV device and sets it up, they may have trouble running rustc on it. Meanwhile, Apple devices at least give me the courtesy of booting and running rustup out-of-the-box.

@workingjubilee, we'll get back ASAP with the exact host targets we've used. I would be surprised if the `qemu-system-riscv64` invocation from any Tier-1 host would be any different but yes, the fundamental invocation needs to be known. I would assume that that would be reasonably discernible from the CI report itself when a failure occurs and therefore be easily invokable 'locally' but we'll ensure that that gets suitable coverage, either in the RFC text itself or some suitable proxy.

Your point is taken all the same (although speaking for myself I wouldn't use some of the terms you've chosen to make your point but that's just me).

Thanks all the same and please stand by.

riscv64gc-unknown-linux-gnu from an x86_64-unknown-linux-gnu host?

I've run `DEPLOY=1 ./src/ci/docker/run.sh riscv64gc-gnu` (which is exactly what the CI uses) from aarch64-darwin and x86_64-unknown-linux-gnu several times over the last months. This uses the QEMU support that Rust already has.

I've not run the tests from a Windows host.

The experience does leave some things to be desired: for example, it's a bit hard to run a specific test filter using `run.sh`, and there are quite a few container build steps. I think it's fair to say this process is not super obvious and may present barriers to contributors. I also think this problem is somewhat shared with other targets (such as armhf).

Locally, most of my RISC-V development happens within a lima VM. All tests pass even using RISC-V host tools here. However this is not the most common tool. It's (almost annoyingly) fairly easy to 'just get' a RISC-V container with the tool.

Running a RISC-V VM in QEMU is generically documented, for example, here. Users would need to have (as it documents) opensbi and some other usual QEMU dependencies. I believe a user could set `target.riscv64gc-unknown-linux-gnu.qemu-rootfs` in their `config.toml`, and have it work via `x.py`... if they had everything set up correctly. (The problem is: getting there)

I believe if a user has docker and binfmt configured correctly (sadly, I lack a Linux host with binfmt support) it should be possible to run `docker run --platform linux/riscv64 --rm -ti docker.io/riscv64/ubuntu:24.04` and get a RISC-V container.

It's, unfortunately, not quite the `./x.py test --target "${TARGET}"` we discussed recently.
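
For reference, the `qemu-rootfs` setting mentioned above would sit in a per-target table of `config.toml`. The sketch below only renders that fragment as text; the helper name `qemu_rootfs_fragment` and the path `/tmp/rootfs` are placeholders, and whether the setting is sufficient on its own is exactly the open question raised in this comment.

```python
def qemu_rootfs_fragment(target, rootfs):
    """Render a config.toml fragment that sets qemu-rootfs for one target.

    The path is written verbatim, so it must not contain quotes or
    backslashes that would need TOML escaping.
    """
    return f'[target.{target}]\nqemu-rootfs = "{rootfs}"\n'


# "/tmp/rootfs" is a placeholder for a prepared RISC-V root filesystem.
fragment = qemu_rootfs_fragment("riscv64gc-unknown-linux-gnu", "/tmp/rootfs")
print(fragment, end="")
```

Preparing the root filesystem and the QEMU dependencies is the part this fragment does not cover.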

While I haven't tested `docker run --platform linux/riscv64 --rm -ti docker.io/riscv64/ubuntu:24.04` exactly, I have made use of binfmt via a [toolbox](https://docs.fedoraproject.org/en-US/fedora-silverblue/toolbox/) container, which required only minimal setup for binfmt to work for RISC-V. All from an x86_64-unknown-linux-gnu host.

My usual testing over the past few months for RISC-V has been a RISC-V VM, running Ubuntu, which I then test via rustc's remote testing; I've found this to be the most pain-free approach for me. It only requires a disk image and a readily available QEMU command, both of which can be found via Ubuntu's wiki.

I'll also note that I'm running Fedora Linux, and despite running a lot of this via ubuntu, it's an almost entirely seamless experience.

Member

All of that could also be done for many of the other tier 2 targets, including arm-unknown-linux-gnueabihf that is already at the same level of CI testing that is being proposed for RISC-V, where armhf-gnu runs library tests through qemu. So to risk a slippery-slope argument, I'm not sure why we would promote RISC-V and not ARM, or potentially many others.

Member

> So to risk a slippery-slope argument, I'm not sure why we would promote RISC-V and not ARM, or potentially many others.

Well, `riscv64gc-unknown-linux-gnu` does have 4 maintainers listed in https://doc.rust-lang.org/nightly/rustc/platform-support/riscv64gc-unknown-linux-gnu.html, satisfying the "The target maintainer team must include at least 3 developers" requirement, while `arm-unknown-linux-gnueabihf` does not even have a target-specific doc.

# Unresolved questions
[unresolved-questions]: #unresolved-questions

No unresolved questions or issues remain.
Member

@workingjubilee workingjubilee Oct 3, 2024

I have a few:

  • What is RISE's position about Qualcomm's proposal that the C extension... part of the proposed target definition... be dispreferred next to a different way of handling instruction packeting? Perhaps not all backers of RISE agree, but RISE Project does have Qualcomm as a member, so...
  • In general, the RISCV spec still seems to have evolutionary growing pains, like Zicsr and Zifencei. What'll the future hold? Are we sure that this is actually going to be the target everyone's going to want us to have moved to tier 1 even 5 years from now?

Good points, again. Thanks.

For the first one: RISE's position matches the decision made by RVIA which is that the C extension is mandatory for RVA profiles. Some vendors may continue to disagree and that is just fine - which is the very point of an open ISA. However the majority opinion stands. We are well beyond any ripples as a result. Major OS vendors et al have internalised this position and that is the way things have moved.

For the second: The ratification ready RVA23 and RVB23 profiles fundamentally aim to promote the evolution of new extensions while drawing a line in the sand for those extension 'collections' that are deemed mandatory to service specific market vertical requirements (mobile, datacenter, et al).

The evolutionary growing pains you correctly allude to are in fact encouraged so long as the base mandate stands, which is the basis of commitments made by Google et al in so far as Android goes.

My personal take - FWIW - is that a lot of thought has been put into the definition of these profiles with robust discussion with a very broad and diverse collection of industry reps - a luxury that the incumbent ISAs do not have - and I think that this shall result in net goodness.

Happy to discuss further of course.

@raw-bin

raw-bin commented Oct 10, 2024

> I think there should be an explicit note in the tier list of targets that host tools are provided but they are tier 2.

Thanks. We will follow through as appropriate.

@raw-bin

raw-bin commented Oct 10, 2024

> Better to link the previous Zulip discussion here: https://rust-lang.zulipchat.com/#narrow/stream/131828-t-compiler/topic/Imminent.20RFC.20PR.3A.20riscv64gc-unknown-linux-gnu.20to.20Tier-1.

Thanks. We will follow through as appropriate.
