Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasm32-wasi getpid() from wasi-libc #124554

Closed
wants to merge 1 commit into from

Conversation

youknowone
Copy link
Contributor

No description provided.

@rustbot
Copy link
Collaborator

rustbot commented Apr 30, 2024

r? @workingjubilee

rustbot has assigned @workingjubilee.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-wasi Operating system: Wasi, Webassembly System Interface S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Apr 30, 2024
@workingjubilee
Copy link
Member

I don't think returning a useless value is any more conformant than the current implementation.

@youknowone
Copy link
Contributor Author

As I understand, WASI cannot provide getpid ever.
In that case, there could be 3 states of getpid in std wasi support.

  1. No getpid function. I actually expected this state until I meet panic because WASI actually doesn't have one.
  2. A dummy function with panic. It implicitly indicates Rust std defines getpid but the platform doesn't support it. This is current state.
  3. A dummy function provides platform function. This is the patch state.

Currently Rust std doesn't expose getpid itself but it does through std::process::id().
If it was a direct expose of function something like std::os::getpid(), I feel like 1 makes more sense. The function just doesn't exist in the platform. On the other hand std::process::id() defines how Rust makes abstract layer on it. It can define a perfectly working process id(3), or still nothing(1).
Maybe I didn't get how panic in std::process::id() is better than no-function or working dummy. This probably is because I don't understand the policy of Rust std that well.

Thank you for the review! Please feel free to close PR if it doesn't fit in.

@workingjubilee
Copy link
Member

workingjubilee commented May 2, 2024

Hmm.

Basically, the premise of my understanding is that in C, we can't realistically do something spiteful and cruel like abort the process every time someone calls a function like getpid. The language has no meaningful error-handling mechanisms. Whereas in Rust, programs expect to be no-sold when they poke things they shouldn't, because we do have (sparingly-used) error-handling mechanisms even for this case.

getpid is an edge-case where it's not clear that someone should ever expect it to mess them up, but in general it's preferable to panic because the Rust code has built-in diagnostics for this event due to our backtrace functionality. Perhaps that doesn't work well on wasi? But regardless, despite panics being truly exceptional, we have many more tools for dealing with them in Rust, including in some cases even using catch_unwind, or etc.

Currently, I'm trying to think of a situation where someone

  • calls getpid() and receives their false answer
  • does something where using that false answer is a good idea

Basically, I can't see how this doesn't end in "you do something else that's invalid on wasi", which is the case where we do in fact want to panic.

@workingjubilee
Copy link
Member

I am also slightly concerned about future evolution smoothness, if wasi ever legalizes getpid() and makes it meaningful, while people may have somehow relied on it returning 1. It would be ideal if they don't have to update their shit because they simply never had a valid codepath through getpid().

However, now that the wasi targets are known as preview targets, I think this is actually nominally okay.

cc @sunfishcode (do you want to be listed as a wasi maintainer?) @alexcrichton

I am interested to hear your perspectives on this, as you obviously have a stronger sense for why getpid was chosen to return this wrong answer and whether this API will ever be meaningful.

@alexcrichton
Copy link
Member

These are all good points, and thanks for the PR @youknowone! Personally I'd be inclined to merge this myself. The main reason for this is that calls to functions like getpid are often really deep down in a stack and are difficult to insert logic of "if wasm don't call it but everywhere else call it". What I'm thinking of is something like a logging library that attaches a pid to its JSON records or something like there, where it's purely informational in case it's useful debugging. It'd have to be understood that pids don't exist in wasm, but that's sort of domain-specific knowledge that would kick in once the debugging starts.

As for the history of the current implementation, I don't remember precisely but I believe the line of reasoning was probably:

  • If std::process::id returned an error, then WASI would return an error.
  • Otherwise the next-most-conservative thing was to panic! since folks probably won't rely on panicking behavior.

For now it's not actually possible to recover from panics in wasm due to the lack of default support for exception handling, meaning that if you're using a library which internally calls this function you're out-of-luck and won't be able to make progress.

Another point worth mentioning is that I don't think that Rust's std and wasi-libc should have divergent behavior here. They're intended to reflect the same idioms and such (similar to glibc and std-on-Linux for example) so having one return a constant and the other panic is something that should be resolved one way or another (e.g. by making Rust match C or the other way around).

More-or-less I don't think there's an obviously correct answer here. I'd lean towards matching wasi-libc because of how this low-level function seems likely to at least be used in an informational capacity deep in a library that's hard to excise for WASI.

@youknowone
Copy link
Contributor Author

I'd like to ask if std::process::id() is expected to be aligned with libc or not. If it does, matching wasi-libc will have benefits without penalty. This patch calls getpid though it is a constant return in wasi-libc. It helps Rust std::process::id() to be aligned to wasi-libc even when it changes the behavior in future.
If Rust has a plan to develop std without libc in future, this might be an unexpected incompatibility source between libc version and no-libc version. Not sure how much it will be though.

The main reason for this is that calls to functions like getpid are often really deep down in a stack

wasi-libc seemed to add it for the same reason. In rust, this would be a slightly weaker reason than wasi-libc.

getpid has a long history in libc, and was more widely used, e.g. a lot of file locking depends on it. So libc users will really want wasi-libc to have it. In rust, a part of this cases would require libc::getpid() to exist rather than std::process::id().

@workingjubilee
Copy link
Member

workingjubilee commented May 3, 2024

For now it's not actually possible to recover from panics in wasm due to the lack of default support for exception handling, meaning that if you're using a library which internally calls this function you're out-of-luck and won't be able to make progress.

Oh, that's exceedingly fun then.

Speaking of logging, I did start to think of that logging library example, and... you know, is there any chance wasi libc or something could generate a still-fake but more... "pid-looking" pid? Because if I was using logging infra distributed across wasm instances, and I keyed on the recorded pid for some reason, I would falsely correlate all of them, wouldn't I?

I suppose the alternative of 1 is more "obviously wrong" at least.

@alexcrichton
Copy link
Member

Hm so to me the plot is now thickening. Support for getpid in wasi-libc was added in WebAssembly/wasi-libc#243. That shows that this PR may not actually work as-is, so I wanted to ask @youknowone have you tested this PR locally? I ask because it looks like the getpid symbol is not in libc.a but rather a separate "you must opt-in to using this" library libwasi-emulated-getpid.a. In that sense I would change my opinion here in that Rust sort-of matches what libc does already. Rust doesn't have a great ability to remove the symbol entirely, so the next-best-thing (sort of) was a panic.

Given that wasi-libc by default doesn't support getpid I'd say that Rust should mirror that. Unfortunately Rust doesn't have a way to opt-in like wasi-libc does. To perhaps help better understand/resolve that, @youknowone could you elaborate a bit on the rationale for this PR? For example was there a library calling std::process::id() you'd like to get working?

@youknowone
Copy link
Contributor Author

Oh, no. I screwed up testing. I didn't write a small test, I tested with the whole of my project. I thought the patch was working, but I realized I was using a workaround-patched version 🤦. I am really sorry for making trouble. With original code, no compile error but a runtime error.

To perhaps help better understand/resolve that, @youknowone could you elaborate a bit on the rationale for this PR? For example was there a library calling std::process::id() you'd like to get working?

Not a library but a project I am working on. That can be easily patched, so no problem.

@alexcrichton
Copy link
Member

To confirm, with this PR I would expect a link-time error when building with a crate that uses std::process::id due to the symbol only existing in a wasi-libc library that's not linked in. Is that what you're seeing? Otherwise though without this PR a panic would be expected yeah.

If this is easy to work around for you then I think it might be best to stick to that course. One day we'll hopefully have the ability to easily remove the id function itself from the Rust standard library just for WASI!

@workingjubilee
Copy link
Member

Cool! With the libc revert up then it looks like this thread has served its purpose and I will be closing this PR then!

@youknowone youknowone deleted the wasi-getpid branch May 4, 2024 16:54
@youknowone
Copy link
Contributor Author

@alexcrichton No link-time error during build time. I don't see any error during build. I see runtime error, but not when running the code, but immediately after running wasmer:

error: Instantiation failed
╰─▶ 1: Error while importing "env"."getpid": unknown import. Expected Function(FunctionType { params: [], results: [I32] })

@alexcrichton
Copy link
Member

Ah right yes excellent point, I forgot that unresolved symbols aren't linking errors but end up getting imported. Makes sense though, as that means that the symbol was indeed unresolved (and the environment shouldn't have to provide the symbol)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-wasi Operating system: Wasi, Webassembly System Interface S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants