Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next #133184

Merged

Conversation

osiewicz
Copy link
Contributor

@osiewicz osiewicz commented Nov 18, 2024

When upgrading Zed to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.

@rustbot
Copy link
Collaborator

rustbot commented Nov 18, 2024

r? @Noratrieb

rustbot has assigned @Noratrieb.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@rustbot rustbot added O-wasi Operating system: Wasi, Webassembly System Interface S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels Nov 18, 2024
@rust-log-analyzer

This comment has been minimized.

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82
Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81;
the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously.
With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer.
Previously we'd only terminate the loop if readdir returned 0.
@osiewicz osiewicz force-pushed the wasm-fix-infinite-loop-in-remove-dir-all branch from 954f70d to 529aae6 Compare November 18, 2024 18:19
} else {
self.cookie = Some(cookie);
}
self.cap = self.buf.len();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this is correct. In case readdir doesn't fill the entire buffer because there aren't many entries, this will still set the cap to the full buffer, causing us to overflow into the unfilled part and read garbage, since the data below will include it.
Or am I misunderstanding this function? It is very confusing, so I wouldn't be surprised.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right in that self.cap should be bytes. I think though we would never read unfilled part, as readdir will read up to self.buf.len(), and unfilled Vec capacity is always filled out with 0s.
But yeah, this function as a whole if kinda confusing. I've tried to change as little of it as possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh, changing that condition reintroduces the original hang.. O_o

Copy link
Member

@Noratrieb Noratrieb Nov 25, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like the best way to fix this is to rewrite it from scratch, with some more types and enums (explicit state machine maybe?)

@Noratrieb Noratrieb added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Nov 24, 2024
@osiewicz osiewicz requested a review from Noratrieb December 3, 2024 18:27
@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Dec 3, 2024
Copy link
Member

@Noratrieb Noratrieb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, this is a lot better and seems correct

@Noratrieb
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Dec 7, 2024

📌 Commit f4ab982 has been approved by Noratrieb

It is now in the queue for this repository.

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Dec 7, 2024
workingjubilee added a commit to workingjubilee/rustc that referenced this pull request Dec 8, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 8, 2024
…kingjubilee

Rollup of 5 pull requests

Successful merges:

 - rust-lang#131669 (lint: change help for pointers to dyn types in FFI)
 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133265 (Add a range argument to vec.extract_if)
 - rust-lang#133456 (Add licenses + Run `cargo update`)
 - rust-lang#133767 (Add more info on type/trait mismatches for different crate versions)

Failed merges:

 - rust-lang#133522 (Don't suggest restricting bound with unstable traits on stable and mention it's unstable on nightly)

r? `@ghost`
`@rustbot` modify labels: rollup
GuillaumeGomez added a commit to GuillaumeGomez/rust that referenced this pull request Dec 9, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 9, 2024
…llaumeGomez

Rollup of 9 pull requests

Successful merges:

 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133955 (Pass the arch rather than full target name to windows_registry::find_tool)
 - rust-lang#133967 ([AIX] Pass -bnoipath when adding rust upstream dynamic crates)
 - rust-lang#133970 ([AIX] Replace sa_sigaction with sa_union.__su_sigaction for AIX)
 - rust-lang#133980 ([AIX] Remove option "-n" from AIX "ln" command)
 - rust-lang#133996 (Move most tests for `-l` and `#[link(..)]` into `tests/ui/link-native-libs`)
 - rust-lang#134017 (Don't use `AsyncFnOnce::CallOnceFuture` bounds for signature deduction)
 - rust-lang#134023 (handle cygwin environment in `install::sanitize_sh`)
 - rust-lang#134053 (rustdoc: rename `issue-\d+.rs` tests to have meaningful names (part 10))

r? `@ghost`
`@rustbot` modify labels: rollup
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Dec 9, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
jieyouxu added a commit to jieyouxu/rust that referenced this pull request Dec 9, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 9, 2024
Rollup of 8 pull requests

Successful merges:

 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133967 ([AIX] Pass -bnoipath when adding rust upstream dynamic crates)
 - rust-lang#133970 ([AIX] Replace sa_sigaction with sa_union.__su_sigaction for AIX)
 - rust-lang#133980 ([AIX] Remove option "-n" from AIX "ln" command)
 - rust-lang#133996 (Move most tests for `-l` and `#[link(..)]` into `tests/ui/link-native-libs`)
 - rust-lang#134017 (Don't use `AsyncFnOnce::CallOnceFuture` bounds for signature deduction)
 - rust-lang#134023 (handle cygwin environment in `install::sanitize_sh`)
 - rust-lang#134053 (rustdoc: rename `issue-\d+.rs` tests to have meaningful names (part 10))

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 9, 2024
Rollup of 8 pull requests

Successful merges:

 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133967 ([AIX] Pass -bnoipath when adding rust upstream dynamic crates)
 - rust-lang#133970 ([AIX] Replace sa_sigaction with sa_union.__su_sigaction for AIX)
 - rust-lang#133980 ([AIX] Remove option "-n" from AIX "ln" command)
 - rust-lang#133996 (Move most tests for `-l` and `#[link(..)]` into `tests/ui/link-native-libs`)
 - rust-lang#134017 (Don't use `AsyncFnOnce::CallOnceFuture` bounds for signature deduction)
 - rust-lang#134023 (handle cygwin environment in `install::sanitize_sh`)
 - rust-lang#134053 (rustdoc: rename `issue-\d+.rs` tests to have meaningful names (part 10))

r? `@ghost`
`@rustbot` modify labels: rollup
fmease added a commit to fmease/rust that referenced this pull request Dec 9, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
fmease added a commit to fmease/rust that referenced this pull request Dec 9, 2024
…-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 10, 2024
Rollup of 10 pull requests

Successful merges:

 - rust-lang#131558 (Lint on combining `#[no_mangle]` and `#[export_name]`)
 - rust-lang#133122 (Add unpolished, experimental support for AFIDT (async fn in dyn trait))
 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133456 (Add licenses + Run `cargo update`)
 - rust-lang#133472 (Run TLS destructors for wasm32-wasip1-threads)
 - rust-lang#133853 (use vendor sources by default on dist tarballs)
 - rust-lang#133946 (coverage: Prefer to visit nodes whose predecessors have been visited)
 - rust-lang#134010 (fix ICE on type error in promoted)
 - rust-lang#134029 (coverage: Use a query to find counters/expressions that must be zero)
 - rust-lang#134071 (Configure renovatebot)

r? `@ghost`
`@rustbot` modify labels: rollup
bors added a commit to rust-lang-ci/rust that referenced this pull request Dec 10, 2024
Rollup of 10 pull requests

Successful merges:

 - rust-lang#131558 (Lint on combining `#[no_mangle]` and `#[export_name]`)
 - rust-lang#133184 (wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next)
 - rust-lang#133456 (Add licenses + Run `cargo update`)
 - rust-lang#133472 (Run TLS destructors for wasm32-wasip1-threads)
 - rust-lang#133853 (use vendor sources by default on dist tarballs)
 - rust-lang#133946 (coverage: Prefer to visit nodes whose predecessors have been visited)
 - rust-lang#134010 (fix ICE on type error in promoted)
 - rust-lang#134029 (coverage: Use a query to find counters/expressions that must be zero)
 - rust-lang#134071 (Configure renovatebot)
 - rust-lang#134102 (Miscellaneous fixes for nix-dev-shell)

r? `@ghost`
`@rustbot` modify labels: rollup
@bors bors merged commit 783362d into rust-lang:master Dec 10, 2024
6 checks passed
@rustbot rustbot added this to the 1.85.0 milestone Dec 10, 2024
rust-timer added a commit to rust-lang-ci/rust that referenced this pull request Dec 10, 2024
Rollup merge of rust-lang#133184 - osiewicz:wasm-fix-infinite-loop-in-remove-dir-all, r=Noratrieb

wasi/fs: Improve stopping condition for <ReadDir as Iterator>::next

When upgrading [Zed](zed-industries/zed#19349) to Rust 1.82 I've encountered a test failure in our test suite. Specifically, one of our extension tests started hanging. I've tracked it down to a call to std::fs::remove_dir_all not returning when an extension is compiled with Rust 1.82 Our extension system uses WASM components, thus I've looked at the diff between 1.81 and 1.82 with respect to WASI and found 736f773

As it turned out, calling remove_dir_all from extension returned io::ErrorKind::NotFound in 1.81; the underlying issue is that the ReadDir iterator never actually terminates iteration, however since it loops around, with 1.81 we'd come across an entry second time and fail to remove it, since it would've been removed previously. With 1.82 and 736f773 it is no longer the case, thus we're seeing the hang. The tests do pass when everything but the extensions is compiled with 1.82.

This commit makes ReadDir::next adhere to readdir contract, namely it will no longer call readdir once the returned # of bytes is smaller than the size of a passed-in buffer. Previously we'd only terminate the loop if readdir returned 0.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
O-wasi Operating system: Wasi, Webassembly System Interface S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants