Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GNU tests/tail-2/F-headers is failing often in the GNU CI #3765

Closed
sylvestre opened this issue Aug 1, 2022 · 5 comments · Fixed by #3798
Closed

GNU tests/tail-2/F-headers is failing often in the GNU CI #3765

sylvestre opened this issue Aug 1, 2022 · 5 comments · Fixed by #3798
Labels

Comments

@sylvestre
Copy link
Contributor

Error: GNU test failed: tests/tail-2/F-headers. tests/tail-2/F-headers is passing on 'main'. Maybe you have to rebase?

The error is:

2022-08-01T07:32:10.8281270Z FAIL: tests/tail-2/F-headers
2022-08-01T07:32:10.8281507Z ============================
2022-08-01T07:32:10.8281643Z 
2022-08-01T07:32:10.8281881Z tail: cannot open 'b' for reading: No such file or directory
2022-08-01T07:32:10.8282294Z ./tests/tail-2/F-headers.sh: a: unexpected delay?
2022-08-01T07:32:10.8282700Z tail: cannot open 'a' for reading: No such file or directory
2022-08-01T07:32:10.8283128Z tail: cannot open 'b' for reading: No such file or directory
2022-08-01T07:32:10.8283395Z ==> b <==
2022-08-01T07:32:10.8283713Z FAIL tests/tail-2/F-headers.sh (exit status: 1)
2022-08-01T07:32:10.8283900Z 

@niyaznigmatullin tried to fix it in #3737

@niyaznigmatullin
Copy link
Contributor

I think probably it's because inotify doesn't send an event sometimes. Probably, because virtual disk, or many tests running in parallel, I don't know how inotify works inside.

I think that's because, when I increased timeout from 12 seconds to 51 seconds it was still failing, so seems that event doesn't come at all.

Plus, in my fork I tried to run GNU tests workflow around 10 times (from main branch, with 12 seconds timeout), and it didn't fail even once. That's strange

@niyaznigmatullin
Copy link
Contributor

I also tried to somehow make notify crate fail to deliver the events, but it did well when creating 15000 files.

https://github.com/niyaznigmatullin/notify-stress-test/actions

Found that one in notify-rs issues saying that on some systems it doesn't deliver all the messages.

@niyaznigmatullin
Copy link
Contributor

I propose to try to increase timeout and see if it will still fail. PR #3737

@jhscheer
Copy link
Contributor

jhscheer commented Aug 5, 2022

I propose to try to increase timeout and see if it will still fail. PR #3737

The timeout is most probably not the problem. In my tests, the inotify event either triggered immediately or never.

I already spent a lot of time trying to fix this test for the CI. At this point the only thing I can think of is sprinkling some more sleep calls, e.g.: #3776

@niyaznigmatullin
Copy link
Contributor

Yes, it didn't help. As the increase before..

When you said about sleep, I remembered in https://github.com/coreutils/coreutils/blob/master/src/tail.c#L1574 there is a special case handling when the event happened after tail already tried to open the file and print its content and before inotify watcher has been set up.

And I added some sleep just before notifier

fn follow(files: &mut FileHandling, settings: &mut Settings) -> UResult<()> {
    thread::sleep(Duration::from_millis(4000));
    let mut process = platform::ProcessChecker::new(settings.pid);

and the F-headers.sh test is failing now every time I run it.

sylvestre added a commit that referenced this issue Aug 6, 2022
@sylvestre sylvestre moved this to Done in GNU Compatibility Aug 6, 2022
jhscheer added a commit to jhscheer/coreutils that referenced this issue Aug 8, 2022
There exists a race condition (RC) that can occur if changes to a path
happen after the initial print loop in `uu_tail()`, but before the
path is added to the notify-Watcher thread in `follow()`.

To minimize the window where the RC can occur, this moves starting the
Watcher thread and adding paths to it from `follow()` to the initial
print loop in `uu_tail()`.

Additionally, to make sure the RC cannot happen in
"gnu/tests/tail-2/F-headers.sh", the error message that is used as a trigger
in this test, is delayed until the path is added to the Watcher thread.
sylvestre pushed a commit that referenced this issue Aug 13, 2022
* tail: fix race condition (fix #3765)

There exists a race condition (RC) that can occur if changes to a path
happen after the initial print loop in `uu_tail()`, but before the
path is added to the notify-Watcher thread in `follow()`.

To minimize the window where the RC can occur, this moves starting the
Watcher thread and adding paths to it from `follow()` to the initial
print loop in `uu_tail()`.

Additionally, to make sure the RC cannot happen in
"gnu/tests/tail-2/F-headers.sh", the error message that is used as a trigger
in this test, is delayed until the path is added to the Watcher thread.

* build-gnu: remove workarounds for tail

Remove workarounds for "tests/tail-2/F-headers.sh" which are
(presumably) no longer needed because of the race condition fix.

* build-gnu: remove workarounds for tail

Remove workarounds for "tests/tail-2/F-headers.sh" which are
(presumably) no longer needed because of the race condition fix.

* tail: refactor to minimize chances of RC

Move "adding paths to Watcher thread" to its own loop and run this loop
before the initial tail-print-loop in order to minimize the window for
race conditions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
3 participants