Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Intermittent false negatives (silently exiting without any output) #53

Closed
4 tasks done
karlhorky opened this issue Sep 29, 2024 · 18 comments
Closed
4 tasks done
Labels
💪 phase/solved Post is done

Comments

@karlhorky
Copy link
Contributor

karlhorky commented Sep 29, 2024

Initial checklist

Affected packages and versions

[email protected], [email protected], [email protected]

Link to runnable example

https://github.com/karlhorky/repro-remark-lint-no-dead-urls-intermittent-silent-failures

Steps to reproduce

  1. Run the linting script multiple times in a row with the .mdx content in https://github.com/karlhorky/repro-remark-lint-no-dead-urls-intermittent-silent-failures/blob/main/src/pages/a/c/y.mdx
  2. 💥 Observe that the run results are different, with some of the runs exiting with code 0 silently without any output, indicating some kind of silent crash, see test run (the 1st run exits without any output, the 2nd run fails as it should):
    Screenshot 2024-09-29 at 14 26 12

Originally investigated in #52

Runtime: Latest Node.js LTS (v20.17.0)

Expected behavior

remark-lint-no-dead-urls checks all URLs in all specified files and shows the results

Actual behavior

remark-lint-no-dead-urls exits with code 0 and no output in some runs

Runtime

Other (please specify in steps to reproduce)

Package manager

pnpm

OS

Linux

Build and bundle tools

No response

@github-actions github-actions bot added 👋 phase/new Post is being triaged automatically 🤞 phase/open Post is being triaged manually and removed 👋 phase/new Post is being triaged automatically labels Sep 29, 2024
@wooorm
Copy link
Member

wooorm commented Sep 30, 2024

Thanks, I can reproduce!

  • For the types:

    -/** @type {Omit<import('unified-engine').Options, 'processor'>} */
    +/** @type {Omit<import('unified-engine').Preset} */
  • I would always recommend making your reproductions as small as possible: there’s a lot of skipUrlPatterns, but I can reproduce the problem without them. And the code block isn’t it either. Or x.mdx, or the folder structure, or remark-mdx.

  • It does seem to the many URLs. Less URLs seems to make it appear less often

@wooorm
Copy link
Member

wooorm commented Sep 30, 2024

I checked which URLs failed, out of the 50. The 4 ones that never came back were to https://www.youtube.com/. There were in total 6 connections to https://www.youtube.com/. That seems to indicate that youtube does some DDoS prevention

@wooorm
Copy link
Member

wooorm commented Sep 30, 2024

@wooorm
Copy link
Member

wooorm commented Sep 30, 2024

I’m thinking there’s some race condition in play as well. Perhaps YouTube is, when it detects many requests, slow in sending the HTML body. The request went fine but the body is slow and then aborted. This may be an undici bug though, that it just aborts silently, instead of throwing. No clue.

This comment has been minimized.

@wooorm
Copy link
Member

wooorm commented Sep 30, 2024

released in [email protected]; npm update should do the trick.

@karlhorky
Copy link
Contributor Author

karlhorky commented Sep 30, 2024

Hmm, tried [email protected] now in my reproduction repo in a PR, and it still silently exited in the first step (the rest of the steps reported each a consistent 15 warnings):

This PR also adds || true to enable running the same command multiple times without the workflow run failing.

Maybe still intermittently failing for another reason?

I can reduce the reproduction to simplify if it would help.

Screenshot 2024-09-30 at 18 41 09

@ChristianMurphy
Copy link
Member

I can reduce the reproduction to simplify if it would help.

Please do, thanks!

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 1, 2024

I simplified the repro now:

Screenshot 2024-10-01 at 10 59 45

Failed the first time, but then after merging to the [email protected] branch, it's harder to reproduce (a few test runs now without the silently exiting behavior).

It's not [email protected] that fixed this though, because I reproduced the behavior with [email protected] above with the more complex repro.

So maybe:

  1. The extra complexity in the reproduction also caused another, unrelated situation that causes remark-lint-no-dead-urls to fail
  2. It's a behavior of another of the services (eg. maybe archive.is - I see a 429 response there in the screenshot above) that only happens the first time after some period of inactivity.

I'll try re-running after a few minutes and few hours. And also try adding back parts of the original, more complex reproduction, in case those were indeed relevant.

@karlhorky
Copy link
Contributor Author

Yeah, it seems like the 2nd file x.mdx was indeed a relevant part of it (or at least, it causes the problem to appear right away):

Interestingly, it is now very consistent in its failures:

Screenshot 2024-10-01 at 11 27 24

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 1, 2024

Hm... one other change that I did while copying the original reproduction back, was this change:

/** @type {import('unified-engine').Preset} */
const config = {
-  plugins: ['remark-mdx', 'remark-lint', 'remark-lint-no-dead-urls'],
+  plugins: [
+    'remark-mdx',
+    [
+      'remark-lint-no-dead-urls',
+      {
+        skipUrlPatterns: [
+          'chrome://',
+          'codesandbox-link://',
+          'embedded-html-codesandbox://',
+          'embedded-css-codesandbox://',

Maybe this omission of 'remark-lint' is the problem? I thought I tested [email protected] with 'remark-lint' in the plugins, but maybe not...

Edit: No, I guess not, still intermittently exiting silently with 'remark-lint' in plugins:

Screenshot 2024-10-01 at 11 37 06

@wooorm
Copy link
Member

wooorm commented Oct 1, 2024

I did not reproduce anything else. If you want me to spend time on this, can you please please make a tiny reproduction that I can work with? Removing everything unrelated?

@karlhorky
Copy link
Contributor Author

karlhorky commented Oct 1, 2024

The dead-or-alive-1.0.3 branch (PR 1 in my reproduction repo) is the smallest reproduction which I have been able to consistently reproduce the problem with.

To me, this seems pretty minimal - but maybe you have another opinion on what should still be simplified here.

It's:

  1. 1 remark config file
  2. 1 npm script for anything I cannot specify in the remark config
  3. 2 MDX files (as mentioned above - 2 files are necessary, it seems)
  4. CI setup to run the npm script

Screenshot 2024-10-01 at 12 10 34

@karlhorky
Copy link
Contributor Author

@wooorm I'm really not sure what I should change to make this more minimal, but happy to learn if you have some ideas.

Should this issue be reopened, since it's still failing in the same way? Or I could open this as a new issue if that's preferred.

wooorm added a commit that referenced this issue Oct 8, 2024
@wooorm
Copy link
Member

wooorm commented Oct 8, 2024

Here’s how I work through your code:

  • custom package manager is not needed, builtin npm repros too
  • TS is not used, drop unified-engine
  • mdx is not needed for your repro, rename a/b/y.mdx -> a/b/y.md, c/x.mdx -> c/x.md, remove remark-mdx dependency, remove --ext mdx flag, remove remark-mdx in your config file
  • config file is not needed, replace --rc-path remark-lint-no-dead-urls.mjs with --use remark-lint --use remark-lint-no-dead-urls; remove remark-lint-no-dead-urls.mjs
  • other flags aren’t needed: remove --frail --no-stdout; specific path is not needed: ./src/pages/ -> .
  • remark-lint itself isn’t needed to reproduce either: drop --use remark-lint, drop the dependency
  • complex file structure isn’t needed, src/pages/a/b/y.md -> y.md, src/pages/c/x.md -> x.md, remove src/
  • removing the first 3 links in x.md is fine; removing all is not fine; removing the first 4 is fine.
    So, leaving it at solely:
    - [See How Fast ARPANET Spread in Just Eight Years](https://web.archive.org/web/20210506160300/https://www.smithsonianmag.com/smart-news/see-how-fast-arpanet-spread-in-just-eight-years-2341268/)
    
    (note the archive.org, perhaps relevant?)
  • removing all the URLs from y.md that aren‘t archive.org/archive.ph: it works, so that’s not it.
  • removing the first 10 instead? still repros. Next 10? Still repros. Next 10? Works again. Revert. 5? Repros. 2 more? Repros. 1 more? Repros. 1 more? Works again, revert.
    Here it gets vague and flakey. At about 10 URLs total; split over 2 files, repros.
  • alright, so we narrowed it down a lot; we know that many connections at once repros things. Now on to the main hunch for a solution: Missing files in remark-cli macOS pattern resolution #52 (comment); make the connections concurrent; adding p-all with {concurrency: 3} to in node_modules and suddenly it’s solved; reverting those markdown files so they have their 5 and like 55 URLs again, that works too. So, that’s it. concurrency: 3 and concurrency: 4 are fine. concurrency: 5 is not fine! We know the problem earlier was about 10 URLs over 2 files. concurrency: 5 for 2 files is oddly familiar. So, seems we need to pool total connections. See above commit

@karlhorky
Copy link
Contributor Author

Wow nice, that's a great simplification, thanks for the deep dive!

I think I'm not versed enough on how to use remark-cli flags, and was not aware that remark-lint could be dropped, I thought it was a dependency of remark-lint-no-dead-urls 👀

Next time that I do a repro in any of the remark / unified things, I'll be sure to try out simpler combinations of packages using these steps.

@karlhorky
Copy link
Contributor Author

I can confirm that the silent exiting behavior is no longer reproducible in my first 5 test runs, after I upgraded to [email protected] in my reproduction repo:

Screenshot 2024-10-08 at 18 06 38

@karlhorky
Copy link
Contributor Author

Upgrading to [email protected] in the monorepo with 147 MDX files, I noticed timeouts with v2:

  • [email protected] test runs for all 147 MDX files
    • ~1-4 minutes (multiple test runs over weeks)
  • [email protected] test runs for all 147 MDX files
    • timed out at 26, 48, 48, 48, 48 minutes (first 5 test runs just now, timeout caused by GitHub Actions workflow timeout-minutes config)

I'll see if this continues, and if it does, I'll try to find time to create a minimal repro.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
💪 phase/solved Post is done
Development

No branches or pull requests

3 participants