Fix e2e flake by not consuming abort signal in `apiq` query options helper #2614
Merged
david-crespo added a commit to oxidecomputer/omicron that referenced this pull request on Dec 13, 2024
oxidecomputer/console@927c8b6...2b0c3f1 * [2b0c3f12](oxidecomputer/console@2b0c3f12) oxidecomputer/console#2621 * [c1fbf631](oxidecomputer/console@c1fbf631) oxidecomputer/console#2622 * [1dd25f63](oxidecomputer/console@1dd25f63) oxidecomputer/console#2615 * [11486f8b](oxidecomputer/console@11486f8b) oxidecomputer/console#2620 * [ada302ce](oxidecomputer/console@ada302ce) oxidecomputer/console#2488 * [bc3161ae](oxidecomputer/console@bc3161ae) minor: remove stub e2e test * [9e1d53c6](oxidecomputer/console@9e1d53c6) sentence case idp form heading * [aaf1154f](oxidecomputer/console@aaf1154f) add extra assert for instance create with additional disks test flake * [79d610dc](oxidecomputer/console@79d610dc) oxidecomputer/console#2618 * [bdbc02b7](oxidecomputer/console@bdbc02b7) oxidecomputer/console#2589 * [7a8ee0ab](oxidecomputer/console@7a8ee0ab) oxidecomputer/console#2614 * [0b5220a1](oxidecomputer/console@0b5220a1) npm audit fix * [0c873cf4](oxidecomputer/console@0c873cf4) oxidecomputer/console#2610 * [d031c8ff](oxidecomputer/console@d031c8ff) bump playwright for navigation hang fix * [dbd8545e](oxidecomputer/console@dbd8545e) oxidecomputer/console#2609 * [dc5562fe](oxidecomputer/console@dc5562fe) move error log to avoid failing to log certain errors
david-crespo added a commit to oxidecomputer/omicron that referenced this pull request on Dec 13, 2024
oxidecomputer/console@927c8b6...c1ebd8d * [c1ebd8d9](oxidecomputer/console@c1ebd8d9) take inventory switches tab back out, DB is empty * [2a139028](oxidecomputer/console@2a139028) tone down empty switches table message * [2b0c3f12](oxidecomputer/console@2b0c3f12) oxidecomputer/console#2621 * [c1fbf631](oxidecomputer/console@c1fbf631) oxidecomputer/console#2622 * [1dd25f63](oxidecomputer/console@1dd25f63) oxidecomputer/console#2615 * [11486f8b](oxidecomputer/console@11486f8b) oxidecomputer/console#2620 * [ada302ce](oxidecomputer/console@ada302ce) oxidecomputer/console#2488 * [bc3161ae](oxidecomputer/console@bc3161ae) minor: remove stub e2e test * [9e1d53c6](oxidecomputer/console@9e1d53c6) sentence case idp form heading * [aaf1154f](oxidecomputer/console@aaf1154f) add extra assert for instance create with additional disks test flake * [79d610dc](oxidecomputer/console@79d610dc) oxidecomputer/console#2618 * [bdbc02b7](oxidecomputer/console@bdbc02b7) oxidecomputer/console#2589 * [7a8ee0ab](oxidecomputer/console@7a8ee0ab) oxidecomputer/console#2614 * [0b5220a1](oxidecomputer/console@0b5220a1) npm audit fix * [0c873cf4](oxidecomputer/console@0c873cf4) oxidecomputer/console#2610 * [d031c8ff](oxidecomputer/console@d031c8ff) bump playwright for navigation hang fix * [dbd8545e](oxidecomputer/console@dbd8545e) oxidecomputer/console#2609 * [dc5562fe](oxidecomputer/console@dc5562fe) move error log to avoid failing to log certain errors
david-crespo added a commit to oxidecomputer/omicron that referenced this pull request on Dec 14, 2024
oxidecomputer/console@927c8b6...c1ebd8d * [c1ebd8d9](oxidecomputer/console@c1ebd8d9) take inventory switches tab back out, DB is empty * [2a139028](oxidecomputer/console@2a139028) tone down empty switches table message * [2b0c3f12](oxidecomputer/console@2b0c3f12) oxidecomputer/console#2621 * [c1fbf631](oxidecomputer/console@c1fbf631) oxidecomputer/console#2622 * [1dd25f63](oxidecomputer/console@1dd25f63) oxidecomputer/console#2615 * [11486f8b](oxidecomputer/console@11486f8b) oxidecomputer/console#2620 * [ada302ce](oxidecomputer/console@ada302ce) oxidecomputer/console#2488 * [bc3161ae](oxidecomputer/console@bc3161ae) minor: remove stub e2e test * [9e1d53c6](oxidecomputer/console@9e1d53c6) sentence case idp form heading * [aaf1154f](oxidecomputer/console@aaf1154f) add extra assert for instance create with additional disks test flake * [79d610dc](oxidecomputer/console@79d610dc) oxidecomputer/console#2618 * [bdbc02b7](oxidecomputer/console@bdbc02b7) oxidecomputer/console#2589 * [7a8ee0ab](oxidecomputer/console@7a8ee0ab) oxidecomputer/console#2614 * [0b5220a1](oxidecomputer/console@0b5220a1) npm audit fix * [0c873cf4](oxidecomputer/console@0c873cf4) oxidecomputer/console#2610 * [d031c8ff](oxidecomputer/console@d031c8ff) bump playwright for navigation hang fix * [dbd8545e](oxidecomputer/console@dbd8545e) oxidecomputer/console#2609 * [dc5562fe](oxidecomputer/console@dc5562fe) move error log to avoid failing to log certain errors
Short version
Passing React Query's abort signal into our API `fetch` calls means the calls get aborted when queries get canceled due to being unmounted. One inconvenient time for this to happen is when React Strict Mode does its thing in dev mode, causing an unmount and remount in the middle of a prefetch in a loader. The prefetch itself doesn't blow up because `prefetchQuery` eats errors, but components that expect prefetched data do blow up because the prefetch failed to populate the query cache. There isn't much advantage to canceling queries, so we can fix this by just not passing the signal through.
Also read this: https://tanstack.com/query/v5/docs/framework/react/guides/query-cancellation
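To make the mechanism concrete, here is a minimal sketch that does not use React Query at all. `fakeApiCall`, `queryFnWithSignal`, `queryFnWithoutSignal`, and `runAndCancel` are hypothetical names for illustration, and the 50 ms and 10 ms delays are arbitrary. It only models the fetch side: a request can be aborted by a cancellation only if the query function forwards the signal it is handed.

```typescript
// Hypothetical stand-in for an API call: resolves after `ms` unless the
// optional signal aborts first, in which case it rejects.
function fakeApiCall(ms: number, signal?: AbortSignal): Promise<string> {
  return new Promise((resolve, reject) => {
    if (signal?.aborted) {
      reject(new Error('aborted'))
      return
    }
    const timer = setTimeout(() => resolve('firewall rules'), ms)
    signal?.addEventListener('abort', () => {
      clearTimeout(timer)
      reject(new Error('aborted'))
    })
  })
}

// Simplified shape of what a query function receives; only `signal` matters here.
type QueryFnContext = { signal: AbortSignal }

// Forwards the signal: cancelling the query aborts the request.
const queryFnWithSignal = ({ signal }: QueryFnContext) => fakeApiCall(50, signal)

// Ignores the signal: the request runs to completion even if cancelled.
const queryFnWithoutSignal = (_ctx: QueryFnContext) => fakeApiCall(50)

// Simulate a cancellation (for example an unmount) shortly after the request starts.
async function runAndCancel(
  queryFn: (ctx: QueryFnContext) => Promise<string>
): Promise<string> {
  const controller = new AbortController()
  const result = queryFn({ signal: controller.signal })
  setTimeout(() => controller.abort(), 10)
  return result.catch((e: Error) => `failed: ${e.message}`)
}
```

With `queryFnWithSignal` the call fails as soon as the controller aborts; with `queryFnWithoutSignal` the same cancellation has no effect on the request and the data still arrives.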
Medium version
With #2597 we started using an options helper with `prefetchQuery` that passes the abort `signal` from React Query through into our API calls, which means they get aborted when RQ cancels a query. The existing `prefetchQuery` did not pass through the `signal`.
React's strict mode does everything twice (in development mode only) in order to surface things that should be cleaned up and redone between renders. When it unmounts and remounts the firewall rules query, RQ cancels the fetch, which was causing our API call to abort, which means the data wasn't in the query cache when `usePrefetchedQuery` expects it to be. I assume it only happens some of the time because of a race, though I don't understand yet what is racing. It's not a race with the request completing, because lowering the mock API's random latency from 200-400ms to 50-100ms does not seem to affect the rate of failure.
In any case, after reading the React Query doc on Query Cancellation, I don't think we actually ever need to pass through the abort signal, whether for prefetches or regular queries. There's very little advantage to canceling a request that's already in flight unless you're trying to avoid downloading some big response. Mutations might make more sense to cancel. We take advantage of this during image upload, for example, where we have 6 requests in flight. It makes sense to cancel outstanding ones rather than letting them complete. Mutations are unaffected by this `apiq` helper; for now we are still using the old `useApiMutation` for all mutations.
The story of my pain
Beginning with #2597 (though I didn't notice for a few days) we started having a test flake in the create VPC e2e test (example failure). It was reproducible locally in multiple browsers. It failed about half the time with a prefetch error after the transition to VPC detail for a newly created VPC (line 40).
console/test/e2e/networking.e2e.ts, lines 33 to 40 in 0b5220a
The error showed up as a failure of the firewall rules prefetch to seed the cache. The query was there after the prefetch call but stuck in `pending`.
It turned out this one line from #2597 was enough to cause the problem.
See if you can spot the difference between the two implementations involved:
`apiq`
console/app/api/hooks.ts, line 111 in 0b5220a
`apiQueryClient.prefetchQuery`
console/app/api/hooks.ts, line 317 in 0b5220a
CancelledError
The cause was hard to figure out with `prefetchQuery` because that function eats all errors with a `.catch(noop)`. To see what was actually going wrong, I changed it to a `fetchQuery`, which does throw errors if they happen. That showed me this was a `CancelledError` coming out of RQ cleaning up a subscription due to `commitDoubleInvokeEffectsInDEV`, which is a strict mode thing.
`CancelledError` stack trace
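The reason swapping `prefetchQuery` for `fetchQuery` exposed the failure can be modeled in a few lines. In this sketch, `MiniCache` and `cancelledQueryFn` are hypothetical stand-ins written for illustration, not React Query's implementation; the only behavior borrowed from the description above is that the prefetch variant swallows errors with `.catch(noop)`.

```typescript
const noop = () => {}

// Hypothetical, heavily simplified query cache.
class MiniCache {
  private data = new Map<string, unknown>()

  // Rejects if the query function rejects; stores the result on success.
  async fetchQuery<T>(key: string, queryFn: () => Promise<T>): Promise<T> {
    const result = await queryFn()
    this.data.set(key, result)
    return result
  }

  // Same work, but errors are swallowed, so callers never see a failure.
  prefetchQuery<T>(key: string, queryFn: () => Promise<T>): Promise<void> {
    return this.fetchQuery(key, queryFn).then(noop).catch(noop)
  }

  has(key: string): boolean {
    return this.data.has(key)
  }
}

// Assumed for illustration: a query function whose request was cancelled.
const cancelledQueryFn = (): Promise<string[]> =>
  Promise.reject(new Error('cancelled'))

async function demo(): Promise<string[]> {
  const cache = new MiniCache()
  const log: string[] = []

  // The prefetch resolves quietly even though nothing was cached.
  await cache.prefetchQuery('fw-rules', cancelledQueryFn)
  log.push(`prefetch resolved, cached: ${cache.has('fw-rules')}`)

  // The fetch surfaces the same failure as a rejection.
  try {
    await cache.fetchQuery('fw-rules', cancelledQueryFn)
  } catch (e) {
    log.push(`fetch threw: ${(e as Error).message}`)
  }
  return log
}
```

In this model the prefetch call gives no indication that anything went wrong, and the only visible symptom is the missing cache entry, which matches how the flake first showed up.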