Prefect nightly task regularly times out waiting for dashboard to say it's reloading #1261

Closed
1 task
nichhk opened this issue Jun 27, 2022 · 5 comments
Labels: P-feature: Reports · Role: Backend (Related to API or other server-side work) · Size: 5pt (Can be done in 19-30 hours) · Tech Stack: Old

Comments

@nichhk
Member

nichhk commented Jun 27, 2022

Overview

See prior context here. This isn't high priority now that we are aware of this bug, but we'd like to have confidence in our error reporting (i.e., if our logs indicate an error, it's actually an error).

Here is where the dashboard handles a reload request.

Here is where the nightly task issues the reload request and waits for the dashboard to indicate that it's reloading.

Action Items

  • Try increasing the timeout to 2min

nichhk added the Size: 5pt (Can be done in 19-30 hours), P-feature: Reports, and Role: Backend (Related to API or other server-side work) labels on Jul 19, 2022
nichhk added this to the v2.1 Launch milestone on Jul 19, 2022
nichhk self-assigned this on Aug 2, 2022

@nichhk
Member Author

nichhk commented Aug 7, 2022

I took a closer look at this. When you manually tell the dashboard server to reload (i.e., visit dashboard_url/reload), it happens almost instantaneously, and we get the result that we're expecting.

So I'm less confident that extending the timeout will solve our issue here. There is probably a bug in the way that we verify the reload's success. Currently, Prefect opens a browser, navigates to dashboard_url/reload, and then waits for an HTML element with a specific ID to show up on the page. This seems quite overengineered. Previous context for this is in #935.

I think making a simple web request will be less complicated and work more reliably since we won't need to rely on a browser automation library, so I'll try that. This is what we do to reload the cache on the server.
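
A rough sketch of what the simple-request idea could look like, using only the standard library. The marker text, the default timeout, and the function name are assumptions for illustration; the issue does not say what the reload page actually returns, and this is not the project's existing code.

```python
import urllib.request

# Hypothetical marker: the text the reload page is assumed to contain.
RELOAD_MARKER = "reloading"


def request_reload(dashboard_url, marker=RELOAD_MARKER, timeout_s=30.0,
                   opener=urllib.request.urlopen):
    """Request <dashboard_url>/reload and report whether the marker came back.

    `opener` is injectable so the check can be exercised without a live server.
    """
    url = dashboard_url.rstrip("/") + "/reload"
    with opener(url, timeout=timeout_s) as response:
        status = response.status
        body = response.read().decode("utf-8", errors="replace")
    # Treat the reload as acknowledged only on HTTP 200 plus the marker text.
    return status == 200 and marker in body
```

A plain request like this only sees the initial HTML response, so it would miss any marker that the page renders later with JavaScript.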

@nichhk
Member Author

nichhk commented Aug 7, 2022

Actually, according to #1028, we need to run a browser. There is very little detail in that issue, but apparently they first tried the approach from my previous comment and it didn't work. So I will continue investigating the existing solution.

@nichhk
Member Author

nichhk commented Oct 24, 2022

It doesn't seem to be timing out anymore, probably thanks to #1379. But Prefect still says the reload is failing, probably because it can't find the reloading message in the page content. I checked the Lightsail logs, and the report server is indeed being reloaded, so I'm going to add a print statement to the Prefect task to see what content it's getting from the reload page.
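
For the debugging step, something along these lines could keep the output readable in the task logs. This is a minimal sketch: the helper name, the marker argument, and the 500-character cap are all hypothetical, not taken from the existing task.

```python
def summarize_reload_page(content, marker, max_chars=500):
    """Build a one-line diagnostic for the content fetched from the reload page."""
    found = marker in content
    snippet = content[:max_chars]
    # Flag truncation so a cut-off page is not mistaken for the full response.
    suffix = "..." if len(content) > max_chars else ""
    return (
        f"marker {marker!r} found={found}; "
        f"page content ({len(content)} chars): {snippet}{suffix}"
    )
```

Printing `summarize_reload_page(page_content, marker)` from the task would show whether the marker is missing entirely or the page is simply not what we expect.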

@nichhk
Member Author

nichhk commented Oct 25, 2022

It timed out again during last night's run. We might need to add a call to waitForNavigation? https://stackoverflow.com/a/58298172
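
Independently of which browser call we end up using, the wait itself can be sketched as polling with an explicit deadline. This is only an illustration: `get_content` is a hypothetical coroutine standing in for whatever returns the current page HTML (for example, a wrapper around the browser library's page-content call), and the 120 s default mirrors the 2 min timeout proposed in the action items.

```python
import asyncio


async def wait_for_marker(get_content, marker, timeout_s=120.0, interval_s=1.0):
    """Poll get_content() until `marker` appears, or raise asyncio.TimeoutError."""

    async def _poll():
        while True:
            content = await get_content()
            if marker in content:
                return content
            # Yield to the event loop between checks instead of busy-waiting.
            await asyncio.sleep(interval_s)

    # wait_for cancels the polling coroutine once the deadline passes.
    return await asyncio.wait_for(_poll(), timeout=timeout_s)
```

Polling the content repeatedly, rather than reading it once right after navigation, avoids depending on exactly when the page finishes updating.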

EchoProject removed this from the v2.1 Launch milestone on Dec 8, 2022
@mc759
Member

mc759 commented Dec 13, 2022

Hey @edwinjue, I just added you as a replacement for Nich! Can you help us update this issue?

Please update:

  • Progress:
  • Blockers:
  • Availability:
  • ETA:

Thanks!

Projects
Status: Done (without merge)
5 participants