131 look into setting time limit on background validation task #159
Conversation
src/sbl_filing_api/main.py
@repeat_every(seconds=settings.expired_submission_check_secs, wait_first=True, logger=log)
async def check_expired_submissions() -> None:
    await submission_repo.check_expired_submissions()
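For context, a decorator-based task like this is typically attached to a FastAPI startup hook. A minimal sketch of that wiring follows; the import paths for settings and submission_repo are assumptions for illustration, not necessarily what main.py actually does.

import logging

from fastapi import FastAPI
from fastapi_utilities import repeat_every

# Assumed import paths, for illustration only; the real module layout may differ.
from sbl_filing_api.config import settings
from sbl_filing_api.entities.repos import submission_repo

log = logging.getLogger(__name__)
app = FastAPI()


@app.on_event("startup")
@repeat_every(seconds=settings.expired_submission_check_secs, wait_first=True, logger=log)
async def check_expired_submissions() -> None:
    # Periodically sweep the database and expire submissions stuck in progress.
    await submission_repo.check_expired_submissions()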
I think for MVP this is fine, but in the grand scheme of things this should probably be part of another non-API service that deals with collector-type tasks.
Food for thought: another approach we might be able to take is a task timeout wrapper, something like

async with asyncio.timeout_at(asyncio.get_running_loop().time() + timeout_in_seconds):
    ...
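A minimal, self-contained sketch of that timeout-wrapper idea (Python 3.11+; validate_submission and the timeout value are placeholders, not names from this repo):

import asyncio


async def validate_submission() -> None:
    # Placeholder for the real background validation work.
    await asyncio.sleep(5)


async def run_with_deadline(timeout_in_seconds: float) -> None:
    try:
        # timeout_at takes an absolute deadline based on the running loop's clock.
        async with asyncio.timeout_at(asyncio.get_running_loop().time() + timeout_in_seconds):
            await validate_submission()
    except TimeoutError:
        # On expiry, this is where the submission could be flipped to VALIDATION_EXPIRED.
        print("validation timed out")


asyncio.run(run_with_deadline(2.0))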
Yeah, this is definitely purely an MVP implementation to handle the 'just in case' situations where an exception or scenario we didn't think about creeps up and causes the frontend to spin forever because the latest submission never gets out of the in-progress state.
I went with a single task polling the DB rather than a thread per submission that expires that particular one (assuming that's what you're alluding to there). I've done both in the past and had fewer headaches with a single periodic updater. I'm happy to discuss more before squashing and merging. But yes, this is a temporary solution to an issue that the more robust error handling recently implemented will probably take care of, but it gives the frontend warm fuzzies ;)
LGTM for MVP, should revisit post-MVP
I was thinking this would be something more at the task level that would monitor how long the process has been running. I do like this approach, though. My only question is: how is this going to behave if we have multiple instances of the container running at the same time? They'll each be polling the database at the same frequency, so chances are they'll frequently try to set it to VALIDATION_EXPIRED at the same time. Any potential issues with that?
I'm thinking back to a past life where we had many clients polling the same table (and sometimes updating it as well), and we'd get database deadlocks in certain scenarios until we restructured the code.
In the long run, if we want/need to keep a process like this around, we may want to run it as a whole separate process...but not now if we don't have to. 😄
stmt = select(SubmissionDAO).filter(SubmissionDAO.state.in_(check_states))
submissions = (await session.scalars(stmt)).all()
for s in submissions:
    if abs(s.submission_time.timestamp() - datetime.now().timestamp()) > settings.expired_submission_diff_secs:
Do we need the abs? Can we just flip what's subtracted from what?
Suggested change:

- if abs(s.submission_time.timestamp() - datetime.now().timestamp()) > settings.expired_submission_diff_secs:
+ if datetime.now().timestamp() - s.submission_time.timestamp() > settings.expired_submission_diff_secs:
Also, can we add a WARN-level log statement in here when this happens?
Yup to both.
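For illustration, the loop with both suggestions applied might look roughly like this. SubmissionDAO, check_states, and settings come from the diff above; the SubmissionState enum member, the id attribute, and the session handling are assumptions, not the repo's actual code.

import logging
from datetime import datetime

from sqlalchemy import select

log = logging.getLogger(__name__)


async def check_expired_submissions(session) -> None:
    stmt = select(SubmissionDAO).filter(SubmissionDAO.state.in_(check_states))
    submissions = (await session.scalars(stmt)).all()
    for s in submissions:
        # No abs() needed: submission_time is always in the past, so subtract it from now.
        if datetime.now().timestamp() - s.submission_time.timestamp() > settings.expired_submission_diff_secs:
            # WARN-level log so stuck submissions are visible in the logs.
            log.warning("Submission %s expired; setting state to VALIDATION_EXPIRED", s.id)
            s.state = SubmissionState.VALIDATION_EXPIRED
    await session.commit()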
Also, yeah, definitely, I wasn't considering multiple pods. What I've done in the past is a single StatefulSet pod (you can singleton 'em) that handles this sort of thing. It would just run the scheduled task and call an endpoint on the service (which could then be handled by any of the API pods).
There wasn't any sort of timeout I could find to set on the background task itself. However, talking with ye olde chat, we could use a mixture of the background task and a wait_for task with a timeout, and gather them. The wait task's timeout error could check, and then set, the sub state if it's not in a good validation state. The problem with that approach is it would only handle situations where the sub actually entered the validation flow. If something failed outside of that (during S3, for example) that we for some reason don't handle gracefully prior to the background task creation, we'd miss setting the expired state. Just a consideration when tying it to the background task.
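A rough sketch of that wait_for/gather idea; every name here is a placeholder, not code from this PR.

import asyncio


async def validation_task(done: asyncio.Event) -> None:
    # Placeholder for the real background validation work.
    await asyncio.sleep(10)
    done.set()


async def expiration_guard(done: asyncio.Event, timeout_secs: float) -> None:
    try:
        # Wait for validation to finish, but only up to the deadline.
        await asyncio.wait_for(done.wait(), timeout=timeout_secs)
    except asyncio.TimeoutError:
        # Deadline hit: re-check the submission state here and set
        # VALIDATION_EXPIRED if it is still stuck in progress.
        print("deadline reached; would mark submission as VALIDATION_EXPIRED")


async def main() -> None:
    done = asyncio.Event()
    # Run the validation and the guard side by side, gathering both.
    await asyncio.gather(validation_task(done), expiration_guard(done, timeout_secs=2.0))


asyncio.run(main())

As noted above, this only covers submissions that actually enter the validation flow, which is the limitation of tying the timeout to the background task itself.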
Still putting in the try/catch around S3 and all that; just wanted to get the start of my rewrite pushed.
src/.env.local
FS_DOWNLOAD_CONFIG__PROTOCOL="file"
EXPIRED_SUBMISSION_CHECK_SECS=60
this one isn't needed anymore, right?
Nope thank you, removing
pyproject.toml
@@ -23,6 +23,7 @@ async-lru = "^2.0.4"
fsspec = "^2024.2.0"
s3fs = "^2024.2.0"
httpx = "^0.26.0"
fastapi-utilities = "^0.2.0"
sorry, one more, 😂
and probably need to update the lock file as well
Curses! removed
LGTM
Closes #131