Migrate sync_versions from an API call to a task #7548
This should stop the API from timing out, since we aren’t running all this logic in process. This is most important for projects with lots of versions, because this logic takes a long time to run and times out the web processes. It also eats up our API processes, causing other issues with our API because of blocking.

To test this, you can use the `pull` management command on a project you have checked out locally:

```
inv -e docker.manage 'pull time-test'
```
Looks good to me! I left one comment in particular about a potential edge case, and two about added `if` blocks that I think are not needed.
I need to fix up this test, but would love a quick review on the basic idea here.
I think this is a solid PR, but it needs a bunch of test refactoring. I'll jump on it soonish hopefully.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I have updated this PR.
I still need to update the tests and test it locally.
- Split into tags and branches data
- Update with latest changes from sync_versions
- Make it backwards compatible
466312e to 5fc0164
readthedocs/projects/tasks.py
Outdated
log.exception('Sync Versions Exception')
except Exception:
log.exception('Unknown Sync Versions Exception')
from readthedocs.builds import tasks as build_tasks
circular imports everywhere!
We could avoid it by moving `send_build_status` to this file as well.
:param tags_data: List of dictionaries with ``verbose_name`` and ``identifier``.
:param branches_data: Same as ``tags_data`` but for branches.
I think we should refactor these to be a list of named tuples. Celery should do fine serializing them, I think, but that's for another PR anyway...
This should probably be in a `TODO` in the code, then :)
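For whoever picks up the named-tuple idea later, here is a minimal sketch of what it might look like. `VersionData`, `from_dicts`, and `json_round_trip` are hypothetical names, not code from this PR, and the round trip uses the standard `json` module as a stand-in for a JSON-based task serializer. One thing worth checking before relying on it: JSON has no tuple type, so a named tuple comes out the other side as a plain list and the receiving side would need to rebuild it.

```python
import json
from collections import namedtuple

# Hypothetical type; the PR only describes dicts with these two keys.
VersionData = namedtuple("VersionData", ["verbose_name", "identifier"])


def from_dicts(items):
    """Build named tuples from the current list-of-dicts format."""
    return [VersionData(item["verbose_name"], item["identifier"]) for item in items]


def json_round_trip(value):
    """Simulate passing a task argument through a JSON serializer."""
    return json.loads(json.dumps(value))


tags_data = [
    {"verbose_name": "v1.0", "identifier": "abc123"},
    {"verbose_name": "v2.0", "identifier": "def456"},
]

tags = from_dicts(tags_data)
received = json_round_trip(tags)

# Each named tuple arrives as a plain list, so field names are lost
# until the receiver rebuilds the tuples.
restored = [VersionData(*fields) for fields in received]
```

If that rebuild step feels like too much ceremony, keeping the list of dicts (as this PR does) avoids it entirely.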
Looks like this is working fine locally. We could also put it under a feature flag if we want, since the API is still working.
Looks like Eric can't review this PR since he opened it :D
Mostly to avoid a circular import in #7548 (comment). Also, make projects/tasks.py more project-related and smaller :D
This looks like a solid change 👍
I think a feature flag would be a good idea, so we can deploy it without much risk. Having it be backwards compat is a really good idea :)
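A rough sketch of how the feature-flag idea could be wired, assuming the old API path stays in place as the default. The flag name and all three functions are hypothetical stand-ins, not names from this PR or from the Read the Docs codebase; the two `sync_via_*` functions just report which path was taken.

```python
# Hypothetical flag name, not taken from the PR.
SYNC_VERSIONS_USING_A_TASK = "sync_versions_using_a_task"


def sync_via_api(tags_data, branches_data):
    """Stand-in for the existing API call (the backwards-compatible path)."""
    return ("api", len(tags_data), len(branches_data))


def sync_via_task(tags_data, branches_data):
    """Stand-in for queueing the new task."""
    return ("task", len(tags_data), len(branches_data))


def sync_versions(project_features, tags_data, branches_data):
    """Use the task path only for projects that have the flag enabled."""
    if SYNC_VERSIONS_USING_A_TASK in project_features:
        return sync_via_task(tags_data, branches_data)
    return sync_via_api(tags_data, branches_data)
```

With something like this, the task path could be enabled for a few projects first, and turning the flag off would fall back to the API call without a deploy.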
Looks good to me!
readthedocs/projects/tasks.py
Outdated
identifier = getattr(self, 'commit', None) or self.version.identifier
version_repo.checkout(identifier)

def sync_versions_api(self, version_repo):
def sync_versions_task(self, version_repo):
Maybe we can call this `trigger_sync_versions` or `trigger_sync_versions_task`, to avoid having the same name for this method as for the real task.
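To illustrate the naming suggestion, here is a small sketch in which the method that only queues work is called `trigger_sync_versions_task`, while `sync_versions_task` is reserved for the function that does the work. The `SyncStep` class, the `queued` list (a stand-in for a real task queue), the `run_queued` helper, and the argument lists are all assumptions made for the example, not code from this PR.

```python
queued = []  # stand-in for a real task queue


def sync_versions_task(project_pk, tags_data, branches_data):
    """Stand-in for the real task that performs the sync."""
    return {
        "project": project_pk,
        "tags": len(tags_data),
        "branches": len(branches_data),
    }


class SyncStep:
    """Hypothetical caller that collects version data and hands it off."""

    def __init__(self, project_pk):
        self.project_pk = project_pk

    def trigger_sync_versions_task(self, tags_data, branches_data):
        """Only queue the task; the work itself lives in sync_versions_task."""
        args = (self.project_pk, tags_data, branches_data)
        queued.append((sync_versions_task, args))


def run_queued():
    """Run everything that was queued, as a worker process would."""
    return [func(*args) for func, args in queued]
```

Reading a call site, `step.trigger_sync_versions_task(...)` then makes it clear that the call returns before the sync has happened.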
Co-authored-by: Manuel Kaufmann <[email protected]>
….org into sync-repo-celery