Designate a queue for running the pipeline tasks #346
figures/tasks.py
Outdated
@@ -380,4 +380,7 @@ def run_figures_monthly_metrics():
     """
     logger.info('Starting figures.tasks.run_figures_monthly_metrics...')
     for site in get_sites():
-        populate_monthly_metrics_for_site.delay(site_id=site.id)
+        populate_monthly_metrics_for_site.apply_async(
+            kwargs={'site_id': 'edx.lms.core.high'},  # TODO: put in settings
Thanks @melvinsoft! Please keep in mind that Figures community compatibility needs to be maintained. I think this line breaks it.
Also, `site_id == edx.lms.core.high` seems incorrect.
@OmarIthawi That error is resolved. Yes, I know we'll need to think about the community as well. This is one of the default Tahoe workers, but it should still be controlled by settings. I just need some directions from John on how to load Figures settings in the tasks.py file.
I'll wait for John's approval and directions before merging, of course.
We need to support our enterprise servers too. Question @melvinsoft or @OmarIthawi: is `edx.lms.core.high` a queue that exists in upstream Open edX? I think it does, but wanted to check.
@johnbaldwin Yes, people can change it when they deploy, but those are the default ones: https://github.com/edx/configuration/blob/open-release/juniper.master/playbooks/roles/edxapp/defaults/main.yml#L653
Question, how can I include Figures settings in this file? I'd like to use the same settings I'm using in the lms_settings.py file.
@melvinsoft If I understand your question right, you are asking how to have `figures.tasks` read the value 'edx.lms.core.high' into the `apply_async` function?
If so, there isn't a "best practices" way now, but I'll give you a solution, and I'm interested in comments from @OmarIthawi. First, some history is important.
Way back in Figures' early days, Figures had values injected into `figures.settings`. This was the old way: https://github.com/appsembler/figures/blob/0.2.0/figures/settings.py
But @OmarIthawi didn't like it and I saw his point, so when he added Hawthorn plugin support, this value injection from `django.conf.settings.ENV_TOKENS['FIGURES']` into `figures.settings` went away. See here: #84
After this update, we didn't need Figures to even read from `ENV_TOKENS['FIGURES']` until Omar recently added site filtering on active sites. See these two:
- get_sites: configurable backend to filter active sites #321
- https://github.com/appsembler/figures/blob/master/figures/sites.py#L312
So that is where we stand today. Here's one option:
- Set a default value for `FIGURES_MONTHLY_METRICS_QUEUE` in `figures/settings/__init__.py` or `figures/settings/lms_production.py`. (I don't have a really strong opinion here. There is no option for NOT 'lms_production' module, so everything in figures settings is 'lms_production'.)
- In `figures.tasks`, do something similar to what Omar did in `figures.sites`. Here's one option:
from figures.settings.lms_production import FIGURES_MONTHLY_METRICS_QUEUE
Then, in the task function that calls an `apply_async` method with the queue option:
queue = settings.ENV_TOKENS['FIGURES'].get('FIGURES_MONTHLY_METRICS_QUEUE',
                                           FIGURES_MONTHLY_METRICS_QUEUE)
# apply_async takes the task's keyword arguments via kwargs=, not as direct keywords
populate_monthly_metrics_for_site.apply_async(kwargs={'site_id': site.id}, queue=queue)
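The lookup-with-default step suggested above can be tried out in isolation. The following is a minimal sketch, not the code that was merged: the helper name `get_monthly_metrics_queue` is hypothetical, and plain dictionaries stand in for `django.conf.settings.ENV_TOKENS` so the snippet runs without Django.

```python
# Assumed default; in the suggestion above this would live in
# figures/settings/lms_production.py.
FIGURES_MONTHLY_METRICS_QUEUE = 'edx.lms.core.high'


def get_monthly_metrics_queue(env_tokens, default=FIGURES_MONTHLY_METRICS_QUEUE):
    """Return the queue from ENV_TOKENS['FIGURES'], falling back to the default.

    Hypothetical helper: `env_tokens` is a stand-in for
    django.conf.settings.ENV_TOKENS.
    """
    # Tolerate a missing or empty 'FIGURES' section instead of raising KeyError.
    figures_tokens = env_tokens.get('FIGURES') or {}
    return figures_tokens.get('FIGURES_MONTHLY_METRICS_QUEUE', default)


# A deployment that overrides the queue, and one that does not configure it.
overridden = {'FIGURES': {'FIGURES_MONTHLY_METRICS_QUEUE': 'figures.pipeline'}}
not_configured = {}

print(get_monthly_metrics_queue(overridden))
print(get_monthly_metrics_queue(not_configured))
```

Tolerating a missing 'FIGURES' key is a design choice in this sketch; the snippet in the comment above indexes `ENV_TOKENS['FIGURES']` directly and relies on that key being present.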
@melvinsoft I found a couple of tests failing for the Ginkgo env and the other envs (which were cancelled in this PR's test executions). The failing tests look straightforward to fix; it should simply be two lines of code to change in the tests. For the settings test, there's an extra key in the expected keys:
The other test is failing because this PR changed the method used from
Force-pushed: 3747b4e to 8ed85ee
@johnbaldwin Thanks, I fixed the two tests you pointed me to, but I'm getting a new failure, and I'm a bit confused. If you could point me in the right direction, that would be great. I don't understand why it's failing.
Thanks in advance!
@melvinsoft This could be similar to the issue you fixed in the other task by fixing the monkeypatched path in the test, where the expected set of sites visited is empty.
Also, you'll need to merge the master branch into your PR branch. Since there's no file overlap, you should be able to rebase master into your branch instead of merging.
@johnbaldwin I'm sorry, but I'm a bit lost. Can you elaborate a bit more? The previous part was clearer to me: I changed a function, then I fixed the monkeypatched function. Here it is not so clear to me, since I'm not touching
Thanks in advance.
How long do the daily metrics take to run on a large install like Tahoe? I'm wondering whether Figures tasks may need their own dedicated custom queue. The edx.lms.core.high (high priority) queue is also used by bulk_email, so if an instructor wants to invite students or communicate with a course at the same time, I think those tasks would be waiting for a while. But of course that would involve some Ansible setup.
If it works, next step are going to be to create a figures pipeline dedicated queue, and ultimately, move that queue to a dedicated server.
☝️ Oh. I should have read more carefully.
figures/tasks.py
Outdated
@@ -380,4 +380,7 @@ def run_figures_monthly_metrics():
     """
     logger.info('Starting figures.tasks.run_figures_monthly_metrics...')
     for site in get_sites():
-        populate_monthly_metrics_for_site.delay(site_id=site.id)
+        populate_monthly_metrics_for_site.apply_async(
@melvinsoft Would you please add a comment explaining why this code explicitly sets the queue? Maybe also add a module-level docstring for future Figures developers who will rewrite the daily tasks. It would be good to warn them that they need to be explicit in setting the queue for any `apply_async` calls so they don't go to any old queue.
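One way to avoid depending on every `apply_async` call site remembering the queue is a Celery router. The following is a minimal sketch under stated assumptions: the class name `FiguresRouter`, the task-name prefix, and the default queue are illustrative, and the thread does not confirm this is the router the PR ended up with. It uses the class-based `route_for_task` interface, so the class itself needs no Celery import.

```python
class FiguresRouter(object):
    """Hypothetical router: send Figures pipeline tasks to one explicit queue."""

    def __init__(self, queue='edx.lms.core.high', prefix='figures.tasks.'):
        # Defaults are required because Celery instantiates router classes
        # without arguments. Both values are assumptions for this sketch.
        self.queue = queue
        self.prefix = prefix

    def route_for_task(self, task, args=None, kwargs=None):
        # `task` is the registered task name, e.g.
        # 'figures.tasks.populate_monthly_metrics_for_site'.
        if task.startswith(self.prefix):
            return {'queue': self.queue}
        # Returning None lets other routers or the default queue apply.
        return None


router = FiguresRouter()
print(router.route_for_task('figures.tasks.populate_monthly_metrics_for_site'))
print(router.route_for_task('lms.djangoapps.bulk_email.tasks.send_course_email'))
```

A router like this would be listed in the Celery routing setting (`CELERY_ROUTES` in older Celery, `task_routes` in newer releases). Check which name applies to the Celery version your Open edX release ships.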
@melvinsoft I did some initial investigation and there are a couple of issues I would not expect you to see. I had to set breakpoints and inspect test execution while running tox locally.
I'm happy to rebase master into your PR branch. So before this PR can proceed, we need to rebase master into it and then address why
Force-pushed: 57bc76b to 34f43cd
Force-pushed: 2bb4646 to a68545f
Fix syntax error and PEP8 add space hardcode worker fixup! hardcode worker fix test fix another test fix test funtino fixup! fix test funtino fix monkeypatch fixup! fix monkeypatch fixup! fix monkeypatch fix test again fix test again fix test again fix test again remove settings fix routing more fixes more fixes more fixes more fixes on task more fixes and new way of routing re add , remove unused param remove unused param 2 allow figures to route tasks to specific queues undo task changes undo settings changes undo settings changes fix settings name fix settings name second time test task more concrete definition of tasks fix method fix method try apply async test order use group fix fix fix settings try our router fix char fix char try our router try our router remove comments fix flake8 fix flake8 v2 remove unused args add settings test add default None
Force-pushed: a68545f to 0cfdcad
@johnbaldwin @OmarIthawi @estherjsuh @thraxil @bryanlandia Can I get another round of review from you? I've made a lot of changes to the approach here. Thanks in advance!
-    for site in get_sites():
-        populate_monthly_metrics_for_site.delay(site_id=site.id)
+    all_sites_jobs = group(populate_monthly_metrics_for_site.s(site.id) for site in get_sites())
+    all_sites_jobs.delay()
@melvinsoft I'm really interested to know whether these two lines will work. I tried grouping before in early Figures development, but the tasks seemed not to have been executed, or died silently.
@johnbaldwin We definitely need to try it out on staging, but I pushed the branch directly to staging, ran the Django management command, and was able to generate data.
I'm not sure how you tried it in the past, but if you look at the first line added, I'm creating a group of signatures. So it's Task -> Signature -> Group.
We have a design problem that I'm trying to solve with group: we have a Celery task, `run_figures_monthly_metrics`, that loops and triggers a new Celery task, `populate_monthly_metrics_for_site`, for each site. This is not recommended by Celery, and actually, after scratching my head for a couple of days, I found out that the tasks we trigger inside the tasks do not respect the routing. So I'd say let's try this approach.
This is not recommended by Celery, and actually, after scratching my head for a couple of days, I found out that the tasks we trigger inside the tasks do not respect the routing. So I'd say let's try this approach.
Interesting find! Let's try it and see if it works.
@melvinsoft I've looked over PR changes and didn't see anything jump out at me as an issue. Looks good! Please update the branch and merge. I look forward to seeing how it will work!
@@ -149,6 +150,13 @@ def root(*args):
     'FIGURES': {},  # This variable is patched by the Figures' `lms_production.py` settings module.
 }

+PRJ_SETTINGS = {
What does this variable mean?
Moving Figures pipeline execution away from the default worker is becoming more and more a priority. This is a first pass to try to run it on a specific queue.
If it works, the next steps are going to be to create a dedicated Figures pipeline queue and, ultimately, to move that queue to a dedicated server.
TODO: