Have discussion to consider various technical approaches for celery-worker memory management #4638

Open
4 tasks
jason-upchurch opened this issue Sep 25, 2020 · 0 comments

Comments

@jason-upchurch
Contributor

Summary

Related issue: #4592 revealed that there was not enough celery-worker RAM for memory-intensive loading/refresh.

That MUR (https://dev.fec.gov/data/legal/matter-under-review/7594/) is now loaded, after upping the memory to 4G.

The intensive query was:

SELECT 
  doc.document_id, 
  mur.case_no, 
  mur.case_type, 
  doc.filename, 
  doc.category, 
  doc.description, 
  doc.ocrtext, 
  doc.fileimage, 
  length(fileimage) as length, 
  doc.doc_order_id, 
  doc.document_date 
FROM fecmur.document doc 
INNER JOIN fecmur.cases_with_parsed_case_serial_numbers_vw mur 
ON mur.case_id = doc.case_id 
WHERE doc.case_id = '60000002999600'
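
One more approach that may be worth discussing: the query above returns ocrtext and fileimage for every document in the case, so reading the whole result at once holds all of it in memory. A minimal sketch of reading rows in batches instead, assuming the loader can work one document at a time (not verified against load_current_murs). `iter_rows` is a hypothetical helper, and the SQLite table is only a stand-in for fecmur.document so the example runs on its own. Note that with PostgreSQL drivers such as psycopg2, a default cursor still buffers the full result on the client, so this only reduces memory when combined with a server-side (named) cursor.

```python
import sqlite3


def iter_rows(cursor, batch_size=50):
    """Yield rows from an already-executed DB-API cursor, one batch at a time."""
    while True:
        rows = cursor.fetchmany(batch_size)
        if not rows:
            break
        yield from rows


# Stand-in for fecmur.document: an in-memory SQLite table (assumption, demo only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (document_id INTEGER, case_id TEXT, fileimage BLOB)"
)
conn.executemany(
    "INSERT INTO document VALUES (?, ?, ?)",
    [(i, "60000002999600", b"x" * 1024) for i in range(120)],
)

cursor = conn.execute(
    "SELECT document_id, fileimage FROM document WHERE case_id = ?",
    ("60000002999600",),
)

# Only one batch of rows is held by this loop at any time.
total_bytes = sum(len(fileimage) for _doc_id, fileimage in iter_rows(cursor))
```

Whether this helps in practice depends on how the loader consumes the rows; if it builds one large document list per case before indexing, that step would also need to change.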

The original successful manual uploads of the missing MUR to PROD and DEV used 4G:

cf run-task api "python manage.py load_current_murs -s 7594" -m 4G --name load-MUR-7594

Technical considerations:

  • We should consider upping our celery RAM to avoid this problem in the future
  • Do instances support autoscaling? (Not sure if autoscaling could/would be executed at query time.)
  • Do we instead just want to know after a large MUR or other case has been loaded so that we can manually monitor for the WorkerLostError?
  • This also seems like a decent job for a process supervisor (see "Research adding a supervisor to celery" #4573), which could monitor jobs and execute corrective action.
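
For the "know after a large case has been loaded" option, a rough sketch of logging a warning when a load pushes the process close to its memory limit. Everything here is an assumption for discussion: `peak_rss_mb` and `warn_if_memory_hungry` are hypothetical names, and the threshold is arbitrary. It uses only the standard library `resource` module (Unix only). Two caveats: `ru_maxrss` is the peak for the lifetime of the process, not for a single call, and if the worker is killed for exceeding its limit the warning never fires, so this only surfaces near misses.

```python
import functools
import logging
import resource
import sys

logger = logging.getLogger(__name__)


def peak_rss_mb():
    """Peak resident set size of the current process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in kilobytes on Linux and in bytes on macOS.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def warn_if_memory_hungry(threshold_mb):
    """Decorator: after the wrapped call returns, warn if peak RSS exceeds threshold_mb."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            result = func(*args, **kwargs)
            peak = peak_rss_mb()
            if peak > threshold_mb:
                logger.warning(
                    "%s: process peak RSS is %.0f MiB (threshold %d MiB)",
                    func.__name__, peak, threshold_mb,
                )
            return result
        return wrapper
    return decorator


# Hypothetical usage: wrap whatever function performs the load.
@warn_if_memory_hungry(threshold_mb=768)
def load_case(case_no):
    return f"loaded {case_no}"
```

Something like this would not replace a supervisor or more RAM, but it could tell us which cases are getting close before a WorkerLostError happens.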