-
Notifications
You must be signed in to change notification settings - Fork 3.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ned/studentmodulehistory cleaner command #411
Conversation
BTW: the failing tests on Jenkins seem unconnected on the face of it, but the failures are persisting after rebasing, so there could be some kind of quantum entanglement at work. Investigating... |
smhc = StudentModuleHistoryCleaner( | ||
dry_run=options["dry_run"], | ||
) | ||
smhc.main() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought you were planning to expose the batch size and a wait time between batches at the command level. Was that not necessary given how much load this puts on the db?
(Paranoia Hat) I'm assuming that you're going to be piping output to a monster output log file -- are the ops folks giving you a warm home with lots of disk space to run this? |
And at the risk of scope creep, would it be useful to capture just how much collapsing we're doing (to find what things were doing major writes)? Or is that all moot now anyways, given the XBlock batching changes? Edit: I guess you already have this -- I was just thinking about a CSV type of thing. |
(2, "2013-07-15 15:04:11.000", 23), | ||
(3, "2013-07-15 15:04:01.000", 24), | ||
(4, "2013-07-15 15:04:00.000", 25), | ||
(5, "2013-07-15 15:04:00.000", 26), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, this senario (where significantly later times have earlier StudentModuleHistory IDs) shouldn't ever happen in the DB.
Could you please add a couple of test cases for long runs with the same timestamp, as well as the (very common) pair of entries with the same timestamp? Those should be really common in the actual data. |
state = { | ||
'next_student_module_id': self.next_student_module_id, | ||
} | ||
with open(self.STATE_FILE, "w") as state_file: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the fact that we're opening the state file every time a problem at all on when the disk is backed by EBS? I don't think it necessarily needs changing, but I am curious if it becomes a factor when you're doing the speed test for this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We're only saving state once per batch, not once per id, so this should be fine.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, but with these settings, we'd still have over half a million batches. If an open takes 10ms, that's an hour and a half. Probably not worth worrying about at this point -- I'm mostly just curious where this script will actually spend its time.
That's all I can think of for now. |
…entmodulehistory records.
…h it slows them down.
👍 |
…mand Ned/studentmodulehistory cleaner command
Xqueue callback acquires lock on StudentModule to avoid race condition
Merge changes to release back to master
…lete-course-with-assets Add purge flag in delete_course openedx#411
…g-image-width Fix Zooming Image size in Studio
* FIX: adding back context usage * FIX: adding back context usage * UPD: updating requirements for recap exblock * update generic message on timed exams * Proversity/add microsite delete endpoint (openedx#402) * add delete endpoint, and tests * make sure recap instructor dash only works if there is ONE recap in a course * add recover password endpoint
* FIX: adding back context usage * FIX: adding back context usage * UPD: updating requirements for recap exblock * update generic message on timed exams * Proversity/add microsite delete endpoint (#402) * add delete endpoint, and tests * make sure recap instructor dash only works if there is ONE recap in a course * add recover password endpoint
The code seems good on my machine, have to test on large db. There's nothing here to sleep, etc, yet, until we determine it could help.
@cpennington @ormsbee tag you're it.