Ned/studentmodulehistory cleaner command #411

Merged: 6 commits, Jul 19, 2013

nedbat commented Jul 16, 2013

The code seems good on my machine, but I still have to test it on a large db. There's nothing in here yet to sleep between batches, etc.; that can be added once we determine it could help.

@cpennington @ormsbee tag you're it.

nedbat commented Jul 16, 2013

BTW: the failing tests on Jenkins seem unconnected on the face of it, but the failures are persisting after rebasing, so there could be some kind of quantum entanglement at work. Investigating...

smhc = StudentModuleHistoryCleaner(
    dry_run=options["dry_run"],
)
smhc.main()

I thought you were planning to expose the batch size and a wait time between batches at the command level. Was that not necessary given how much load this puts on the db?
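
For illustration only, here is a minimal sketch of what exposing those two knobs could look like. It uses plain `argparse` rather than the Django command machinery, and the option names (`--batch-size`, `--sleep`), their defaults, and the `run_batches` helper are all hypothetical, not taken from this PR.

```python
import argparse
import time


def build_parser():
    """Build a parser with hypothetical batch-size and sleep options."""
    parser = argparse.ArgumentParser(description="Clean history rows in batches.")
    parser.add_argument("--dry-run", action="store_true",
                        help="Report what would be deleted without deleting.")
    parser.add_argument("--batch-size", type=int, default=100,
                        help="Number of ids handled per batch (assumed default).")
    parser.add_argument("--sleep", type=float, default=0.0,
                        help="Seconds to wait between batches (assumed default).")
    return parser


def run_batches(ids, batch_size, sleep_seconds, process_batch, sleep=time.sleep):
    """Call process_batch on consecutive slices of ids, pausing after each one.

    Returns the number of batches processed. `sleep` is injectable so the
    pause can be stubbed out in tests.
    """
    batches = 0
    for start in range(0, len(ids), batch_size):
        process_batch(ids[start:start + batch_size])
        batches += 1
        if sleep_seconds > 0:
            sleep(sleep_seconds)
    return batches
```

With something like this, the db load could be tuned from the command line without a code change, which is the point of the question above.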

ormsbee commented Jul 16, 2013

(Paranoia Hat) I'm assuming that you're going to be piping output to a monster output log file -- are the ops folks giving you a warm home with lots of disk space to run this?

ormsbee commented Jul 16, 2013

And at the risk of scope creep, would it be useful to capture just how much collapsing we're doing (to find what things were doing major writes)? Or is that all moot now anyways, given the XBlock batching changes?

Edit: I guess you already have this -- I was just thinking about a CSV type of thing.
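
To make the "CSV type of thing" concrete, here is a rough sketch. The column names, the shape of the `stats` mapping, and the `write_collapse_report` function are assumptions for illustration; the PR itself does not define them.

```python
import csv
import io


def write_collapse_report(stats, out):
    """Write one CSV row per StudentModule, heaviest collapsing first.

    `stats` maps a student_module_id to a (rows_before, rows_deleted) tuple.
    Both the mapping and the column names are hypothetical.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["student_module_id", "rows_before", "rows_deleted"])
    by_deleted = sorted(stats.items(), key=lambda item: item[1][1], reverse=True)
    for student_module_id, (rows_before, rows_deleted) in by_deleted:
        writer.writerow([student_module_id, rows_before, rows_deleted])


# Example: collect the report in memory instead of a file.
report = io.StringIO()
write_collapse_report({1: (10, 7), 2: (3, 0), 3: (50, 45)}, report)
```

Sorting by rows deleted puts the modules that were doing the most writes at the top of the file.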

(2, "2013-07-15 15:04:11.000", 23),
(3, "2013-07-15 15:04:01.000", 24),
(4, "2013-07-15 15:04:00.000", 25),
(5, "2013-07-15 15:04:00.000", 26),

FWIW, this scenario (where significantly later times have earlier StudentModuleHistory IDs) shouldn't ever happen in the DB.

ormsbee commented Jul 16, 2013

Could you please add a couple of test cases for long runs with the same timestamp, as well as the (very common) pair of entries with the same timestamp? Those should be really common in the actual data.
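
As a sketch of what such test cases might check: the rule below is an assumption for illustration (drop a history row when a newer row follows it within a fixed gap), and the `ids_to_delete` function and the 10-second gap are hypothetical, not necessarily what the cleaner in this PR does.

```python
from datetime import datetime, timedelta


def ids_to_delete(history, gap=timedelta(seconds=10)):
    """Return ids of rows that are followed closely by a newer row.

    `history` is a list of (history_id, created) pairs. Rows are walked from
    newest to oldest, with ties on `created` broken by id, so for rows that
    share a timestamp only the highest id survives. The gap is an assumed value.
    """
    ordered = sorted(history, key=lambda row: (row[1], row[0]), reverse=True)
    doomed = []
    newer_created = None
    for history_id, created in ordered:
        if newer_created is not None and (newer_created - created) < gap:
            doomed.append(history_id)
        newer_created = created
    return doomed


t = datetime(2013, 7, 15, 15, 4, 0)

# A pair of entries with the same timestamp: the lower id goes.
pair = ids_to_delete([(1, t), (2, t)])

# A long run with the same timestamp: everything but the highest id goes.
long_run = ids_to_delete([(i, t) for i in range(1, 6)])
```

Under this assumed rule, `pair` contains only id 1 and `long_run` contains ids 4 down to 1.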

state = {
    'next_student_module_id': self.next_student_module_id,
}
with open(self.STATE_FILE, "w") as state_file:

Is the fact that we're opening the state file every time a problem at all when the disk is backed by EBS? I don't think it necessarily needs changing, but I am curious if it becomes a factor when you're doing the speed test for this.

We're only saving state once per batch, not once per id, so this should be fine.
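
A minimal sketch of the once-per-batch pattern, for reference. The JSON format, the `save_state` and `load_state` helpers, and the default of 0 are assumptions; the diff above only shows a `state` dict being written to `self.STATE_FILE`.

```python
import json


def save_state(path, next_student_module_id):
    """Persist the resume point; intended to be called once per batch."""
    state = {"next_student_module_id": next_student_module_id}
    with open(path, "w") as state_file:
        json.dump(state, state_file)


def load_state(path, default=0):
    """Return the saved resume point, or `default` if no state file exists."""
    try:
        with open(path) as state_file:
            return json.load(state_file)["next_student_module_id"]
    except OSError:
        return default
```

The file is opened and closed once per call, so the cost scales with the number of batches rather than the number of ids.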

Sure, but with these settings, we'd still have over half a million batches. If an open takes 10ms, that's an hour and a half. Probably not worth worrying about at this point -- I'm mostly just curious where this script will actually spend its time.
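
The back-of-the-envelope estimate can be checked with a few lines. The 500,000 batch count is a round lower bound for "over half a million", and the 10 ms open cost is the hypothetical figure from the comment, not a measurement.

```python
def open_overhead_hours(num_batches, seconds_per_open):
    """Total time spent opening the state file, in hours."""
    return num_batches * seconds_per_open / 3600.0


# Exactly half a million batches at 10 ms per open comes to roughly 1.4 hours.
hours = open_overhead_hours(500000, 0.010)
```

At 500,000 batches this is about 1.4 hours, and at 540,000 batches it reaches an hour and a half, which is consistent with the estimate above.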

ormsbee commented Jul 16, 2013

That's all I can think of for now.

@cpennington
👍

nedbat added a commit that referenced this pull request Jul 19, 2013
…mand

Ned/studentmodulehistory cleaner command
@nedbat nedbat merged commit 66ca7d7 into master Jul 19, 2013
@nedbat nedbat deleted the ned/studentmodulehistory-cleaner-command branch July 19, 2013 15:33