-
Notifications
You must be signed in to change notification settings - Fork 37
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Pipeline improvement prerequisites #427
Changes from 6 commits
926cd7c
83687ca
42ce02d
5a5ff31
c6693fe
906e5da
82b4c28
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,8 +15,8 @@ DATABASE_URL=mysql://figures_user:[email protected]:3306/figures-db | |
# Set which expected Open edX release mocks for devsite to use. | ||
# Valid options are: "GINKGO", "HAWTHORN", "JUNIPER" | ||
# | ||
# If not specified here, then "HAWTHORN" is used | ||
OPENEDX_RELEASE=JUNIPER | ||
# If not specified here, then "JUNIPER" is used | ||
# OPENEDX_RELEASE=JUNIPER | ||
|
||
# Enable/disable Figures multisite mode in devsite | ||
# This also requires | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,7 @@ | ||
"""Helper functions to make data handling and conversions easier | ||
|
||
# Figures 0.3.13 - Defining scope of this module | ||
# Figures 0.4.x - Yep, this module is still handy and the scope hasn't exploded | ||
|
||
The purpose of this module is to provide conveniece methods around commonly | ||
executed statements. These convenience methods should serve as shorthand for | ||
|
@@ -34,6 +35,8 @@ | |
|
||
## What does not belong here? | ||
|
||
* Most importantly, if you have to import from another figures module, it does | ||
not belong here! | ||
* "Feature" functionality does not belong here | ||
* Long functions do not belong here | ||
* Code that communicates outside of Figures does not belong here. No database, | ||
|
@@ -42,7 +45,10 @@ | |
This is not an exhaustive list. We'll grow it as needed. | ||
|
||
An important point is that we should not expect this module to be a permanent | ||
home for functionality. | ||
home for the functionality included. As we evolve Figures, we may find functions | ||
here that have a stronger context with another module. For example, we've got | ||
a decent set of date oriented functions that are candidatdes for a datetime and | ||
date handling module. | ||
""" | ||
|
||
from __future__ import absolute_import | ||
|
@@ -188,6 +194,15 @@ def prev_day(val): | |
return days_from(val, -1) | ||
|
||
|
||
def utc_yesterday(): | ||
"""Get "yesterday" form the utc datetime | ||
|
||
We primarily use this for the daily metrics collection. However, it proves | ||
handy as a convenience function for exploring data in the Django shell | ||
""" | ||
return prev_day(datetime.datetime.utcnow().date()) | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My standard advice anytime I see def utc_yesterday(now=None):
if not now:
now = datetime.datetime.utcnow()
return prev_day(now.date()) This makes it considerably easier to test, particularly for the weird edge cases that come up around dates (daylight savings, leap years, etc.) and eliminate the flakey tests that can come up when tests potentially behave differently depending on when they are run (what happens when this code is run exactly at midnight?). Code that calls There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You raise an excellent point make a good argument. I started to draft a reply, but then the reply started to grow longer. So I moved that off for later follow up so I can focus on getting part 2 in PR and come back to this topic. I just wanted to thank you now for reviewing so soon after I posted the PR before the weekend |
||
|
||
|
||
def days_in_month(month_for): | ||
_, num_days_in_month = calendar.monthrange(month_for.year, month_for.month) | ||
return num_days_in_month | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,24 @@ | ||
# -*- coding: utf-8 -*- | ||
from __future__ import unicode_literals | ||
|
||
from django.db import migrations, models | ||
|
||
|
||
class Migration(migrations.Migration): | ||
|
||
dependencies = [ | ||
('figures', '0015_add_enrollment_data_model'), | ||
] | ||
|
||
operations = [ | ||
migrations.AddField( | ||
model_name='enrollmentdata', | ||
name='collect_elapsed', | ||
field=models.FloatField(null=True), | ||
), | ||
migrations.AddField( | ||
model_name='learnercoursegrademetrics', | ||
name='collect_elapsed', | ||
field=models.FloatField(null=True), | ||
), | ||
] |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,7 @@ | |
|
||
from __future__ import absolute_import | ||
from datetime import date | ||
import timeit | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Do you
I think it's only related to the following usage I think it's safe, but want to double check if we've missed something here. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Thank you Omar! To be on the safe side, I'll just capture start and end times without timeit |
||
from django.conf import settings | ||
from django.contrib.sites.models import Site | ||
from django.core.validators import MaxValueValidator, MinValueValidator | ||
|
@@ -17,7 +18,7 @@ | |
from model_utils.models import TimeStampedModel | ||
|
||
from figures.compat import CourseEnrollment | ||
from figures.helpers import as_course_key | ||
from figures.helpers import as_course_key, utc_yesterday | ||
from figures.progress import EnrollmentProgress | ||
|
||
|
||
|
@@ -203,7 +204,7 @@ class EnrollmentDataManager(models.Manager): | |
EnrollmentData instances. | ||
|
||
""" | ||
def set_enrollment_data(self, site, user, course_id, course_enrollment=False): | ||
def set_enrollment_data(self, site, user, course_id, course_enrollment=None): | ||
""" | ||
This is an expensive call as it needs to call CourseGradeFactory if | ||
there is not already a LearnerCourseGradeMetrics record for the learner | ||
|
@@ -259,6 +260,71 @@ def set_enrollment_data(self, site, user, course_id, course_enrollment=False): | |
defaults=defaults) | ||
return obj, created | ||
|
||
def update_metrics(self, site, course_enrollment, force_update=False): | ||
""" | ||
This is an expensive call as it needs to call CourseGradeFactory if | ||
there is not already a LearnerCourseGradeMetrics record for the learner | ||
|
||
""" | ||
date_for = utc_yesterday() | ||
|
||
# check if we already have a record for the date for EnrollmentData | ||
|
||
# Alternately, we could use a try/except on a 'get' call, however, this | ||
# would be much slower for a bunch of new enrollments | ||
|
||
# Should only be one even if we don't include the site in the query | ||
# because course_id should be globally unique | ||
# If course id is ever NOT globally unique, then we need to add site | ||
# to the query | ||
ed_recs = EnrollmentData.objects.filter( | ||
user_id=course_enrollment.user_id, | ||
course_id=str(course_enrollment.course_id)) | ||
|
||
if not ed_recs or ed_recs[0].date_for < date_for or force_update: | ||
# We do the update | ||
start_time = timeit.default_timer() | ||
# get the progress data | ||
ep = EnrollmentProgress(user=course_enrollment.user, | ||
course_id=str(course_enrollment.course_id)) | ||
defaults = dict( | ||
date_for=date_for, | ||
is_completed=ep.is_completed(), | ||
progress_percent=ep.progress_percent(), | ||
points_possible=ep.progress.get('points_possible', 0), | ||
points_earned=ep.progress.get('points_earned', 0), | ||
sections_possible=ep.progress.get('sections_possible', 0), | ||
sections_worked=ep.progress.get('sections_worked', 0), | ||
is_enrolled=course_enrollment.is_active, | ||
date_enrolled=course_enrollment.created, | ||
) | ||
elapsed = timeit.default_timer() - start_time | ||
defaults['collect_elapsed'] = elapsed | ||
|
||
ed_rec, created = self.update_or_create( | ||
site=site, | ||
user=course_enrollment.user, | ||
course_id=str(course_enrollment.course_id), | ||
defaults=defaults) | ||
# create a new LCGM record | ||
# if it already exists for the day | ||
LearnerCourseGradeMetrics.objects.update_or_create( | ||
site=ed_rec.site, | ||
user=ed_rec.user, | ||
course_id=ed_rec.course_id, | ||
date_for=date_for, | ||
defaults=dict( | ||
points_possible=ed_rec.points_possible, | ||
points_earned=ed_rec.points_earned, | ||
sections_worked=ed_rec.sections_worked, | ||
sections_possible=ed_rec.sections_possible, | ||
collect_elapsed=elapsed | ||
) | ||
) | ||
return ed_rec, created | ||
else: | ||
return ed_recs[0], False | ||
|
||
|
||
@python_2_unicode_compatible | ||
class EnrollmentData(TimeStampedModel): | ||
|
@@ -301,6 +367,9 @@ class EnrollmentData(TimeStampedModel): | |
sections_worked = models.IntegerField() | ||
sections_possible = models.IntegerField() | ||
|
||
# seconds it took to collect progress data | ||
collect_elapsed = models.FloatField(null=True) | ||
|
||
objects = EnrollmentDataManager() | ||
|
||
class Meta: | ||
|
@@ -324,7 +393,7 @@ def progress_details(self): | |
|
||
|
||
class LearnerCourseGradeMetricsManager(models.Manager): | ||
"""Custom model manager for LearnerCourseGrades model | ||
"""Custom model manager for LearnerCourseGradeMetrics model | ||
""" | ||
def latest_lcgm(self, user, course_id): | ||
"""Gets the most recent record for the given user and course | ||
|
@@ -414,9 +483,9 @@ class LearnerCourseGradeMetrics(TimeStampedModel): | |
Purpose is primarliy to improve performance for the front end. In addition, | ||
data collected can be used for course progress over time | ||
|
||
We're capturing data from figures.metrics.LearnerCourseGrades | ||
We're capturing data from figures.progress.EnrollmentProgress | ||
|
||
Note: We're probably going to move ``LearnerCourseGrades`` to figures.pipeline | ||
Note: We're probably going to move ``EnrollmentProgress`` to figures.pipeline | ||
since that class will only be needed by the pipeline | ||
|
||
Even though this is for a course enrollment, we're mapping to the user | ||
|
@@ -452,6 +521,9 @@ class LearnerCourseGradeMetrics(TimeStampedModel): | |
sections_worked = models.IntegerField() | ||
sections_possible = models.IntegerField() | ||
|
||
# seconds it took to collect progress data | ||
collect_elapsed = models.FloatField(null=True) | ||
|
||
objects = LearnerCourseGradeMetricsManager() | ||
|
||
class Meta: | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you mind adding a test case for this function? One happy scenario should be enough.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will do