Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance and test improvement for LearnerMetricsViewSet #260

Merged
merged 12 commits into from
Sep 28, 2020

Conversation

johnbaldwin
Copy link
Contributor

@johnbaldwin johnbaldwin commented Sep 21, 2020

https://appsembler.atlassian.net/browse/RED-1261

  • Added prefetch_related to the User model in LearnerMetricsViewSet to query the related "CourseEnrollment" records
  • Updated test coverage to include users, courses, and enrollments for two sites in order to ensure we are retrieving only the data we want and avoid "site bleed". For this, created an initial generalized fixture in "tests/conftest.py". This initial fixture contains User, CourseOverview, CourseEnrollment, Organization, and Site records. If the version of "organizations" is our fork (supports multi-site) then OrganizationCourse and UserOrganizationMapping records are also filled
  • Updated devsite to support multi-tenant mode (Appsembler's organizations fork). For this, added django-environ to be able to set multi-tenant mode in the devsite/devsite/.env file. This is the default location django-environ seeks

This commit lays the groundwork for additional performance imporovement to reduce the number of queries needed in the serializers used for LearnerMetricsViewSet

@johnbaldwin johnbaldwin changed the title John/learner metrics performance Performance and test improvement for LearnerMetricsViewSet Sep 21, 2020
* Added prefetch_related to the User model in LearnerMetricsViewSet to
query the related "CourseEnrollment" records
* Updated test coverage to include users, courses, and enrollments for
two sites in order to ensure we are retrieving only the data we want and
avoid "site bleed". For this, created an initial generalized fixture in
"tests/conftest.py". This initial fixture contains User, CourseOverview,
CourseEnrollment, Organization, and Site records. If the version of
"organizations" is our fork (supports multi-site) then
OrganizationCourse and UserOrganizationMapping records are also filled

This commit lays the groundwork for additional performance imporovement
to reduce the number of queries needed in the serializers used for
LearnerMetricsViewSet
@johnbaldwin johnbaldwin force-pushed the john/learner-metrics-performance branch from a8490dd to 1a60f0b Compare September 21, 2020 15:46
@codecov-commenter
Copy link

codecov-commenter commented Sep 21, 2020

Codecov Report

Merging #260 into master will increase coverage by 0.07%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #260      +/-   ##
==========================================
+ Coverage   91.87%   91.95%   +0.07%     
==========================================
  Files          41       41              
  Lines        2142     2150       +8     
==========================================
+ Hits         1968     1977       +9     
+ Misses        174      173       -1     
Impacted Files Coverage Δ
figures/sites.py 68.81% <ø> (ø)
figures/models.py 95.31% <100.00%> (-0.05%) ⬇️
figures/serializers.py 96.83% <100.00%> (+0.37%) ⬆️
figures/views.py 93.00% <100.00%> (+0.14%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 122c879...9258b38. Read the comment docs.

* Added django-environ to devsite settings.py in order to allow setting
multisite versus standalone mode in the devsite/devsite/.env file
* Updated the requirements files to add `django-environ`
@johnbaldwin johnbaldwin force-pushed the john/learner-metrics-performance branch from 54ea79c to 09f5b9c Compare September 23, 2020 19:39
We were doing two queries before. The code now does one query when LCGM
"most_recent_for_learner_course" is called. We may still run into
performance issues because this means we need to do an individual query
per learner and can't get a bunch of learners most recent LCGM records
in a single call without either restructuring or augmenting out data.
One solution to enabling bulk retrieval is to create an augmented course
data table that has a 1:1 to CourseEnrollment that has the most recent
grade data or links to the most recent LCGM record
* Now LearnerMetricsViewSet only lists users who have an enrollment. Users
who have registerd but not enrolled are not listed
* Improved query performance, adding prefetching on users course
enrollments
@johnbaldwin johnbaldwin marked this pull request as ready for review September 24, 2020 15:25
We have a ListSerializer for the learner metrics serialzier, but it is
not yet used. How this works now is that we do the query optimization in
the view `get_queryset` method. Then the results are processed in the
serializer
@OmarIthawi
Copy link
Contributor

@johnbaldwin is this ready for review? I see that you're still pushing new big changes.

Removed the `to_representation` method from
LearnerMetricsListSerializer because we don't need it
@johnbaldwin
Copy link
Contributor Author

Looks like codecov got stuck again

There were missing tests. We checked for each learner, but didn't
check for each learner's enrollments. What was happening with the
`prefetch_related` is that it can't filter just the enrollments we want
in the related table (or at least I haven't found a way to do this yet)
so although we filtered on the learners, we weren't able to filter on
the learner enrollments. So now we need to filter the enrollments for
each learner against the course id list.

Now, we don't strictly need to do this. The front end will filter out
unwanted courses. However, when the query params say "I want these
courses", then it is good form to just return those courses. If we run
into performance issues, then we can return all courses to reduce the
query count and come up with an improved solution
@johnbaldwin
Copy link
Contributor Author

@OmarIthawi This should be ready for review now. Not sure why codecov is getting stuck again

@johnbaldwin
Copy link
Contributor Author

@OmarIthawi This is ready for review, please. I merged in the Ginkgo fix.

@@ -876,14 +868,14 @@ def get_enrollments(self, user):
Rely on the caller (the view) to filter users and prefetch related
"""

user_enrollments = user.courseenrollment_set.all()
# user_enrollments = user.courseenrollment_set.all()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# user_enrollments = user.courseenrollment_set.all()

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll remove the dead code. Thanks

@johnbaldwin johnbaldwin merged commit 3d5d7d0 into master Sep 28, 2020
@johnbaldwin johnbaldwin deleted the john/learner-metrics-performance branch September 28, 2020 11:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants