Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Hotfix/2271 export csv data quality issues. #2762

Conversation

djallado
Copy link
Contributor

@djallado djallado commented Dec 5, 2014

@aronasorman.

Issue: #2271 export csv data quality issues

We used Mikewray's data from staging and we found out that some of completion_timestamp column of VideoLog contains a null value. So we included that null value of completion_timestamp in video_logs query.

Screenshots are attached bellow.

CSV report
CSV-Report

Web-view
web-view

@cpauya
Copy link
Contributor

cpauya commented Dec 5, 2014

Hi @mikewray - this will hopefully be the fix for #2271

@aronasorman aronasorman modified the milestones: 0.13.x, 0.12.x Dec 5, 2014
@aronasorman aronasorman self-assigned this Dec 5, 2014
aronasorman added a commit that referenced this pull request Dec 5, 2014
…ty-issues

Hotfix/2271 export csv data quality issues.
@aronasorman aronasorman merged commit 4123234 into learningequality:release-0.12.0 Dec 5, 2014
@aronasorman aronasorman removed the has PR label Dec 5, 2014
@aronasorman aronasorman deleted the hotfix/2271-Export-CSV-data-quality-issues branch December 5, 2014 17:06
@mikewray
Copy link

mikewray commented Dec 5, 2014

Is this live and if so which in environment, did a little test on both environments and issue still exists. I can check again later on, let me know. Tx again for all your on this!

@aronasorman
Copy link
Collaborator

Hi @mikewray, should be live in the next hour or so. Will reply when it's ready for testing!

@aronasorman
Copy link
Collaborator

Hi @mikewray, this should be live on the staging server now!

@cpauya
Copy link
Contributor

cpauya commented Dec 5, 2014

Just an FYI for @mikewray that I tested by syncing my local distributed to his account at staging - so he might see my sample data there.

I've tested the Export CSV filters on staging and so far it works.

@mikewray
Copy link

mikewray commented Dec 5, 2014

Hi, just did test on staging. Videos changed but still not adding up i.e. total does not equal sum of two periods. Total_hours giving back zero regardless of period as before. See below, sorry! You can go into Edulution Coaches and try do the same report.

csv report staging 051214

@cpauya
Copy link
Contributor

cpauya commented Dec 5, 2014

Hi @mikewray -

I took screenshots of your data on my synced local:

  1. On the admin video log screenshot below (filtered for user Diana Banda) - you can see that some Completion_timestamp column data are (None), that means those are partially watched videos which we've included in the CSV export.
    screenshot 2014-12-06 06 40 39
  2. On the admin user log screenshot below (filtered for user Diana Banda) - the data uploaded are summed-up in monthly basis, in this case it's Nov 2014. Thus the reason total_hours and total_logins are zero. Try to change your date range up to Nov 30 to see the login hours.
    screenshot 2014-12-06 06 40 44

Here's a duly formatted .csv export file for Period 2014-Oct-01 to 2014-Nov-30:
screenshot 2014-12-06 06 46 04

So to answer your query regarding "summing-up two periods":

  1. We cannot filter dates unless the Completion_timestamp column have a value. I don't know how to deal with partially watched videos because we don't have timestamps on them yet. @aronasorman Do we have a way to detect when the VideoLog was started? (coz I didn't see any field for that)
  2. If we include partially watched videos like above, we will always have inconsistent results between two date range periods (1st semester and 2nd semester). And if we exclude them, our .csv export results will be different from the web view.

Suggestions regarding your data:

  1. Set local_settings.py value for USER_LOG_SUMMARY_FREQUENCY = (1,"day") on the distributed/RPi devices so we can filter by specific dates and achieve correct login hours results.
  2. To be honest, I don't have an idea how to deal with the null values for the VideoLog.completion_timestamp field. Maybe @aronasorman can help?

@aronasorman
Copy link
Collaborator

So I think we've hit the wall here as to what we can fix for the 0.12.x line. We don't track partially watched videos which means that getting a subset of VideoLogs will sometimes yield an incorrect result for certain columns.

This hopefully is fixed by 0.13, which has the ContentLog model. This provides more granular timestamps such as the progress of the user in a video, as well as the last time they viewed it. We'll keep you posted @mikewray!

@mikewray
Copy link

mikewray commented Dec 6, 2014

Hi - firstly tx u all. Lets finish this:

  1. Can I suggest for videos just to report on fully watched videos - this works, the uncompleted videos makes no sense. I know it is different to web view but we know what difference is and at least report is internally consistent.
  2. I will add to local_settings.py config, please confirm exact syntax is USER_LOG_SUMMARY_FREQUENCY = (1,"day")
  3. So until I have the above, I should only run monthly reports i.e. don't cross month boundaries. This seems to work e.g. Monica Zimba Oct activity + Nov activity = total activity. Confirm?

Tx again for all your time!

@aronasorman
Copy link
Collaborator

Hi @mikewray:

  1. Yup, I think that's the current behaviour right now. Can you confirm @cpauya?
  2. Yup, syntax looks right!
  3. Yup, that's the recommended way for now. @cpauya correct me if I'm wrong.

With that said, v0.12.9 is now released! You should be able to grab it here.

@cpauya
Copy link
Contributor

cpauya commented Dec 6, 2014

Let me verify item 1. - I think we are counting the partially watched videos too.

Will update you when fixed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants