
Hotfix/2271 export csv data quality issues #2742


djallado
Contributor

@djallado djallado commented Dec 1, 2014

@aronasorman.

Issue: #2271 export csv data quality issues

The video log is already fixed: the total number of videos in the CSV report now matches Videos Viewed in the web view.

For now, narrowing the date range to less than a month when generating the CSV report makes the total_hours column differ from Login Time in the web view.
Reason:

  1. The date range of a UserLogSummary is at least one month.

Recommendation:
We will use UserLog instead of UserLogSummary.
Problems with using UserLog:

  1. It is not synced to the central server.
  2. We cannot replicate the issue using generaterealdata, because generaterealdata generates exercises into a single UserLog.
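
To illustrate the reason given above, here is a minimal sketch rather than KA Lite's actual code. It assumes each summary row stores one total per calendar month; the `monthly_summaries` and `user_logs` data and both helper functions are hypothetical. Under that assumption, a three-day query cannot be answered more finely than the month that contains it:

```python
from datetime import date

# Hypothetical stand-in for UserLogSummary rows: one total per calendar month.
monthly_summaries = {
    (2014, 11): 12.5,  # total login hours recorded for November 2014
}

# Hypothetical stand-in for raw UserLog rows: one entry per login session.
user_logs = [
    (date(2014, 11, 3), 4.0),
    (date(2014, 11, 10), 2.5),
    (date(2014, 11, 25), 6.0),
]


def hours_from_summaries(start, end):
    """Sum every monthly bucket that the inclusive range touches (coarse)."""
    total = 0.0
    for (year, month), hours in monthly_summaries.items():
        bucket_start = date(year, month, 1)
        # First day of the following month (handles December rollover).
        bucket_end = date(year + month // 12, month % 12 + 1, 1)
        if bucket_start <= end and start < bucket_end:
            total += hours
    return total


def hours_from_logs(start, end):
    """Sum only the sessions that fall inside the inclusive range (exact)."""
    return sum(hours for day, hours in user_logs if start <= day <= end)


start, end = date(2014, 11, 9), date(2014, 11, 11)
summary_hours = hours_from_summaries(start, end)  # the whole month is counted
log_hours = hours_from_logs(start, end)           # only 10 November is counted
```

The two totals agree only when the range covers the whole month, which matches the behaviour described for total_hours.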

@aronasorman aronasorman self-assigned this Dec 1, 2014
@aronasorman aronasorman modified the milestones: 0.13.x, 0.12.x Dec 1, 2014
@aronasorman
Collaborator

@rtibbles you may want to take a look as well.

@aronasorman aronasorman assigned rtibbles and unassigned aronasorman Dec 1, 2014
@djallado
Contributor Author

djallado commented Dec 2, 2014

@mikewray.

Instructions to get new changes in CSV Report:

  1. In your console, run git clone https://[email protected]/mrpau/ka-lite. You may also specify a target directory:
    git clone https://[email protected]/mrpau/ka-lite yourDirectory
  2. Once cloning is done, run cd yourDirectory if you specified a directory during cloning; otherwise run cd ka-lite.
  3. Run git checkout hotfix/2271-Export-CSV-data-quality-issues to switch to the branch with our CSV report changes.
  4. Run cd kalite.
  5. Run python manage.py kaserve.
  6. Then visit 127.0.0.1:8008 in your browser.

In our manual test, the video log is already fixed.
The total_hours column in the CSV report works correctly if we select a whole month as the date range.

Please check the CSV export with your actual data.
Let us know if video logs or total_hours are still an issue.

@djallado
Contributor Author

djallado commented Dec 2, 2014

Hi @aronasorman.

We set USER_LOG_SUMMARY_FREQUENCY to 1 day instead of 1 month so that date ranges shorter than a month can be reported. We manually tested the CSV export and it generates an accurate report.

@mikewray. In our last commit, we fixed the issue in the total_hours column of the CSV report.

@mikewray

mikewray commented Dec 2, 2014

Hi there. Can you confirm that total_hours will work in any circumstance, including date ranges that cross months? That is, an inclusive report for the whole period of existence should equal the sum of two or more reports covering that same period.

I've been doing all this testing through the central server; the data comes from 3 RPis in the field in Zambia. Can you put this change into staging or live so I can test it there? I don't have the test data on my laptop, and I think it will take me a while to find an old data set that will work for this test. It's much easier to work with data I know, and I can run the same tests I've been doing.

Thanks for your work on this
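
The property asked about above (a report over the whole period equals the sum of reports over its adjacent parts) can be written down as a small check. This is only a sketch under stated assumptions: the `daily_hours` data and the `total_hours` helper are hypothetical and stand in for whatever the CSV export actually computes from per-day summaries.

```python
from datetime import date, timedelta

# Hypothetical daily totals (hours), keyed by day; not real KA Lite data.
daily_hours = {
    date(2014, 10, 30): 1.5,
    date(2014, 10, 31): 2.0,
    date(2014, 11, 1): 0.5,
    date(2014, 11, 2): 3.0,
}


def total_hours(start, end):
    """Total hours for the inclusive date range [start, end]."""
    total = 0.0
    day = start
    while day <= end:
        total += daily_hours.get(day, 0.0)
        day += timedelta(days=1)
    return total


# One report spanning the month boundary...
whole = total_hours(date(2014, 10, 30), date(2014, 11, 2))
# ...should equal the sum of two adjacent, non-overlapping reports.
october_part = total_hours(date(2014, 10, 30), date(2014, 10, 31))
november_part = total_hours(date(2014, 11, 1), date(2014, 11, 2))
assert whole == october_part + november_part
```

With per-day buckets the check holds for any split on a day boundary; with per-month buckets it can only hold for splits on month boundaries.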

@aronasorman
Collaborator

Code looks good. Merging so we can test it on the central server.

aronasorman added a commit that referenced this pull request Dec 2, 2014
…ty-issues

Hotfix/2271 export csv data quality issues
@aronasorman aronasorman merged commit 3c4e583 into learningequality:master Dec 2, 2014
@aronasorman aronasorman deleted the hotfix/2271-Export-CSV-data-quality-issues branch December 2, 2014 17:34
@aronasorman
Collaborator

Just noticed that I merged this to master!!! Noooo!!

Next time, please target release-0.12.0.

@aronasorman aronasorman restored the hotfix/2271-Export-CSV-data-quality-issues branch December 2, 2014 21:24
@aronasorman
Collaborator

Manually merged to release-0.12.0

@mikewray

mikewray commented Dec 2, 2014

Let me know when I can test on CS. Tx!

@aronasorman
Collaborator

Hi @mikewray! Should be within the next 4 hours. I'm just waiting for the tests to pass, then will merge it to the central server and update it.

@mikewray

mikewray commented Dec 3, 2014

I just ran a quick test on the central server, and it looks like the changes have not taken effect there: I'm getting the same results as before for both the hours and videos columns. Tx again - let me know.

@cpauya
Contributor

cpauya commented Dec 3, 2014

Hi @mikewray - I think @aronasorman hasn't updated the central server yet.

We'll keep you posted.

@aronasorman
Collaborator

Hi @mikewray, I've updated the central server, so you should be able to test it now. Once we've got the confirmation from you I'll issue the 0.12.9 update and you should be able to download it!

@aronasorman aronasorman deleted the hotfix/2271-Export-CSV-data-quality-issues branch December 4, 2014 02:47
@djallado
Contributor Author

djallado commented Dec 4, 2014

I tested on my local central server and found that total_videos there is not equal to total_videos on the distributed server. The reason is that the central server must capture the data up to 12 midnight of the end date.
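
One common way this kind of end-date mismatch arises is sketched below. It is an assumption, not the query used in this PR (which is not shown in the thread): the sketch supposes the filter compares timestamps against the end date at 00:00, so activity later on that day is dropped. The `video_logs` data and both helper functions are hypothetical.

```python
from datetime import date, datetime, time, timedelta

# Hypothetical video log timestamps; not real KA Lite data.
video_logs = [
    datetime(2014, 11, 9, 8, 0),
    datetime(2014, 11, 11, 0, 0),
    datetime(2014, 11, 11, 15, 30),  # afternoon of the end date
    datetime(2014, 11, 12, 0, 0),    # midnight after the end date
]


def count_until_start_of_end_date(start, end):
    """Assumed buggy variant: the end date is cut off at 00:00."""
    lo = datetime.combine(start, time.min)
    hi = datetime.combine(end, time.min)
    return sum(1 for ts in video_logs if lo <= ts <= hi)


def count_through_end_date(start, end):
    """Covers the whole end date by using a half-open upper bound."""
    lo = datetime.combine(start, time.min)
    hi = datetime.combine(end + timedelta(days=1), time.min)
    return sum(1 for ts in video_logs if lo <= ts < hi)


start, end = date(2014, 11, 9), date(2014, 11, 11)
truncated = count_until_start_of_end_date(start, end)
full = count_through_end_date(start, end)
```

The half-open bound (`< end + 1 day`) includes everything on the end date without also picking up the first instant of the following day.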

@mikewray

mikewray commented Dec 4, 2014

Hi - I just ran a test similar to the one I did before against the central server. Unfortunately, the data is exactly the same as before; see the summary below. I'm happy to email login details so you can run the tests yourselves to replicate. In case it matters, the data comes from RPis running version 0.12.5.

csv report 271114

@cpauya
Contributor

cpauya commented Dec 4, 2014

Hi Mike, please check on the staging server at http://staging.learningequality.org/organization/ because that's the one @aronasorman has updated.

I don't know if staging has your data but maybe you can upload/sync some sample data for your test?

@djallado will issue a fix for the central server in another PR, covering the case where the video logs do not include the data for the end date.

@mikewray

mikewray commented Dec 4, 2014

Hi - just tried staging. The data was there and agrees with production on the web view. The CSV extract is now pulling zero total_hours for the whole or part of the period. I can give you a login if you need to do it yourself. Screenshot below:

csv report staging 271114

@cpauya
Contributor

cpauya commented Dec 4, 2014

I have your login credentials and have tried it also. We will try to sync some sample data on staging under my sample organization to check too.

@cpauya
Contributor

cpauya commented Dec 4, 2014

I have tried to sync data based on generaterealdata from my local to staging and it has the same results. Will investigate more after dinner.

Staging Server
screenshot 2014-12-04 19 22 01

Local Server
screenshot 2014-12-04 19 22 16

@djallado
Contributor Author

djallado commented Dec 5, 2014

I ran generaterealdata on my local server, synced my data to staging, and then generated the CSV report on both local and staging.

Screenshots are attached below.
The data on my local server and on staging are now equal.

local: 2014-11-1 to 2014-11-30.
distributed1-30
staging: 2014-11-1 to 2014-11-30.
staging1-30

local: 2014-11-9 to 2014-11-11.
distributed-9-11

staging: 2014-11-9 to 2014-11-11.
staging-9-11

@mikewray

mikewray commented Dec 5, 2014

I'm observing the same issues with my data on both distributed and staging. Was something meant to have changed? Tx!
