Fix utf-8 encoding issues for all Python CSV exports #8003

Merged 2 commits on Apr 14, 2021

rtibbles (Member)

Summary

  • Consolidate CSV file opening for writes.
  • Use the utf-8-sig encoding to write a BOM at the beginning of each file, so that Excel decodes it as UTF-8.
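
A minimal sketch of the approach in the summary above, not the code merged in this PR: one helper opens every CSV file for writing with the `utf-8-sig` codec, which emits the UTF-8 byte order mark before the first row. The names `open_csv_for_writing_sketch` and `write_rows` and the sample data are hypothetical, and the sketch assumes Python 3 only.

```python
import csv
import io
import os
import tempfile


def open_csv_for_writing_sketch(filepath):
    # Hypothetical helper (Python 3 only), not the function from this PR.
    # "utf-8-sig" writes the BOM (bytes EF BB BF) once at the start of the file;
    # newline="" leaves line endings to the csv module.
    return io.open(filepath, "w", newline="", encoding="utf-8-sig")


def write_rows(filepath, header, rows):
    # Every export goes through the same helper, so the encoding
    # is decided in exactly one place.
    with open_csv_for_writing_sketch(filepath) as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "export.csv")
    write_rows(path, ["username", "full_name"], [["zoe", "Zoë Müller"]])
    with open(path, "rb") as f:
        # Report whether the file starts with the UTF-8 BOM.
        print(f.read(3) == b"\xef\xbb\xbf")
```

Routing all writes through one function means a later change to the encoding only has to be made once.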

References

Fixes #8001

Reviewer guidance

  • Export CSV files that contain Unicode content
  • Open them in Excel on Windows
  • Ensure that Excel decodes them as UTF-8 and displays the Unicode content correctly
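
Alongside the manual check in Excel, the BOM can also be checked from Python. The following is a small sketch under the assumption of Python 3; `starts_with_utf8_bom` and `read_csv_rows` are hypothetical names and are not part of Kolibri.

```python
import codecs
import csv


def starts_with_utf8_bom(filepath):
    # codecs.BOM_UTF8 is the byte sequence EF BB BF.
    with open(filepath, "rb") as f:
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def read_csv_rows(filepath):
    # Reading with "utf-8-sig" strips a leading BOM if one is present,
    # so the first header cell does not start with U+FEFF.
    with open(filepath, "r", newline="", encoding="utf-8-sig") as f:
        return list(csv.reader(f))
```

This only confirms what was written to disk; how Excel renders the file still has to be checked by hand.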

Testing checklist

  • Contributor has fully tested the PR manually
  • If there are any front-end changes, before/after screenshots are included
  • Critical user journeys are covered by Gherkin stories
  • Critical and brittle code paths are covered by unit tests

PR process

  • PR has the correct target branch and milestone
  • PR has 'needs review' or 'work-in-progress' label
  • If PR is ready for review, a reviewer has been added. (Don't use 'Assignees')
  • If this is an important user-facing change, PR or related issue has a 'changelog' label
  • If this includes an internal dependency change, a link to the diff is provided

Reviewer checklist

  • Automated test coverage is satisfactory
  • PR is fully functional
  • PR has been tested for accessibility regressions
  • External dependency files were updated if necessary (yarn and pip)
  • Documentation is updated
  • Contributor is in AUTHORS.md

Resolved review thread on kolibri/core/utils/csv.py
pcenov (Member) commented Apr 14, 2021

@rtibbles let me know when I can retest this after the additional code changes. As I mentioned earlier in Slack, I didn't see any issues on Windows when opening .csv files with UTF content. On a side note, there's one outstanding encoding-related issue, which I've reported in #7978; I just want to bring it to your attention.

rtibbles merged commit 3cb33f2 into learningequality:release-v0.14.x on Apr 14, 2021
rtibbles deleted the utf-8part2 branch on April 14, 2021 at 16:00
rtibbles (Member, Author)

Thanks @pcenov - I'll take a quick look to see if I can resolve it.

pcenov (Member) commented Apr 15, 2021

Retested today on both Windows 10 and Ubuntu with the latest release candidate (https://github.com/learningequality/kolibri/releases/tag/v0.14.7-rc4). The .csv files with UTF content are exported correctly.

Labels: TODO: needs review