Writer Xls Handle Characters Outside Unicode BMP #3696
Merged
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Fix #642. Opened over 5 years ago, probably the oldest problem I've worked on. And PHPOffice/PHPExcel#1320, opened a year before that. And SpartnerNL/Laravel-Excel#1521.
Shared/StringHelper::UTF8toBIFF8UnicodeLong calculates incorrect length for strings when they contain characters outside Unicode BMP. Xls uses UTF-16 to encode its strings, and characters outside BMP require a surrogate pair to encode. PhpSpreadsheet (and PhpExcel before it) have been counting these as a single character, but Excel counts them as 2. Change to compute the length as half the number of bytes in the UTF-16 string, as Excel does.
A formal test is added, but it's a bit difficult to follow. So I aso added a non-BMP emoji to sample 27template.xls, which will cause it to be both read by Xls reader and written by Xls writer. This would previously have created a corrupt worksheet. The emoji is now handled correctly.
This is:
Checklist:
Why this change is needed?
Provide an explanation of why this change is needed, with links to any Issues (if appropriate).
If this is a bugfix or a new feature, and there are no existing Issues, then please also create an issue that will make it easier to track progress with this PR.