Use Less Space When Inserting Rows and Columns #3856
Merged
Fix #3687. Worksheet methods insertNewRowBefore and insertNewColumnBefore call ReferenceHelper insertNewBefore. That function fills in "missing" cells with null values. However, for boundaries, it uses getHighestRow and getHighestColumn. It should be sufficient to use getHighestDataRow and getHighestDataColumn. When there is a big gap between getHighest... and getHighestData..., this can result in a big increase in memory usage, and in file space when saving the spreadsheet. New test InsertTest demonstrates the problem by populating a worksheet with cells A1:D5 (so highestDataRow is 5), but also setting row 1000 to invisible (so highestRow is 1000).
The major part of the change is in ReferenceHelper::insertNewBefore, which will now use getHighestData... for its boundaries when filling in the missing cells. Smaller changes are made to duplicateStylesByColumn and duplicateStylesByRow so that a cell which doesn't yet exist is created only when the style to be applied differs from the workbook default style.
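To make the size of the effect concrete, here is a minimal sketch of the scenario from InsertTest. It is a toy Python model, not PhpSpreadsheet code: the worksheet is a plain dict, fill_missing is a hypothetical stand-in for the fill step, and it assumes the fill covers the whole bounding rectangle. The real function may create fewer cells; the point is only how the count scales with the chosen boundary.

```python
def fill_missing(cells, max_row, max_col):
    """Add a None placeholder for every absent (row, col) inside the bounds.

    Returns the number of placeholder cells created.
    Hypothetical helper: a simplified model of the fill step.
    """
    created = 0
    for row in range(1, max_row + 1):
        for col in range(1, max_col + 1):
            if (row, col) not in cells:
                cells[(row, col)] = None
                created += 1
    return created


def make_sheet():
    # Cells A1:D5 populated, as in the test description (5 rows x 4 columns).
    return {(r, c): f"r{r}c{c}" for r in range(1, 6) for c in range(1, 5)}


highest_row = 1000      # row 1000 is hidden, so it counts toward the highest row
highest_data_row = 5    # but the last row that holds data is row 5
highest_col = 4         # column D

old_created = fill_missing(make_sheet(), highest_row, highest_col)
new_created = fill_missing(make_sheet(), highest_data_row, highest_col)

print(old_created, new_created)  # 3980 0
```

Under these assumptions the old boundary creates 3980 empty cells for a sheet with 20 real ones, while the data boundary creates none.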
As for reducing the file size, Writer/Xlsx/Worksheet is changed so that cells whose value is null or the empty string, and which use the workbook default style, are not written to the output spreadsheet. This requires some changes to the existing test ReadBlankCellsTest; I don't think the difference should matter to the end user.
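The writer rule can be sketched as a simple predicate. Again this is a Python model under stated assumptions, not the actual Writer/Xlsx/Worksheet code: should_write and DEFAULT_STYLE are hypothetical names, and the default style is assumed to be identified by index 0.

```python
DEFAULT_STYLE = 0  # assumption: the workbook default style has index 0


def should_write(value, style_index):
    """Return False only for blank cells that carry the default style."""
    is_blank = value is None or value == ""
    return not (is_blank and style_index == DEFAULT_STYLE)


# coordinate -> (value, style index)
cells = {
    "A1": ("x", 0),    # has a value: written
    "B1": (None, 0),   # null, default style: skipped
    "C1": ("", 0),     # empty string, default style: skipped
    "D1": (None, 3),   # null but styled: still written
}

written = [coord for coord, (value, style) in cells.items()
           if should_write(value, style)]
print(written)  # ['A1', 'D1']
```

A blank cell with a non-default style is kept because dropping it would lose formatting the user applied.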