Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HTML Writer #1464

Merged
merged 9 commits into from
May 18, 2020
Merged

Improve HTML Writer #1464

merged 9 commits into from
May 18, 2020

Conversation

oleibman
Copy link
Collaborator

@oleibman oleibman commented May 11, 2020

This is:

- [x] a bugfix
- [ ] a new feature

Checklist:

Why this change is needed?

There are a number of situations where HTML write was producing
HTML which could not be validated. These include:

  • inconsistent use of backslash terminating META, IMG, and COL tags
  • @page style tags in body rather than header. Aside from being
    non-standard, HTML Reader treats those as spreadsheet data.
  • <div style="page-break-before:always" />, a construct which is
    usually better handled through css anyhow.
  • no alt tag for images (drawings and charts)

Other problems:

  • Windows file names not handled correctly for images
  • Memory drawings not handled in extendRowsForChartsAndImages
  • No handling of different values for showing gridlines
    for screen and print
  • Mpdf and Dompdf do not require the use of inline css.
    Tcpdf remains a holdout in the use of this inferior approach.
  • no need to chunk base64 encoding of embedded images
  • support for colors in number format was buggy (html tags
    run through htmlspecialchars)

Code has been refactored when practical to reduce the number of
very large functions.

Coverage is now 100% for the entire HTML Writer module,
from 75% lines and 39% methods beforehand.

All functions dealing only with charts
are bypassed for coverage because the version of Jpgraph available in
Composer is not suitable for PHP7. The code will, nevertheless,
run successfully, but with warning messages. I have confirmed that
the code is entirely covered, without warnings, when the current
version of Jpgraph is used in lieu of the one available in Composer.
I will be glad to revisit this when the Jpgraph problem is resolved.

Directory PhpSpreadsheetTests/Writer/Html was created to house
the new tests. It seemed logical to move HtmlCommentsTest to
the new directory from PhpSpreadsheetTests/Functional.

A function to generate all the HTML is useful, especially for testing,
but also in lieu of the multiple other generate* functions. I have
added and documented generateHTMLAll.

The documentation for the generate* functions (a) produces invalid html,
(b) produces html which cannot be handled correctly by HTML reader,
and (c) even if those were correct, does not actually affect
the display of the spreadsheet. The documentation has been replaced
by a valid, and more instructive, example.

The (undocumented) useEmbeddedCss property, and the functions
to test and set it are no longer needed. Rather than breaking
existing code by deleting them, I marked the functions deprecated.

This change borrows a change to LocaleFloatsTest from
pull request 1456, submitted a little over a week before this one.

Resync with base project
There are a number of situations where HTML write was producing
HTML which could not be validated. These include:
  - inconsistent use of backslash terminating META, IMG, and COL tags
  - @page style tags in body rather than header. Aside from being
    non-standard, HTML Reader treats those as spreadsheet data.
  - <div style="page-break-before:always" />, a construct which is
    usually better handled through css anyhow.
  - no alt tag for images (drawings and charts)

Other problems:
  - Windows file names not handled correctly for images
  - Memory drawings not handled in extendRowsForChartsAndImages
  - No handling of different values for showing gridlines
    for screen and print
  - Mpdf and Dompdf do not require the use of inline css.
    Tcpdf remains a holdout in the use of this inferior approach.
  - no need to chunk base64 encoding of embedded images
  - support for colors in number format was buggy (html tags
    run through htmlspecialchars)

Code has been refactored when practical to reduce the number of
very large functions.

Coverage is now 100% for the entire HTML Writer module,
from 75% lines and 39% methods beforehand.

All functions dealing only with charts
are bypassed for coverage because the version of Jpgraph available in
Composer is not suitable for PHP7. The code will, nevertheless,
run successfully, but with warning messages. I have confirmed that
the code is entirely covered, without warnings, when the current
version of Jpgraph is used in lieu of the one available in Composer.
I will be glad to revisit this when the Jpgraph problem is resolved.

Directory PhpSpreadsheetTests/Writer/Html was created to house
the new tests. It seemed logical to move HtmlCommentsTest to
the new directory from PhpSpreadsheetTests/Functional.

A function to generate all the HTML is useful, especially for testing,
but also in lieu of the multiple other generate* functions. I have
added and documented generateHTMLAll.

The documentation for the generate* functions (a) produces invalid html,
(b) produces html which cannot be handled correctly by HTML reader,
and (c) even if those were correct, does not actually affect
the display of the spreadsheet. The documentation has been replaced
by a valid, and more instructive, example.

The (undocumented) useEmbeddedCss property, and the functions
to test and set it are no longer needed. Rather than breaking
existing code by deleting them, I marked the functions deprecated.

This change borrows a change to LocaleFloatsTest from
pull request 1456, submitted a little over a week before this one.
Almost all involve changing:
$x = IOFactory::createWriter($spreadsheet, 'Html')
to
$x = new Html($spreadsheet)
First phase of this change included correcting NumberFormat handling
in HTML Writer. Certain complex formats could not be handled without
changes to Style/NumberFormat, and I did not wish to combine those changes.

Once the original change had been pushed, I took this part of it back up.
HTML Writer can now handle conditions in formats like:
[Blue][>=3000.5]$#,##0.00;[Red][<0]$#,##0.00;$#,##0.00
In testing, I discovered several errors and omissions
in handling of some other formats.
These are now corrected, and tests added.
Remove now-unused private variable from Writer/PDF.
Copy link
Member

@PowerKiKi PowerKiKi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That looks very interesting. Would you be able to resolve conflicts and squash ? Also there are method arguments that should be typed, since this will be PHP 7.2+ you should probably be able to type everything.

@oleibman
Copy link
Collaborator Author

I have resolved the conflicts. After that, Travis CI complained about tests which it had not complained about before - it appears that it changed what it considers the current directory. After fixing those, Scrutinizer complained about something which it had not complained about before. Everything is now fixed, all tests pass, no new issues. So I think this is ready to go. I shall take care to declare types for future changes.

@oleibman oleibman requested a review from PowerKiKi May 18, 2020 01:18
@PowerKiKi PowerKiKi merged commit 97a80f3 into PHPOffice:master May 18, 2020
@PowerKiKi
Copy link
Member

Thanks again 🎉

Since now all new code MUST be typed. Existing code that you touch SHOULD be typed too, but we have to be careful to avoid breaking things. Because of the nature of this project, we rely a lot on type flexibility, so it is likely that we will never be able to introduce typing in some places. Typically all calculation things might be hard to type, but our "models", readers and writers should be typable.

@oleibman oleibman deleted the shtml branch May 19, 2020 04:08
PowerKiKi added a commit that referenced this pull request May 31, 2020
### Added

- Support writing to streams in all writers [#1292](#1292)
- Support CSV files with data wrapping a lot of lines [#1468](#1468)
- Support protection of worksheet by a specific hash algorithm [#1485](#1485)

### Fixed

- Fix Chart samples by updating chart parameter from 0 to DataSeries::EMPTY_AS_GAP [#1448](#1448)
- Fix return type in docblock for the Cells::get() [#1398](#1398)
- Fix RATE, PRICE, XIRR, and XNPV Functions [#1456](#1456)
- Save Excel 2010+ functions properly in XLSX [#1461](#1461)
- Several improvements in HTML writer [#1464](#1464)
- Fix incorrect behaviour when saving XLSX file with drawings [#1462](#1462),
- Fix Crash while trying setting a cell the value "123456\n" [#1476](#1481)
- Improved DATEDIF() function and reduced errors for Y and YM units [#1466](#1466)
- Stricter typing for mergeCells [#1494](#1494)

### Changed

- Drop support for PHP 7.1, according to https://phpspreadsheet.readthedocs.io/en/latest/#php-version-support
- Drop partial migration tool in favor of complete migration via RectorPHP [#1445](#1445)
- Limit composer package to `src/` [#1424](#1424)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants