From 795e0b4c42f243e57797bf24e52840eca389f381 Mon Sep 17 00:00:00 2001 From: Owen Leibman Date: Sat, 27 Jun 2020 09:25:17 -0700 Subject: [PATCH 1/5] Improving Coverage for Excel2003 XML Reader Reader/Xml is now 100% covered. File templates/Excel2003XMLTest.xml, used in some tests, is *not* readable by a current version of Excel. I have substituted a new file excel2003.xml to be used in its place. I have not deleted the original in case someone in future (possibly me) wants to see what it needs to make it usable. There are minimal code changes. - Unused protected functions pixel2WidthUnits and widthUnits2Pixel are deleted. - One regex looking to convert hex characters is changed from a-z to a-f, and made case insensitive. - No calculation performed for "error" cell (previously calculation was attempted and threw exception). - Empty relative row/cell is now handled correctly. - Style applied to empty cell when appropriate. - Support added for textRotation. - Support added for border styles. - Support added for diagonal borders. - Support added for superscript and subscript. - Support added for fill patterns. In theory, encodings other than UTF-8 were supported. In fact, I was unable to get SecurityScanner to pass *any* xml which is not UTF-8. Eliminating the assumption that strings might not be UTF-8 allowed much of the code to be greatly simplified. After that, I added some code that would permit the use of some ASCII-compatible encodings (there is a test of ISO-8859-1). It would be more difficult to handle other encodings (such as UTF-16). I am not convinced that even the ISO-8859 effort is worth it, but am willing to investigate either expanding or eliminating non-UTF8 support. I added a number of tests, creating an Xml directory, and moving XmlTest to that directory. Pull Request had problems reading old invalid sample in the code coverage phase, not in any of the other test phases, and not in the code coverage phase on my local machine. As it turns out, aside from being invalid, the sample is much larger than any of the other samples. Tests have been adjusted accordingly. --- samples/Basic/20_Read_Excel2003XML.php | 2 +- samples/templates/excel2003.xml | 941 ++++++++++++++++++ src/PhpSpreadsheet/Reader/BaseReader.php | 23 +- src/PhpSpreadsheet/Reader/Xml.php | 279 +++--- tests/PhpSpreadsheetTests/IOFactoryTest.php | 4 +- .../Reader/Xml/XmlFilter.php | 14 + .../Reader/Xml/XmlInfoTest.php | 75 ++ .../Reader/Xml/XmlLoadTest.php | 174 ++++ .../Reader/Xml/XmlStylesTest.php | 113 +++ .../Reader/{ => Xml}/XmlTest.php | 9 +- tests/data/Reader/Xml/excel2003.iso8859-1.xml | 67 ++ 11 files changed, 1571 insertions(+), 130 deletions(-) create mode 100644 samples/templates/excel2003.xml create mode 100644 tests/PhpSpreadsheetTests/Reader/Xml/XmlFilter.php create mode 100644 tests/PhpSpreadsheetTests/Reader/Xml/XmlInfoTest.php create mode 100644 tests/PhpSpreadsheetTests/Reader/Xml/XmlLoadTest.php create mode 100644 tests/PhpSpreadsheetTests/Reader/Xml/XmlStylesTest.php rename tests/PhpSpreadsheetTests/Reader/{ => Xml}/XmlTest.php (81%) create mode 100644 tests/data/Reader/Xml/excel2003.iso8859-1.xml diff --git a/samples/Basic/20_Read_Excel2003XML.php b/samples/Basic/20_Read_Excel2003XML.php index 44425e20a5..48ac3373c5 100644 --- a/samples/Basic/20_Read_Excel2003XML.php +++ b/samples/Basic/20_Read_Excel2003XML.php @@ -4,7 +4,7 @@ require __DIR__ . '/../Header.php'; -$filename = __DIR__ . '/../templates/Excel2003XMLTest.xml'; +$filename = __DIR__ . '/../templates/excel2003.xml'; $callStartTime = microtime(true); $spreadsheet = IOFactory::load($filename); $helper->logRead('Xml', $filename, $callStartTime); diff --git a/samples/templates/excel2003.xml b/samples/templates/excel2003.xml new file mode 100644 index 0000000000..b6d57d98ff --- /dev/null +++ b/samples/templates/excel2003.xml @@ -0,0 +1,941 @@ + + + + + Xml2003 Workbook + Test Gnumeric Workbook Subject + Mark Baker + PHPExcel Xml Reader Test Keywords + Some comments about the PHPExcel Gnumeric Reader + Owen Leibman + 2010-09-02T20:48:39Z + 2010-09-02T20:48:39Z + PHPExcel Xml Reader Test Category + Maarten Balliauw + PHPExcel + https://github.com/PHPOffice/PhpSpreadsheet + 16.00 + + + AbCd1234 + 2019-01-31T07:00:00Z + 1 + 3 + 2 + 3.14159 + + + + + + 8964 + 23040 + 32767 + 32767 + False + False + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + Test String 1Test for a simple colour-formatted string + 1 + 5 + A + E + 6 + AE + + + Test - String 2 + 2 + 6 + B + F + 8 + BF + Dot + + + Test #3 + 3 + 7 + C + G + 10 + CG + Red + Red + Dash + + + Test with (") in string + 4 + 8 + D + H + 12 + DH + Orange + Orange + Dash/Dot/Dot + + + 10 + 26 + 36 + Yellow + Yellow + Dash/Dot + + + Test #3 + 1.23 + 1 + 1 + 22 + Green + Green + Thin Line + + + Test #3 + 2.34 + 0 + 0 + 36 + Blue + Blue + Thick Dash/Dot/Dot + + + Test #3 + 3.45 + Purple + Purple + Variant Thick Dash/Dot/Dot + + + Pink + Pink + Thick Dash/Dot + + + 1960-12-19T00:00:00.000 + TOP + 0 + Brown + Brown + Thick Dash + + + 1.5 + #DIV/0! + Thick Line + + + BOTTOM + Extra Thick Line + + + 1899-12-31T02:30:00.000 + Мойва сушенаяTests for UTF-8 content + Double Line + + + LEFT + Ärendetext + + + 1960-12-19T01:30:00.000 + Højde + + + RIGHT + + + + BOX + + Test Column 1 + + + + + + + Test Column 2 + + Patterned + Patterned 2 + + + + + Test Column 3 + + + PhpSpreadsheet + + + + + + + + + + + + + + + + Underline None + + Rotate 90 + Rotate 45 + Rotate -90 + Rotate -45 + + + + + + + Underline 1 + Subscript + + + + + + + Underline 2 + Superscript + + + + + + + Underline 3 + + + + + + + + Underline 4 + + + + + + + + + + + + + + + + + + + + + + + + I don't know if Gnumeric supports Rich Text in the same way as Excel, And this row should be autofit height with text wrap + + + + + + + + + + + + + + Blue with underline + + + + + + + + + + + + + + 5 + 5 + #NAME? + + + + + + + + + + + + + + Hidden row above + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ + +
+