Skip to content

Commit

Permalink
Gnumeric Better Namespace Handling
Browse files Browse the repository at this point in the history
There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment PHPOffice#860 (comment) in issue PHPOffice#860 by @IMSoP has triggered an idea about how to proceed.

Although the issues exclusively concern Xlsx format, I am starting out by dealing with Gnumeric. It is simpler and smaller than Xlsx, and, more important, already has a test for an unexpected prefix, since, at some point, it changed its generic prefix from gmr to gnm. I added support and a test for that some time ago, but almost certainly not in the best possible manner. The code as changed for this PR seems simpler and less kludgey, both for that exceptional case as well as for normal handling.

My hope is that this change can be a template for similar Reader changes for Xml, Ods, and, especially, Xlsx.

All grandfathered Phpstan issues with Gnumeric are fixed and eliminated from baseline as part of this change.
  • Loading branch information
oleibman committed Apr 25, 2021
1 parent b05dc31 commit 84096ae
Show file tree
Hide file tree
Showing 5 changed files with 149 additions and 186 deletions.
90 changes: 0 additions & 90 deletions phpstan-baseline.neon
Original file line number Diff line number Diff line change
Expand Up @@ -2825,91 +2825,6 @@ parameters:
count: 1
path: src/PhpSpreadsheet/Reader/Csv/Delimiter.php

-
message: "#^Property PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:\\$referenceHelper has no typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Parameter \\#1 \\$fp of function fread expects resource, resource\\|false given\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Parameter \\#1 \\$fp of function fclose expects resource, resource\\|false given\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Property PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:\\$mappings has no typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Offset 'No' does not exist on SimpleXMLElement\\|null\\.$#"
count: 2
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Offset 'Unit' does not exist on SimpleXMLElement\\|null\\.$#"
count: 2
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Cannot call method setWidth\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\ColumnDimension\\|null\\.$#"
count: 3
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Cannot call method setVisible\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\ColumnDimension\\|null\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Offset 'DefaultSizePts' does not exist on SimpleXMLElement\\|null\\.$#"
count: 2
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Cannot call method setRowHeight\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\RowDimension\\|null\\.$#"
count: 2
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Cannot call method setVisible\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\RowDimension\\|null\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseBorderAttributes\\(\\) has no return typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseBorderAttributes\\(\\) has parameter \\$borderAttributes with no typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseRichText\\(\\) has no return typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseRichText\\(\\) has parameter \\$is with no typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseGnumericColour\\(\\) has no return typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Method PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Gnumeric\\:\\:parseGnumericColour\\(\\) has parameter \\$gnmColour with no typehint specified\\.$#"
count: 1
path: src/PhpSpreadsheet/Reader/Gnumeric.php

-
message: "#^Property PhpOffice\\\\PhpSpreadsheet\\\\Reader\\\\Html\\:\\:\\$rowspan has no typehint specified\\.$#"
count: 1
Expand Down Expand Up @@ -8325,11 +8240,6 @@ parameters:
count: 1
path: tests/PhpSpreadsheetTests/Reader/CsvTest.php

-
message: "#^Cannot call method getVisible\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\RowDimension\\|null\\.$#"
count: 1
path: tests/PhpSpreadsheetTests/Reader/Gnumeric/GnumericLoadTest.php

-
message: "#^Cannot call method getRowHeight\\(\\) on PhpOffice\\\\PhpSpreadsheet\\\\Worksheet\\\\RowDimension\\|null\\.$#"
count: 2
Expand Down
128 changes: 91 additions & 37 deletions src/PhpSpreadsheet/Reader/Gnumeric.php
Original file line number Diff line number Diff line change
Expand Up @@ -25,7 +25,19 @@

class Gnumeric extends BaseReader
{
private const UOM_CONVERSION_POINTS_TO_CENTIMETERS = 0.03527777778;
const NAMESPACE_GNM = 'http://www.gnumeric.org/v10.dtd'; // gmr in old sheets

const NAMESPACE_XSI = 'http://www.w3.org/2001/XMLSchema-instance';

const NAMESPACE_OFFICE = 'urn:oasis:names:tc:opendocument:xmlns:office:1.0';

const NAMESPACE_XLINK = 'http://www.w3.org/1999/xlink';

const NAMESPACE_DC = 'http://purl.org/dc/elements/1.1/';

const NAMESPACE_META = 'urn:oasis:names:tc:opendocument:xmlns:meta:1.0';

const NAMESPACE_OOO = 'http://openoffice.org/2004/office';

/**
* Shared Expressions.
Expand All @@ -41,16 +53,9 @@ class Gnumeric extends BaseReader
*/
private $spreadsheet;

/** @var ReferenceHelper */
private $referenceHelper;

/**
* Namespace shared across all functions.
* It is 'gnm', except for really old sheets which use 'gmr'.
*
* @var string
*/
private $gnm = 'gnm';

/**
* Create a new Gnumeric.
*/
Expand All @@ -77,16 +82,18 @@ public function canRead($pFilename)
if (function_exists('gzread')) {
// Read signature data (first 3 bytes)
$fh = fopen($pFilename, 'rb');
$data = fread($fh, 2);
fclose($fh);
if ($fh !== false) {
$data = fread($fh, 2);
fclose($fh);
}
}

return $data == chr(0x1F) . chr(0x8B);
}

private static function matchXml(string $name, string $field): bool
{
return 1 === preg_match("/^(gnm|gmr):$field$/", $name);
return 1 === preg_match("/:$field$/", $name);
}

/**
Expand Down Expand Up @@ -188,6 +195,7 @@ private function gzfileGetContents($filename)
return $data;
}

/** @var array */
private static $mappings = [
'borderStyle' => [
'0' => Border::BORDER_NONE,
Expand Down Expand Up @@ -266,7 +274,7 @@ public static function gnumericMappings(): array
private function processComments(SimpleXMLElement $sheet): void
{
if ((!$this->readDataOnly) && (isset($sheet->Objects))) {
foreach ($sheet->Objects->children($this->gnm, true) as $key => $comment) {
foreach ($sheet->Objects->children(self::NAMESPACE_GNM) as $key => $comment) {
$commentAttributes = $comment->attributes();
// Only comment objects are handled at the moment
if ($commentAttributes->Text) {
Expand All @@ -276,6 +284,14 @@ private function processComments(SimpleXMLElement $sheet): void
}
}

/**
* @param mixed $value
*/
private function testSimpleXml($value): SimpleXMLElement
{
return ($value instanceof SimpleXMLElement) ? $value : new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><root></root>');
}

/**
* Loads Spreadsheet from file.
*
Expand Down Expand Up @@ -304,12 +320,10 @@ public function loadIntoExisting(string $pFilename, Spreadsheet $spreadsheet): S
$gFileData = $this->gzfileGetContents($pFilename);

$xml2 = simplexml_load_string($this->securityScanner->scan($gFileData), 'SimpleXMLElement', Settings::getLibXmlLoaderOptions());
$xml = ($xml2 !== false) ? $xml2 : new SimpleXMLElement('<?xml version="1.0" encoding="UTF-8"?><root></root>');
$namespacesMeta = $xml->getNamespaces(true);
$this->gnm = array_key_exists('gmr', $namespacesMeta) ? 'gmr' : 'gnm';
$xml = self::testSimpleXml($xml2);

$gnmXML = $xml->children($namespacesMeta[$this->gnm]);
(new Properties($this->spreadsheet))->readProperties($xml, $gnmXML, $namespacesMeta);
$gnmXML = $xml->children(self::NAMESPACE_GNM);
(new Properties($this->spreadsheet))->readProperties($xml, $gnmXML);

$worksheetID = 0;
foreach ($gnmXML->Sheets->Sheet as $sheet) {
Expand All @@ -329,7 +343,7 @@ public function loadIntoExisting(string $pFilename, Spreadsheet $spreadsheet): S
$this->spreadsheet->getActiveSheet()->setTitle($worksheetName, false, false);

if (!$this->readDataOnly) {
(new PageSetup($this->spreadsheet, $this->gnm))
(new PageSetup($this->spreadsheet))
->printInformation($sheet)
->sheetMargins($sheet);
}
Expand Down Expand Up @@ -510,21 +524,37 @@ private function processMergedCells(SimpleXMLElement $sheet): void
}
}

private function setColumnWidth(int $c, float $defaultWidth): void
{
$colDim = $this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1));
if ($colDim !== null) {
$colDim->setWidth($defaultWidth);
}
}

private function setColumnInvisible(int $c): void
{
$colDim = $this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1));
if ($colDim !== null) {
$colDim->setVisible(false);
}
}

private function processColumnLoop(int $c, int $maxCol, SimpleXMLElement $columnOverride, float $defaultWidth): int
{
$columnAttributes = $columnOverride->attributes();
$columnAttributes = self::testSimpleXml($columnOverride->attributes());
$column = $columnAttributes['No'];
$columnWidth = ((float) $columnAttributes['Unit']) / 5.4;
$hidden = (isset($columnAttributes['Hidden'])) && ((string) $columnAttributes['Hidden'] == '1');
$columnCount = (int) ($columnAttributes['Count'] ?? 1);
while ($c < $column) {
$this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1))->setWidth($defaultWidth);
$this->setColumnWidth($c, $defaultWidth);
++$c;
}
while (($c < ($column + $columnCount)) && ($c <= $maxCol)) {
$this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1))->setWidth($columnWidth);
$this->setColumnWidth($c, $columnWidth);
if ($hidden) {
$this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1))->setVisible(false);
self::setColumnInvisible($c);
}
++$c;
}
Expand All @@ -536,35 +566,54 @@ private function processColumnWidths(SimpleXMLElement $sheet, int $maxCol): void
{
if ((!$this->readDataOnly) && (isset($sheet->Cols))) {
// Column Widths
$defaultWidth = 0;
$columnAttributes = $sheet->Cols->attributes();
$defaultWidth = $columnAttributes['DefaultSizePts'] / 5.4;
if ($columnAttributes !== null) {
$defaultWidth = $columnAttributes['DefaultSizePts'] / 5.4;
}
$c = 0;
foreach ($sheet->Cols->ColInfo as $columnOverride) {
$c = $this->processColumnLoop($c, $maxCol, $columnOverride, $defaultWidth);
}
while ($c <= $maxCol) {
$this->spreadsheet->getActiveSheet()->getColumnDimension(Coordinate::stringFromColumnIndex($c + 1))->setWidth($defaultWidth);
$this->setColumnWidth($c, $defaultWidth);
++$c;
}
}
}

private function setRowHeight(int $r, float $defaultHeight): void
{
$rowDim = $this->spreadsheet->getActiveSheet()->getRowDimension($r);
if ($rowDim !== null) {
$rowDim->setRowHeight($defaultHeight);
}
}

private function setRowInvisible(int $r): void
{
$rowDim = $this->spreadsheet->getActiveSheet()->getRowDimension($r);
if ($rowDim !== null) {
$rowDim->setVisible(false);
}
}

private function processRowLoop(int $r, int $maxRow, SimpleXMLElement $rowOverride, float $defaultHeight): int
{
$rowAttributes = $rowOverride->attributes();
$rowAttributes = self::testSimpleXml($rowOverride->attributes());
$row = $rowAttributes['No'];
$rowHeight = (float) $rowAttributes['Unit'];
$hidden = (isset($rowAttributes['Hidden'])) && ((string) $rowAttributes['Hidden'] == '1');
$rowCount = (int) ($rowAttributes['Count'] ?? 1);
while ($r < $row) {
++$r;
$this->spreadsheet->getActiveSheet()->getRowDimension($r)->setRowHeight($defaultHeight);
$this->setRowHeight($r, $defaultHeight);
}
while (($r < ($row + $rowCount)) && ($r < $maxRow)) {
++$r;
$this->spreadsheet->getActiveSheet()->getRowDimension($r)->setRowHeight($rowHeight);
$this->setRowHeight($r, $rowHeight);
if ($hidden) {
$this->spreadsheet->getActiveSheet()->getRowDimension($r)->setVisible(false);
$this->setRowInvisible($r);
}
}

Expand All @@ -575,8 +624,11 @@ private function processRowHeights(SimpleXMLElement $sheet, int $maxRow): void
{
if ((!$this->readDataOnly) && (isset($sheet->Rows))) {
// Row Heights
$defaultHeight = 0;
$rowAttributes = $sheet->Rows->attributes();
$defaultHeight = (float) $rowAttributes['DefaultSizePts'];
if ($rowAttributes !== null) {
$defaultHeight = (float) $rowAttributes['DefaultSizePts'];
}
$r = 0;

foreach ($sheet->Rows->RowInfo as $rowOverride) {
Expand Down Expand Up @@ -639,27 +691,29 @@ private static function addStyle2(array &$styleArray, string $key1, string $key,
}
}

private static function parseBorderAttributes($borderAttributes)
private static function parseBorderAttributes(?SimpleXMLElement $borderAttributes): array
{
$styleArray = [];
if (isset($borderAttributes['Color'])) {
$styleArray['color']['rgb'] = self::parseGnumericColour($borderAttributes['Color']);
}
if ($borderAttributes !== null) {
if (isset($borderAttributes['Color'])) {
$styleArray['color']['rgb'] = self::parseGnumericColour($borderAttributes['Color']);
}

self::addStyle($styleArray, 'borderStyle', $borderAttributes['Style']);
self::addStyle($styleArray, 'borderStyle', $borderAttributes['Style']);
}

return $styleArray;
}

private function parseRichText($is)
private function parseRichText(string $is): RichText
{
$value = new RichText();
$value->createText($is);

return $value;
}

private static function parseGnumericColour($gnmColour)
private static function parseGnumericColour(string $gnmColour): string
{
[$gnmR, $gnmG, $gnmB] = explode(':', $gnmColour);
$gnmR = substr(str_pad($gnmR, 4, '0', STR_PAD_RIGHT), 0, 2);
Expand Down
Loading

0 comments on commit 84096ae

Please sign in to comment.