Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gnumeric Better Namespace Handling #2022

Merged
merged 7 commits into from
May 4, 2021
Merged

Conversation

oleibman
Copy link
Collaborator

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment #860 (comment) in issue #860 by @IMSoP has triggered an idea about how to proceed.

Although the issues exclusively concern Xlsx format, I am starting out by dealing with Gnumeric. It is simpler and smaller than Xlsx, and, more important, already has a test for an unexpected prefix, since, at some point, it changed its generic prefix from gmr to gnm. I added support and a test for that some time ago, but almost certainly not in the best possible manner. The code as changed for this PR seems simpler and less kludgey, both for that exceptional case as well as for normal handling.

My hope is that this change can be a template for similar Reader changes for Xml, Ods, and, especially, Xlsx.

All grandfathered Phpstan issues with Gnumeric are fixed and eliminated from baseline as part of this change.

This is:

- [x] a bugfix
- [ ] a new feature

Checklist:

Why this change is needed?

There have been a number of issues concerning the handling of legitimate but unexpected namespace prefixes in Xlsx spreadsheets created by software other than Excel and PhpSpreadsheet/PhpExcel.I have studied them, but, till now, have not had a good idea on how to act on them. A recent comment PHPOffice#860 (comment) in issue PHPOffice#860 by @IMSoP has triggered an idea about how to proceed.

Although the issues exclusively concern Xlsx format, I am starting out by dealing with Gnumeric. It is simpler and smaller than Xlsx, and, more important, already has a test for an unexpected prefix, since, at some point, it changed its generic prefix from gmr to gnm. I added support and a test for that some time ago, but almost certainly not in the best possible manner. The code as changed for this PR seems simpler and less kludgey, both for that exceptional case as well as for normal handling.

My hope is that this change can be a template for similar Reader changes for Xml, Ods, and, especially, Xlsx.

All grandfathered Phpstan issues with Gnumeric are fixed and eliminated from baseline as part of this change.
Static vs. non-static in 5 places.
Copy link

@IMSoP IMSoP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm glad my comment was helpful (and hope it didn't sound too critical; XML namespaces are confusing). :)

I did a quick experiment with XlsxReader the other day, and found that a number of places use $foo['bar'] to access attributes, and fail if you select the namespace explicitly. They'll need to be tracked down and either converted to $foo->attributes()->bar, or have $foo = $foo->attributes() added above.

(The reason has to do with an oddity in the XML Namespaces standard, where an element with no prefix is in the default namespace, but an attribute with no prefix has no defined meaning.)

src/PhpSpreadsheet/Reader/Gnumeric.php Outdated Show resolved Hide resolved
Adopt a suggestion from @IMSoP affecting listWorkSheetInfo, which uses XMLReader rather than SimpleXML for its processing.
PR PHPOffice#2024 was pushed last night, causing a Phpstan problem with this member.
Copy link
Member

@MarkBaker MarkBaker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With all tests passing, it looks like the new approach to namespacing i working remarkably well

src/PhpSpreadsheet/Reader/Gnumeric.php Outdated Show resolved Hide resolved
src/PhpSpreadsheet/Reader/Gnumeric.php Outdated Show resolved Hide resolved
src/PhpSpreadsheet/Reader/Gnumeric.php Show resolved Hide resolved
Suggestions from Mark Baker - strict equality test, more descriptive variable names.
@MarkBaker
Copy link
Member

@IMSoP Critical, not nearly as critical as the people that demand that we solve the namespace problems without ever making any suggestions about how to resolve it... thanks for the guidance on how it should be done

@MarkBaker MarkBaker merged commit 4be9366 into PHPOffice:master May 4, 2021
@oleibman oleibman deleted the namegnu branch July 1, 2021 15:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants