Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/xml: support xmlns prefixes #48641

Open
wants to merge 5 commits into
base: master
Choose a base branch
from
Open

Conversation

ydnar
Copy link

@ydnar ydnar commented Sep 27, 2021

Allow round-trip marshaling and unmarshaling with explicit namespace
prefixes. For example, this can be unmarshalled and re-marshalled
into this precise XML:

<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
        <command>
                <check>
                        <domain:check xmlns:domain="urn:ietf:params:xml:ns:domain-1.0">
                                <domain:name>golang.org</domain:name>
                                <domain:name>go.dev</domain:name>
                        </domain:check>
                </check>
        </command>
</epp>

For marshaling, a preferred namespace prefix can now be specified in a
struct tag or XMLName value by prefixing the local name:

xml:"urn:ietf:params:xml:ns:domain-1.0 domain:check"

Namespaced tag and attribute names are now strictly parsed and will
fail with an error if any are malformed, such as having a leading or
trailing colon, or more than 1 colon.

This change directly affects the tests introduced in 4d014e7
(https://golang.org/cl/277892). This may have security implications.
See #8068 and #43168.

An example playground that would be fixed with this change:

https://play.golang.org/p/-6Ee8tcLl2L

This is a rebase of https://golang.org/cl/11635 onto the current master branch.
It was originally written by Constantin Konstantinidis, Roger Peppe, and
Martin Hilton.

Original CL message:

When an xmlns="..." attribute was explicitly generated, it was being
ignored because the name space on the attribute was assumed to have been
explicitly set (to the empty name space) and it's not possible to have
an element in the empty name space when there is a non-empty namespace
set.

We fix this by recording when a default namespace has been explicitly
set and setting the name space of the element to that so
printer.defineNS can do its work correctly.

We do not attempt to add our own xmlns="..." attribute when one is
explicitly set.

We also add tests for EncodeElement, as that's the only way to attain
coverage of some of the changed behaviour. Some other test coverage is
also increased, although more work remains to be done in this area.

This change was jointly developed with Martin Hilton (@mhilton on
github).

Updates #11431
Updates #8068

@google-cla
Copy link

google-cla bot commented Sep 27, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no Used by googlebot to label PRs as having an invalid CLA. The text of this label should not change. label Sep 27, 2021
@google-cla
Copy link

google-cla bot commented Sep 27, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

2 similar comments
@google-cla
Copy link

google-cla bot commented Sep 27, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Sep 27, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Sep 27, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@ydnar
Copy link
Author

ydnar commented Sep 27, 2021

👋 @rogpeppe I hope this was OK!

@google-cla
Copy link

google-cla bot commented Oct 4, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@ydnar
Copy link
Author

ydnar commented Oct 4, 2021

Just pushed an additional commit that allows callers to specify a preferred XML name space prefix. For example, this XML can now be unmarshalled and re-marshalled byte for byte:

<epp xmlns="urn:ietf:params:xml:ns:epp-1.0">
        <command>
                <check>
                        <domain:check xmlns:domain="urn:ietf:params:xml:ns:domain-1.0">
                                <domain:name>golang.org</domain:name>
                                <domain:name>go.dev</domain:name>
                        </domain:check>
                </check>
        </command>
</epp>

@rogpeppe
Copy link
Contributor

rogpeppe commented Oct 4, 2021

@ydnar Thanks for lifting this from the archives.

Unfortunately, as hinted at here I think that it's not enough to just resurrect the old fixes. We'd need a proposal for the changes and some discussion around them, because making these fixes will inevitably break some clients even as they fix others.

@ydnar
Copy link
Author

ydnar commented Oct 4, 2021

@ydnar Thanks for lifting this from the archives.

Unfortunately, as hinted at here I think that it's not enough to just resurrect the old fixes. We'd need a proposal for the changes and some discussion around them, because making these fixes will inevitably break some clients even as they fix others.

You're probably right. I guess my main question would be "is it worth it?" Given the alternative is to fork and vendor the encoding/xml package, which is relatively small and self-contained.

Would contributed broader test coverage help?

@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

8 similar comments
@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 5, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@ydnar
Copy link
Author

ydnar commented Oct 6, 2021

@ydnar Thanks for lifting this from the archives.

Unfortunately, as hinted at here I think that it's not enough to just resurrect the old fixes. We'd need a proposal for the changes and some discussion around them, because making these fixes will inevitably break some clients even as they fix others.

@rogpeppe thanks for the nudge. I made a proposal here: #48821

ydnar added a commit to domainr/epp that referenced this pull request Oct 6, 2021
@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

2 similar comments
@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@google-cla
Copy link

google-cla bot commented Oct 6, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@ydnar ydnar changed the title encoding/xml: fix xmlns= behavior encoding/xml: support xmlns prefixes Oct 7, 2021
@google-cla
Copy link

google-cla bot commented Oct 7, 2021

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@ydnar
Copy link
Author

ydnar commented Oct 8, 2021

@iwdgo I think @google-cla might be referring to your early commit on this PR. You OK doing @googlebot I consent?

@thoro
Copy link

thoro commented Oct 8, 2023

Hi,

sorry for posting this here on github, but I have never used Gerrit and it's a bit unclear how todo what!

I just had to use this to produce correct XML for UPNP without resorting to attribute tricks, and it works amazingly.

Since the prefixes are now tracked, it would be logical to allow to define prefixes via tags, it's 5 additional lines of code in the writeStart function (L845):

	if prefix == xmlnsPrefix {
		if prefix, prefixCreated := p.createPrefix(attr.Name.Space, local); prefixCreated {
			p.writePrefixAttr(prefix, attr.Name.Space)
		}
		continue
	}

This would allow to define certain namespace at the top-level object and reuse them everywhere:

type DidlLite struct {
	XMLName xml.Name `xml:"urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/ DIDL-Lite"`
	UPNP string `xml:"urn:schemas-upnp-org:metadata-1-0/upnp/ xmlns:upnp,attr"`
	DC string `xml:"http://purl.org/dc/elements/1.1/ xmlns:dc,attr"`
	Items []Item `xml:"urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/ item"`
}

This would allow to make unmarshaled / marshaled xml to be exactly the same!

But I do understand it would be a bit of magic (using the prefix xmlns) and therefore maybe undesirable.

Edit: Seems I just need to read better, because it works like this already, and that makes more sense anyways:

type DidlLite struct {
	XMLName xml.Name `xml:"urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/ DIDL-Lite"`
	UPNP string `xml:"http://www.w3.org/2000/xmlns/ xmlns:upnp,attr"`
	DC string `xml:"http://www.w3.org/2000/xmlns/ xmlns:dc,attr"`
	Items []Item `xml:"urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/ item"`
}

@ydnar
Copy link
Author

ydnar commented Oct 8, 2023

@thoro happy to take a look at your proposal if you can share more context. Perhaps file an issue at @nbio/xml (the fork of encoding/xml I maintain)?

@thoro
Copy link

thoro commented Oct 8, 2023

@ydnar I guess you didn‘t see my edit, but I will suggest it there.

it would just be syntactic sugar, so you won‘t have to initialize the fields of the struct with the correct namespace.

@gopherbot
Copy link
Contributor

This PR (HEAD: 47c2972) has been imported to Gerrit for code review.

Please visit Gerrit at https://go-review.googlesource.com/c/go/+/355353.

Important tips:

  • Don't comment on this PR. All discussion takes place in Gerrit.
  • You need a Gmail or other Google account to log in to Gerrit.
  • To change your code in response to feedback:
    • Push a new commit to the branch used by your GitHub PR.
    • A new "patch set" will then appear in Gerrit.
    • Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
    • Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
    • Multiple commits in the PR will be squashed by GerritBot.
  • The title and description of the GitHub PR are used to construct the final commit message.
    • Edit these as needed via the GitHub web interface (not via Gerrit or git).
    • You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
  • See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.

Copy link
Contributor

@DemiMarie DemiMarie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you also check for #68294, #68295, #68296, and #68297? These all include test strings.

Comment on lines +1190 to +1206
n := strings.Count(s, ":")
if n == 0 { // No colons, no namespace. OK.
name.Local = s
} else if n > 1 { // More than one colon, not OK.
name.Local = s
return name, false
} else if space, local, ok := strings.Cut(s, ":"); !ok || space == "" || local == "" {
} else if i := strings.Index(s, ":"); i < 1 || i > len(s)-2 { // Leading or trailing colon, not OK.
name.Local = s
return name, false
} else {
name.Space = space
name.Local = local
name.Space = s[0:i]
name.Local = s[i+1:]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
n := strings.Count(s, ":")
if n == 0 { // No colons, no namespace. OK.
name.Local = s
} else if n > 1 { // More than one colon, not OK.
name.Local = s
return name, false
} else if space, local, ok := strings.Cut(s, ":"); !ok || space == "" || local == "" {
} else if i := strings.Index(s, ":"); i < 1 || i > len(s)-2 { // Leading or trailing colon, not OK.
name.Local = s
return name, false
} else {
name.Space = space
name.Local = local
name.Space = s[0:i]
name.Local = s[i+1:]
if i := strings.IndexByte(s, 58); i == -1 { // No colons, no namespace. OK.
name.Local = s
} else if i < 1 || i > len(s)-2 { // Leading or trailing colon, not OK.
name.Local = s
return name, false
} else {
local := s[i+1:]
if strings.IndexByte(local, 58) != -1 { // Second colon, not OK.
name.Local = s
return name, false
} else {
name.Space = s[0:i]
name.Local = local
}

Micro-optimization.

@gopherbot
Copy link
Contributor

This PR (HEAD: 5460d25) has been imported to Gerrit for code review.

Please visit Gerrit at https://go-review.googlesource.com/c/go/+/355353.

Important tips:

  • Don't comment on this PR. All discussion takes place in Gerrit.
  • You need a Gmail or other Google account to log in to Gerrit.
  • To change your code in response to feedback:
    • Push a new commit to the branch used by your GitHub PR.
    • A new "patch set" will then appear in Gerrit.
    • Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
    • Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
    • Multiple commits in the PR will be squashed by GerritBot.
  • The title and description of the GitHub PR are used to construct the final commit message.
    • Edit these as needed via the GitHub web interface (not via Gerrit or git).
    • You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
  • See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.

@gopherbot
Copy link
Contributor

This PR (HEAD: 8623889) has been imported to Gerrit for code review.

Please visit Gerrit at https://go-review.googlesource.com/c/go/+/355353.

Important tips:

  • Don't comment on this PR. All discussion takes place in Gerrit.
  • You need a Gmail or other Google account to log in to Gerrit.
  • To change your code in response to feedback:
    • Push a new commit to the branch used by your GitHub PR.
    • A new "patch set" will then appear in Gerrit.
    • Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
    • Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
    • Multiple commits in the PR will be squashed by GerritBot.
  • The title and description of the GitHub PR are used to construct the final commit message.
    • Edit these as needed via the GitHub web interface (not via Gerrit or git).
    • You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
  • See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.

@gopherbot
Copy link
Contributor

Message from Ian Lance Taylor:

Patch Set 29: Commit-Queue+1


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 29:

Dry run: CV is trying the patch.

Bot data: {"action":"start","triggered_at":"2024-10-28T18:45:12Z","revision":"1b5153921cc73f9d5b8b48988e7b8cef2a8d61d6"}


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Ian Lance Taylor:

Patch Set 29: -Commit-Queue


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 29:

This CL has failed the run. Reason:

Tryjob golang/try/x_tools-gotip-linux-amd64 has failed with summary (view all results):


Build or test failure, click here for results.

To reproduce, try gomote repro 8732809263035864545.

Additional links for debugging:


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 29: LUCI-TryBot-Result-1


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

ydnar and others added 5 commits October 28, 2024 14:56
All issues of golang#13400 which are not new functionalities have fixes.
There are minor incompatibilities between them due to the handling of
prefixes. Duplicating a prefix or an namespace is invalid XML.
This is now avoided. XML produced is always valid.

Tests have been added for each fix and example and previous tests
fixed as output is already more compact is some cases.

encoding/xml: fix duplication of namespace tags by encoder (golang#7535)

A tag prefix identifies the name space of the tag and not the default name
space like xmlns="...". Writing the prefix is incorrect when it is bound
to a name space using the standard xmlns:prefix="..." attribute.
This fix skips this print.

Consequences are
 - duplication is avoided in line with name space standard in reference.
 - the _xmlns declaration does not appear anymore. To keep the previous
behaviour, the prefix is printed in all other cases. Token now always
produces well-formed XML except when the explicit name space collides
with the prefix.

Made prefix handling "xmlns="space" xmlns:_xmlns="xmlns" _..." has
be removed in all wants of tests. In some cases, useless declarations like
xmlns:x="x" are still added in line with previous behavior.

encoding/xml: fix unexpected behavior of encoder.Indent("", "") (golang#13185, golang#11431)

MarshalIndent and Marshal share code. When prefix and indent are
empty, the behavior is like Marshal when it should have a minimal
indent, i.e. new line as in documentation.

A boolean is added to the local printer struct which defaults to false.
It is set to true only when MarshalIndent is used and prefix and indent
are empty.

encoding/xml: fix overriding by empty namespace (golang#7113)

The namespace defined by xmlns="value" can be overridden in
every included tag including by the empty namespace xmlns="".
Empty namespace is not authorized with a prefix (xmlns:ns="").

Method to calculate indent of XML handles depth of tag and
its associated namespace. The fix leaves the method active even
when no indent is required.

An XMLName field in a struct means that namespace must be
enforced even if empty. This occurs only on an inner tag as an
overwrite of any non-empty namespace of its outer tag.

To obtain the xmlns="" required, an attribute is added.

encoding/xml: fix panic on embedded unexported XMLName (golang#10538)

By laws of reflection, unexported fields are unreachable by .Value.
XMLName are allowed at any level of an inner struct but the struct
may not be reachable. If XMLName field is found without name,
it cannot be exported and the value is discarded like other fields.
Some comments have been to underline where the issue arises.
Another XMLName test was incorrectly set up and is fixed.
Various cases added in a specific test.

encoding/xml: fix panic of unmarshaling of anonymous structs (golang#16497)

Encoding/xml is using type "typinfo" which provides its own "value" method
using the reflection package. It fails to check for the documented possible
panics of the reflection methods. It is impossible to fix the method as the
parameter is also using reflection.

The fix is to discard anonymous structs which have no value anyway.
Encoder/xml documentation already mentions that fields are accessed using
the reflection package. A relevant test is added.

encoding/xml: fix closing tag failure (golang#20685)

Push/pop of elements name must be done using the eventual prefix together
with the tag name. The operation is moved before the translation of prefix
into its URI. One end element of a test is fixed as expecting the last used
namespace is incorrect.
After closing a tag using a namespace, the valid namespace must be taken
from the opening tag.

encoding/xml: add check of namespaces to detect field names conflicts (golang#8535, golang#11724)

Comparing namespaces of fields was missing and false conflicts were detected.

encoding/xml: fix invalid empty namespace without prefix (golang#8068)

Empty namespace is allowed only if it has no prefix. An error message is now
returned if a prefix exists. A similar case when no prefix is provided,
thus with syntax xmlns:="..." is also rejected.

encoding/xml: fix normalization of attributes values (golang#20614)

The attribute value was read as text. The existing attribute reader logic is
fixed as an attribute may have a namespace or only a prefix. Other
possibilities have been removed.

To keep the behavior of raw token which allows many faults in attributes list,
error handling is heavily using the Strict parameter of the decoder.
Specific tests have been added including list of attributes.

To keep the behavior or unmarshal, escaped characters handling has been
added but it is not symmetrical to Marshal for quotes but follows XML
specification.

encoding/xml: fix absence of detection of another : in qualified names (golang#20396)

The occurrence of second : in space and tag name is rejected.

Fixes: golang#7113, golang#7535, golang#8068, golang#8535, golang#10538, golang#11431, golang#13185, golang#16497, golang#20396, golang#20614, golang#20685

Change-Id: Ib4a60347a47d23ff59b63307cebb83b71c7c9165

Commit originally authored by: Constantin Konstantinidis <[email protected]>

encoding/xml: fix printing of namespace prefix in tag names

Prefix displays in XML when defined in tag names.
The URL of the name space is also returned for an End Token as
documentation requires to improve idempotency of Marshal/Unmarshal.

Translating the prefix is popping the NS which was unavailable
for translation. Translate for an End Token has been moved inside
the pop element part.

Fixes golang#9519

Change-Id: Id7a14d07c106a76a487b5c4e28d9d563fe061c60

encoding/xml: restore changes lost in merge

See golang#43168.

encoding/xml: gofmt

encoding/xml: minimalIndent -> minIndent to shrink diff

encoding/xml: gofmt

encoding/xml: revert changes to (*Decoder).attrval from master

encoding/xml: restore (*printer).writeIndent to master

encoding/xml: remove comments that don’t change functionality of tests

encoding/xml: use t.Errorf instead of fmt.Errorf

encoding/xml: fix test case for golang#11496

encoding/xml: revert unrelated change in (*Decoder).readName

encoding/xml: remove comments

encoding/xml: remove comments

encoding/xml: edit comments

encoding/xml: add test to ensure '@' is legal in an attr val

encoding/xml: improve test error quoting for readability

encoding/xml: (*Decoder).nsname now strictly parses namespaces

encoding/xml: disable 3/4 tests for golang#43168

I’m not sure what the right fix is for this. Malformed namespaces
(leading or trailing colons, more than 1 colon) should result in an error.

encoding/xml: allow callers to specify namespace prefix

A preferred namespace prefix can now be specified by setting the local
name space to "prefix:space".

In a struct tag:

struct Message {
    XMLName struct{} `xml:"urn:ietf:foo foo:message"`
    Value string `xml:"urn:ietf:foo foo:value"`
}

encoding/xml: use tag prefixes if previously defined

Instead of having redundant xmlns= attrs.

encoding/xml: rename (*printer).popPrefix to popPrefixes

since it pops 1 or more prefixes

encoding/xml: rename printer struct fields

To reflect the fact that prefixes apply to tag names, not just attrs.

encoding/xml: url->uri; reorg comments

encoding/xml: fix comments

encoding/xml: elide redundant xmlns= attrs for default namespace (no prefix)

encoding/xml: move namespace prefix tests

encoding/xml: handle namespace prefixes at the root element

encoding/xml: handle namespaced prefixed attr names

encoding/xml: split out prefixes during reflection, not parsing

encoding/xml: rename tagName to xmlTag; document it

encoding/xml: separate printing xmlns:prefix= attr from creating prefix

encoding/xml: clean up comments

encoding/xml: don’t use xmlTag in attr serialization

encoding/xml: test nested prefixed namespaced tags

encoding/xml: add test for prefixed XML tags without xmlns

encoding/xml: update doc comments

encoding/xml: update comments

encoding/xml: preemptively declare and record all namespace prefixes

Immediate children of XML tags that share a namespace and prefix will now
have that namespace prefix declared on the parent tag.

Explicitly declared namespace prefixes via xmlns:prefix="uri" attributes
will now have their values recorded and reused for marshalling child tags.

This commit reorganizes how namespace prefixes are recorded and detected,
along with how start tags are recorded via a slice of element structs.

Some tests are changed, because the value of the marshalled XML has changed,
but the resulting XML should be equivalent (and correct).

encoding/xml: restore tests with xmlns:prefix= attrs to existing behavior

Use xmlnsURL instead of xmlnsPrefix to specify namespace prefixes.

encoding/xml: prefer unprefixed names where possible

encoding/xml: add EncodeToken tests for explicit prefixes

encoding/xml: add test cases for prefixed local names

encoding/xml: revert change in 8f143f5277; do not preemptively declare XML namespaces

encoding/xml: fix typo in test
… Name

Addresses CL feedback from Adam Shannon.
@gopherbot
Copy link
Contributor

This PR (HEAD: a79fce7) has been imported to Gerrit for code review.

Please visit Gerrit at https://go-review.googlesource.com/c/go/+/355353.

Important tips:

  • Don't comment on this PR. All discussion takes place in Gerrit.
  • You need a Gmail or other Google account to log in to Gerrit.
  • To change your code in response to feedback:
    • Push a new commit to the branch used by your GitHub PR.
    • A new "patch set" will then appear in Gerrit.
    • Respond to each comment by marking as Done in Gerrit if implemented as suggested. You can alternatively write a reply.
    • Critical: you must click the blue Reply button near the top to publish your Gerrit responses.
    • Multiple commits in the PR will be squashed by GerritBot.
  • The title and description of the GitHub PR are used to construct the final commit message.
    • Edit these as needed via the GitHub web interface (not via Gerrit or git).
    • You should word wrap the PR description at ~76 characters unless you need longer lines (e.g., for tables or URLs).
  • See the Sending a change via GitHub and Reviews sections of the Contribution Guide as well as the FAQ for details.

@gopherbot
Copy link
Contributor

Message from Ian Lance Taylor:

Patch Set 30: Commit-Queue+1


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 30:

Dry run: CV is trying the patch.

Bot data: {"action":"start","triggered_at":"2024-10-29T00:14:12Z","revision":"685244577409d3ba43eabaa835fabee8fd694f15"}


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Ian Lance Taylor:

Patch Set 30: -Commit-Queue


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 30:

This CL has passed the run


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

@gopherbot
Copy link
Contributor

Message from Go LUCI:

Patch Set 30: LUCI-TryBot-Result+1


Please don’t reply on this GitHub thread. Visit golang.org/cl/355353.
After addressing review feedback, remember to publish your drafts!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cla: yes Used by googlebot to label PRs as having a valid CLA. The text of this label should not change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants