Properly parse unquoted attributes (please🙏?) #40

Open
piccojs opened this issue Mar 21, 2022 · 3 comments

Comments

@piccojs

piccojs commented Mar 21, 2022

Whenever I parse this string:

<p type=bold>hello mom!</p>

I get unexpected results:

{
  tagName:"p",
  attributes:{
    type:null,
    bold:null
  },
  children:["hello mom!"]
}

I know this is an XML parser, but wouldn't it make sense to generate:

{
  tagName:"p",
  attributes:{
    type:"bold"
  },
  children:["hello mom!"]
}

Please consider adding such type parsing in the next bug fix, I love your library! 🥰🥰

@TobiasNickel
Owner

TobiasNickel commented Mar 22, 2022

Thanks, this really is a bug; I can confirm it with a quick test.
It needs to be fixed in the parseNode method, between lines 200 and 220:

tXml/tXml.js

Lines 199 to 224 in 0dddcb2

    if ((c > 64 && c < 91) || (c > 96 && c < 123)) {
        //if('abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'.indexOf(S[pos])!==-1 ){
        var name = parseName();
        // search beginning of the string
        var code = S.charCodeAt(pos);
        while (code && code !== singleQuoteCC && code !== doubleQuoteCC && !((code > 64 && code < 91) || (code > 96 && code < 123)) && code !== closeBracketCC) {
            pos++;
            code = S.charCodeAt(pos);
        }
        if (code === singleQuoteCC || code === doubleQuoteCC) {
            var value = parseString();
            if (pos === -1) {
                return {
                    tagName,
                    attributes,
                    children,
                };
            }
        } else {
            value = null;
            pos--;
        }
        attributes[name] = value;
    }
    pos++;
}

To be honest, even I find the code at line 204 a bit difficult to read, but with some time we should be able to figure this out.
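
For anyone trying to follow the excerpt, here is a small standalone sketch of why the unquoted value ends up as a second attribute. It is not the tXml source: parseAttributesLikeExcerpt is a hypothetical helper that copies only the skip rule of the loop quoted above (advance until a quote, a letter, or '>'), with simplified name and string reading. Because the skip loop stops at the first letter after '=', the name is stored with null and the value is then read as a new attribute name.

```javascript
// Standalone sketch, NOT the tXml source. parseAttributesLikeExcerpt is a
// hypothetical helper that mimics only the attribute loop quoted above.
function parseAttributesLikeExcerpt(S, pos) {
  const isLetter = (c) => (c > 64 && c < 91) || (c > 96 && c < 123);
  const attributes = {};
  while (pos < S.length && S.charCodeAt(pos) !== 62 /* '>' */) {
    if (isLetter(S.charCodeAt(pos))) {
      // Read the attribute name (letters only, to keep the sketch short).
      const start = pos;
      while (isLetter(S.charCodeAt(pos))) pos++;
      const name = S.slice(start, pos);
      // Same skip rule as the excerpt: advance until a quote, a letter or '>'.
      let code = S.charCodeAt(pos);
      while (code && code !== 39 && code !== 34 && !isLetter(code) && code !== 62) {
        pos++;
        code = S.charCodeAt(pos);
      }
      let value = null;
      if (code === 39 || code === 34) {
        // Quoted value: read up to the matching quote.
        const end = S.indexOf(S[pos], pos + 1);
        if (end === -1) return attributes; // unterminated string, give up
        value = S.slice(pos + 1, end);
        pos = end;
      } else {
        pos--; // step back so the pos++ below lands on the stop character
      }
      attributes[name] = value;
    }
    pos++;
  }
  return attributes;
}

// Start at index 2, right after "<p".
console.log(parseAttributesLikeExcerpt('<p type=bold>hello mom!</p>', 2));
// { type: null, bold: null }
console.log(parseAttributesLikeExcerpt('<p type="bold">hello mom!</p>', 2));
// { type: 'bold' }
```

In the unquoted case the '=' is skipped, the loop stops on the "b" of bold, no quote is found, and the next pass of the outer loop treats bold as an attribute name.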

@tuananh

tuananh commented Oct 13, 2022

But that one is invalid XML, right? There are supposed to be quotes (single or double) around bold?

<p type="bold">hello mom!</p>

@TobiasNickel
Owner

Right, but many HTML pages still omit the quotes. In that case the value cannot contain any spaces.
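
One possible shape for the fix, as a rough sketch only: since an unquoted value cannot contain a space, it can be read as everything up to the next whitespace or '>'. readAttributeValue is a hypothetical helper and not part of tXml; it assumes it is called with pos pointing just past the attribute name, and it returns the new position instead of mutating a shared pos as the library does.

```javascript
// Hypothetical helper, not part of tXml: reads one attribute value starting
// right after the attribute name and returns the value plus the new position.
function readAttributeValue(S, pos) {
  const isSpace = (ch) =>
    ch === ' ' || ch === '\t' || ch === '\n' || ch === '\r' || ch === '\f';
  while (isSpace(S[pos])) pos++;
  // No '=' means a bare attribute such as <input disabled>.
  if (S[pos] !== '=') return { value: null, pos };
  pos++; // skip '='
  while (isSpace(S[pos])) pos++;
  const quote = S[pos];
  if (quote === '"' || quote === "'") {
    // Quoted value: everything up to the matching quote.
    const end = S.indexOf(quote, pos + 1);
    if (end === -1) return { value: S.slice(pos + 1), pos: S.length };
    return { value: S.slice(pos + 1, end), pos: end + 1 };
  }
  // Unquoted value: runs until whitespace or '>'.
  const start = pos;
  while (pos < S.length && !isSpace(S[pos]) && S[pos] !== '>') pos++;
  return { value: S.slice(start, pos), pos };
}

// Index 7 is the character right after the name "type".
console.log(readAttributeValue('<p type=bold>hello mom!</p>', 7));
// { value: 'bold', pos: 12 }
console.log(readAttributeValue('<p type="bold">hello mom!</p>', 7));
// { value: 'bold', pos: 14 }
```

Note that this sketch treats a trailing "/" as part of an unquoted value, so a self-closing tag such as <a href=x/> would need extra handling.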


3 participants