Cannot parse most sites in php #9

Closed
nerdunit opened this issue Jul 20, 2022 · 2 comments · Fixed by #10

Comments

@nerdunit

The HTML parser prints the following on most PHP-based websites:

    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /
    Unknown HTML symbol /

Example code:

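    // Assumes `html` is this crate's HTML module, already in scope.
    fn main() -> Result<(), Box<dyn std::error::Error>> {
        let url = "https://www.ladybirdeducation.co.uk/";
        let client = reqwest::blocking::Client::new();
        // Fetch the page and read the response body as text.
        let res = client.get(url).send()?;
        let body = res.text()?;
        println!("body:{}", body);
        // Parsing is the step that prints the "Unknown HTML symbol /" messages.
        let _document = html::parse(&body)?;
        println!("parsed");
        Ok(())
    }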
@James-LG
Owner

PHP is rendered into HTML server-side, so your error must come from something else. Regardless, I will investigate using the URL you gave.

James-LG added a commit that referenced this issue Jul 21, 2022
Script tags require special handling of triangle brackets to allow
them to be used as comparison operators in JS.

Closes #9
@James-LG
Owner

I believe your issue was actually related to a < symbol inside a <script> tag. The fix is released in v0.3.1; reopen the issue if your problems persist.
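For anyone curious why a < inside a <script> tag trips up a parser, here is a minimal sketch of the general idea. It is not the code from the actual fix: `split_script_body` is a hypothetical helper made up for illustration. It assumes the parser has already consumed the opening <script> tag, and it treats everything up to the next closing script tag as raw text, so a < used as a comparison operator is never read as the start of a tag.

```rust
/// Hypothetical helper, not this crate's actual implementation.
/// Given the text that follows an opening <script> tag, returns the raw
/// script body and the remainder, which starts at the closing tag.
fn split_script_body(after_open_tag: &str) -> Option<(&str, &str)> {
    // ASCII lowercasing keeps byte offsets unchanged, so an index found in
    // the lowercased copy is also valid in the original string.
    let lower = after_open_tag.to_ascii_lowercase();
    // Only the closing script tag ends the body; any other '<' or '>' is
    // ordinary script text (for example a comparison operator).
    let end = lower.find("</script")?;
    Some((&after_open_tag[..end], &after_open_tag[end..]))
}

fn main() {
    let input = "if (a < b && b > 0) { run(); }</SCRIPT><p>done</p>";
    match split_script_body(input) {
        Some((body, rest)) => {
            println!("script body: {}", body);
            println!("rest: {}", rest);
        }
        None => println!("no closing script tag found"),
    }
}
```

A real parser would also need to deal with attributes on the opening tag and with an unterminated script, both of which this sketch leaves out.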
