Xpath position function not working properly #40

Open
carlows opened this issue Mar 2, 2021 · 3 comments
carlows commented Mar 2, 2021

Hi there,

First of all, thank you for the packages, they're very useful 🚀

I've been having issues with the position function, and I'm not sure whether the problem is in the htmlquery package or the xpath package. Here's an example:

const htmlSample = `<!DOCTYPE html><html lang="en-US">
<head>
<title>Hello,World!</title>
</head>
<body>
<div class="test">
	<a href="/test1">Test 1</a>
</div>
<div class="test">
	<a href="/test2">Test 1</a>
</div>
<div class="test">
	<a href="/test3">Test 1</a>
</div>
</body>
</html>
`

func TestXPath(t *testing.T) {
	list := Find(testDoc, "//div[@class=\"test\" and position()=1]//a/@href")
	for _, n := range list {
		fmt.Println(InnerText(n))
	}
}

I would expect this to select only the div elements that have the class test and are at position 1, so I should get only the href of the first <a /> element. Instead, I get all the nodes. If I try position()=2, I get nothing back.

If I instead use this xpath, it gives me the correct element:

//div[@class=\"test\"][2]//a/@href

If I try this in the browser it works, so I'm not sure whether this behavior is expected here 🤔.

What could be the problem?
Thank you again!
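
For reference, the two expressions in this comment count position() differently under XPath 1.0: inside a single predicate, position() is the node's index in the node set that the step produced, while in a chained predicate such as [@class="test"][2] it is the index within the already filtered set. In the sample all three divs are siblings and all carry the class, so both forms should agree on which div they pick. The following is a minimal sketch of that rule on a toy model. The `div` type and the `filter` and `hrefs` helpers are hypothetical and are not part of htmlquery or xpath.

```go
package main

import "fmt"

// div is a toy stand-in for an element node: its class and the href of its link.
// It is NOT a type from htmlquery or xpath.
type div struct {
	class string
	href  string
}

// filter applies one XPath-style predicate step. pred receives the node and its
// 1-based position within the node set being filtered, as position() would.
func filter(nodes []div, pred func(d div, pos int) bool) []div {
	var out []div
	for i, d := range nodes {
		if pred(d, i+1) {
			out = append(out, d)
		}
	}
	return out
}

// hrefs collects the href of every node in order.
func hrefs(nodes []div) []string {
	out := make([]string, 0, len(nodes))
	for _, d := range nodes {
		out = append(out, d.href)
	}
	return out
}

func main() {
	// The three sibling divs from the sample document.
	divs := []div{{"test", "/test1"}, {"test", "/test2"}, {"test", "/test3"}}

	// //div[@class="test" and position()=1]: one predicate, position counted over all divs.
	combined := filter(divs, func(d div, pos int) bool {
		return d.class == "test" && pos == 1
	})

	// //div[@class="test"][2]: two predicates, position counted over the class-filtered set.
	byClass := filter(divs, func(d div, _ int) bool { return d.class == "test" })
	chained := filter(byClass, func(_ div, pos int) bool { return pos == 2 })

	fmt.Println(hrefs(combined)) // [/test1]
	fmt.Println(hrefs(chained))  // [/test2]
}
```

Under this model position()=1 yields exactly one node and position()=2 yields the second one, which matches the browser behavior described above and not the "all nodes" / "nothing" results reported here.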

@zhengchun (Contributor)

Thanks for the report. It is a bug in how position() is handled inside a logical operation.

carlows (Author) commented Mar 2, 2021

Maybe I can help debug it and open a PR. Do you have an idea of where to look in the code?

@zhengchun (Contributor)

Sorry, this was not a position() bug after all. I found that changing the expression to list := htmlquery.Find(doc, "//div[position()=1 and @class=\"test\"]/a/@href") works. I guess build.processNode() has a bug.

If you are interested, you can start debugging at func (l *logicalQuery) Evaluate(t iterator) interface{} {...}.
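
Since swapping the operands changes the result, one property worth checking while debugging is that `and` is order-independent: both sides are evaluated against the same context node and position. The sketch below shows that check on a toy evaluator. The `ctx`, `expr`, `and`, and `selectPositions` names are hypothetical; this is not the xpath package's logicalQuery, whose internals are not shown in this thread.

```go
package main

import "fmt"

// ctx is a toy evaluation context: the class attribute of the current node
// and its 1-based position in the node set. It is NOT a type from xpath.
type ctx struct {
	class string
	pos   int
}

// expr is a boolean sub-expression evaluated against a context.
type expr func(c ctx) bool

// and combines two sub-expressions; both see the same, unmodified context.
func and(l, r expr) expr {
	return func(c ctx) bool { return l(c) && r(c) }
}

// selectPositions returns the positions of the nodes that satisfy e.
func selectPositions(classes []string, e expr) []int {
	var out []int
	for i, cl := range classes {
		if e(ctx{class: cl, pos: i + 1}) {
			out = append(out, i+1)
		}
	}
	return out
}

func main() {
	// Class attributes of the three divs from the sample document.
	classes := []string{"test", "test", "test"}
	isTest := expr(func(c ctx) bool { return c.class == "test" })

	for n := 1; n <= 3; n++ {
		posIs := expr(func(c ctx) bool { return c.pos == n })
		classFirst := selectPositions(classes, and(isTest, posIs)) // [@class="test" and position()=n]
		posFirst := selectPositions(classes, and(posIs, isTest))   // [position()=n and @class="test"]
		// Both operand orders should select the same single node.
		fmt.Println(n, classFirst, posFirst, fmt.Sprint(classFirst) == fmt.Sprint(posFirst))
	}
}
```

A regression test in the real package could assert the same thing by running both expressions through Find and comparing the results.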
