Getting an element inside a frame panics immediately #541

Closed
heyuanyou opened this issue Feb 18, 2022 · 24 comments
Labels
help wanted (We wish someone can help us to work on it), question (Questions related to rod)

Comments

@heyuanyou

heyuanyou commented Feb 18, 2022

Rod Version: v0.102.1

// Please make sure your code is minimal and standalone
func main() {
	defer func() {
		if i := recover(); i != nil {
			fmt.Println("panic:", i)
		}
		fmt.Println("app will quit in 10s...")
		time.Sleep(time.Second * 10)
	}()
	browser := rod.New().ControlURL("debugURL").MustConnect().NoDefaultDevice()
	pages := browser.MustPages()
	page := pages[0]
	fmt.Println("got page ", page.TargetID)
	page.MustNavigate("https://coinlist.co/login")
	page.MustWaitLoad()

	e := page.MustElement("div:not([style*='display:']) > iframe[data-hcaptcha-widget-id]")
	fmt.Println("element got")
	frame := e.MustFrame()
	fmt.Println("frame got ", frame.IsIframe())
	c, err := frame.Element("#checkbox")
	if err != nil {
		fmt.Println("query checkbox:", err)
		return
	}
	fmt.Println("element got")
	c.MustClick()
	select {}
}

Result

The statement c, err := frame.Element("#checkbox") panics immediately with a nil pointer dereference.
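
For anyone reproducing a crash like this, the deferred recover used in the snippet above can be factored into a small helper that turns a panic into an ordinary error value. This is only a minimal sketch using the Go standard library; `safely` is a hypothetical name chosen for illustration and is not part of rod, and the nil pointer write merely simulates the reported failure.

```go
package main

import "fmt"

// safely runs fn and converts any panic it raises into an error.
// Hypothetical helper for illustration; not a rod API.
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered panic: %v", r)
		}
	}()
	fn()
	return nil
}

func main() {
	// Simulate the failure mode from the report: a nil pointer dereference.
	err := safely(func() {
		var p *int
		*p = 1
	})
	fmt.Println("error:", err)
}
```

Wrapping only the suspect call this way keeps the rest of the program running and makes it easier to log which step failed.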

heyuanyou added the question label Feb 18, 2022
@ysmood
Member

ysmood commented Feb 19, 2022

The error is at line 5: page := pages[0]

ysmood closed this as completed Feb 19, 2022
@heyuanyou
Author

> The error is at line 5: page := pages[0]

image

@ysmood
Member

ysmood commented Feb 19, 2022

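Are you sure the code you ran matches the code above? Reading your code, the browser never creates a new page; with no pages at all, of course it fails.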
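
The failure described here is Go's ordinary behavior when indexing an empty slice: `pages[0]` panics if the list has no entries. A minimal sketch of a guard, assuming only the standard library; `firstOrErr` is a hypothetical helper, and the string slice is a stand-in for the real page list.

```go
package main

import (
	"errors"
	"fmt"
)

// firstOrErr returns the first element of items, or an error when the slice is empty.
// Hypothetical helper for illustration; not a rod API.
func firstOrErr[T any](items []T) (T, error) {
	var zero T
	if len(items) == 0 {
		return zero, errors.New("no items available")
	}
	return items[0], nil
}

func main() {
	var pages []string // stand-in for an empty page list
	if _, err := firstOrErr(pages); err != nil {
		fmt.Println("cannot pick a page:", err)
	}
}
```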

@heyuanyou
Author

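> Are you sure the code you ran matches the code above?

The only difference in what I ran is the extra defer that recovers from the panic. I'll update the post with the version I actually ran; please try again.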

@heyuanyou
Author

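> Are you sure the code you ran matches the code above? Reading your code, the browser never creates a new page; with no pages at all, of course it fails.

I've corrected the code; please take a look.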

@ysmood
Member

ysmood commented Feb 19, 2022

Please get it running successfully yourself before asking; I can't run your code.

image

@heyuanyou
Author

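> Please get it running successfully yourself before asking; I can't run your code.

Manually start a browser in debug mode first, then put its debug URL into this line of the code: browser := rod.New().ControlURL("debugURL").MustConnect().NoDefaultDevice()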

@ysmood
Member

ysmood commented Feb 19, 2022

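I still can't reproduce the problem; nothing is printed. I don't see why it needs to be written in such a complicated way, so I simplified it: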

package main

import (
	"fmt"

	"github.com/go-rod/rod"
)

func main() {
	page := rod.New().MustConnect().NoDefaultDevice().MustPage("https://coinlist.co/login")
	page.MustWaitLoad()

	e := page.MustElement("div:not([style*='display:']) > iframe[data-hcaptcha-widget-id]")
	fmt.Println("element got")
	frame := e.MustFrame()
	fmt.Println("frame got ", frame.IsIframe())
	c, err := frame.Element("#checkbox")
	if err != nil {
		fmt.Println("query checkbox:", err)
		return
	}
	fmt.Println("element got")
	c.MustClick()
	select {}
}

@heyuanyou
Author

> I still can't reproduce the problem; nothing is printed.

Hold on, I'll provide code that can be run directly.

@heyuanyou
Author

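> I still can't reproduce the problem; nothing is printed. I don't see why it needs to be written in such a complicated way, so I simplified it.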

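The page probably went straight to the login screen. This test needs to land on the hCaptcha verification screen (with a local browser the hCaptcha challenge most likely won't appear); otherwise the program blocks and prints nothing.
I've been reproducing this on a Windows cloud server.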

@heyuanyou
Author

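> I still can't reproduce the problem; nothing is printed. I don't see why it needs to be written in such a complicated way, so I simplified it.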

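The environment for this test case is a bit unusual: it needs a Windows cloud server (where the verification screen always appears). Or should I give you access to my current test environment? The problem is that after getting the frame, looking up an element inside it hits a nil pointer.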

@heyuanyou
Author

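> I still can't reproduce the problem; nothing is printed. I don't see why it needs to be written in such a complicated way, so I simplified it.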

With your simplified code, the result is still the same:
image

@heyuanyou
Author

package main

import (
	"fmt"

	"github.com/go-rod/rod"
)

func main() {
	page := rod.New().MustConnect().NoDefaultDevice().MustPage("https://coinlist.co/login")
	page.MustWaitLoad()

	e := page.MustElement("div:not([style*='display:']) > iframe[data-hcaptcha-widget-id]")
	fmt.Println("element got")
	frame := e.MustFrame()
	fmt.Println("frame got ", frame.IsIframe())
	c, err := frame.Element("#checkbox")
	if err != nil {
		fmt.Println("query checkbox:", err)
		return
	}
	fmt.Println("element got")
	c.MustClick()
	select {}
}

It can be reproduced now: replace the navigation target URL with https://captcha.website

@heyuanyou
Author

Is it because the frame is cross-origin?

@ysmood
Member

ysmood commented Feb 19, 2022

The following now reproduces the problem; the cause is still under investigation:

package main

import (
	"github.com/go-rod/rod"
)

func main() {
	page := rod.New().MustConnect().NoDefaultDevice().MustPage("https://captcha.website")
	f := page.MustElement("div:not([style*='display:']) > iframe[data-hcaptcha-widget-id]").MustFrame()
	f.MustElement("#checkbox")
}

ysmood reopened this Feb 19, 2022
@heyuanyou
Author

Any progress, boss?

@ysmood
Member

ysmood commented Feb 22, 2022

At the moment it looks like a Chromium bug. Do other sites with iframes have a similar problem?

@heyuanyou
Author

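> At the moment it looks like a Chromium bug. Do other sites with iframes have a similar problem?

I've tested one other page so far, and it has the same problem.
I saw people in some repo saying that these captcha pages block Selenium (and by the same logic CDP-based tools would no longer work either), but I'm not sure whether that's true.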

@ysmood
Member

ysmood commented Feb 22, 2022

The main issue is that this call with Pierce: true fails to retrieve the iframe's content, which looks like a bug:

node, err := proto.DOMDescribeNode{BackendNodeID: owner.BackendNodeID, Pierce: true}.Call(p)

@heyuanyou
Author

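> The main issue is that this call with Pierce: true fails to retrieve the iframe's content, which looks like a bug:
>
> node, err := proto.DOMDescribeNode{BackendNodeID: owner.BackendNodeID, Pierce: true}.Call(p)

I'm not very familiar with HTML. The content inside the iframe is generated dynamically by JS; could that be a factor?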

@ysmood
Member

ysmood commented Feb 22, 2022

Even so, it shouldn't return an empty iframe.

@ysmood
Member

ysmood commented Mar 3, 2022

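For now, you can create a page for the iframe in order to operate on this kind of iframe. I haven't run into a similar problem on other iframe pages, so this page seems to be a special case: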

package main

import (
	"github.com/go-rod/rod"
	"github.com/go-rod/rod/lib/defaults"
	"github.com/go-rod/rod/lib/proto"
	"github.com/go-rod/rod/lib/utils"
)

func main() {
	defaults.Trace = true
	defaults.Show = true

	page := rod.New().MustConnect().NoDefaultDevice().MustPage("https://captcha.website")
	f := page.MustElement("div:not([style*='display:']) > iframe[data-hcaptcha-widget-id]").MustFrame()
	p := page.Browser().MustPageFromTargetID(proto.TargetTargetID(f.FrameID))
	p.MustElement("#checkbox").MustClick()
	utils.Pause()
}

ysmood added the help wanted label Mar 3, 2022
@heyuanyou
Author

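> For now, you can create a page for the iframe in order to operate on this kind of iframe. I haven't run into a similar problem on other iframe pages, so this page seems to be a special case.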

OK, thank you very much!

@ysmood
Member

ysmood commented Mar 3, 2022

move this issue to #548

ysmood closed this as completed Mar 3, 2022