Same old problem: some websites cannot be scraped #922

Closed
herozzm opened this issue Aug 22, 2023 · 7 comments
Labels
question Questions related to rod

Comments

herozzm commented Aug 22, 2023

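Rod Version: latest

In an earlier question, #819, I mentioned that sites like this one cannot be scraped. Using stealth worked for a short while, but then it stopped working again. That issue was resolved temporarily, but the root cause was never found. The site is: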
https://www.ccgp-gansu.gov.cn

herozzm added the question label Aug 22, 2023
@github-actions

Please add a valid Rod Version: v0.0.0 to your issue. Current version is v0.114.3

generated by check-issue

Tony-Stark-marvel commented Aug 23, 2023

You can try it without stealth and set browser.DefaultDevice instead:

package main

import (
	"time"

	"github.com/go-rod/rod"
	"github.com/go-rod/rod/lib/devices"
	"github.com/go-rod/rod/lib/launcher"
)

// desktopUA is shared by the launcher flag and the emulated device so both report the same browser.
const desktopUA = "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/114.0.0.0 Safari/537.36"

// MyDevice replaces the device that rod emulates by default.
var MyDevice = devices.Device{
	Title:          "Chrome computer",
	Capabilities:   []string{"touch", "mobile"},
	UserAgent:      desktopUA,
	AcceptLanguage: "zh",
	Screen: devices.Screen{
		DevicePixelRatio: 2,
		Horizontal:       devices.ScreenSize{Width: 1500, Height: 900},
		Vertical:         devices.ScreenSize{Width: 1500, Height: 900},
	},
}

func main() {
	// Look for a locally installed browser.
	path, _ := launcher.LookPath()
	// Launch with a visible window and with the AutomationControlled Blink feature disabled.
	u := launcher.New().Bin(path).
		Set("disable-blink-features", "AutomationControlled").
		Set("user-agent", desktopUA).
		Set("no-first-run").
		Set("disable-default-apps").
		Headless(false).
		MustLaunch()
	browser := rod.New().ControlURL(u).MustConnect()
	browser.DefaultDevice(MyDevice)
	page := browser.MustPage("https://www.ccgp-gansu.gov.cn/")
	page.MustWaitLoad()
	// Keep the browser open for manual inspection.
	time.Sleep(time.Hour * 10)
}

Member

ysmood commented Aug 23, 2023

@Tony-Stark-marvel I tried it, and it does work 👍🏼

Author

herozzm commented Aug 24, 2023

> @Tony-Stark-marvel I tried it, and it does work 👍🏼

If I'm not debugging locally, how should this be written to connect over ws instead?

Member

ysmood commented Aug 24, 2023

What do you mean?

Author

herozzm commented Aug 25, 2023

> You can try it without stealth and set browser.DefaultDevice instead

Thanks, that solved it.

> What do you mean?

Got it working. I meant connecting to a browser deployed in Docker.

Member

ysmood commented Aug 25, 2023

l := launcher.MustNewManaged("")
