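
    await page.setRequestInterception(true);
    page.on('request', req => {
      // Abort requests for resources the crawler does not need.
      if (['image', 'stylesheet', 'font'].includes(req.resourceType())) {
        return req.abort();
      }
      // Let every other request through unchanged.
      return req.continue();
    });
    // other stuff
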
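With this in place, the page no longer loads images, stylesheets, or fonts when it issues requests, which can noticeably improve the crawler's performance and speed.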
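
As a minimal sketch of the same idea, the blocking decision can be pulled out into a plain function so it can be checked without launching a browser. The names `BLOCKED_TYPES`, `shouldBlock`, `handleRequest`, and `fakeRequest` are illustrative assumptions, not part of Puppeteer; the sketch relies only on a request object exposing `resourceType()`, `abort()`, and `continue()`, as in the snippet above.

```javascript
// Resource types to skip; mirrors the list used in the snippet above.
const BLOCKED_TYPES = new Set(['image', 'stylesheet', 'font']);

// Pure decision function: true when the resource type should be blocked.
function shouldBlock(resourceType) {
  return BLOCKED_TYPES.has(resourceType);
}

// Handler for any object exposing resourceType(), abort() and continue().
function handleRequest(req) {
  return shouldBlock(req.resourceType()) ? req.abort() : req.continue();
}

// Hypothetical stand-in for a request object, used only for a dry run.
// It records which method the handler called.
function fakeRequest(type) {
  const log = [];
  return {
    log,
    resourceType: () => type,
    abort: () => { log.push('abort'); },
    continue: () => { log.push('continue'); },
  };
}
```

With a real page, the handler would presumably be registered as `page.on('request', handleRequest)` after `await page.setRequestInterception(true)`. Keeping the type list in one place makes it easy to add further types later, for example `media`, if the crawl does not need them.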
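When crawling a website, we usually care about how fast pages load, and most of the load time is spent on static files such as CSS, fonts, and images. To improve crawler performance, we need to stop the browser from loading these unnecessary files, which can be done by intercepting requests.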
Optimizing static file loading