When crawling all comments of a specified Weibo post, the following error occurs.
Error message:
2025-01-10 13:07:32 httpx INFO (_client.py:1773) - HTTP Request: GET https://m.weibo.cn/comments/hotflow?id=5117251109785706&mid=5117251109785706&max_id_type=0&max_id=5118541873545341 "HTTP/1.1 200 OK"
2025-01-10 13:07:34 MediaCrawler ERROR (core.py:215) - [WeiboCrawler.get_note_comments] may be been blocked, err:'>' not supported between instances of 'NoneType' and 'int'
2025-01-10 13:07:34 MediaCrawler INFO (core.py:103) - [WeiboCrawler.start] Weibo Crawler finished ...
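After reading the source code, I found that in the `get_note_all_comments` function in media_platform/weibo.client.py there is a line where the `get` call can return None for max_id. When that happens, the next loop iteration passes None to `get_note_comments` as max_id instead of the int it expects. Because many comments are still uncrawled at that point, the loop should not stop. Instead, `get_note_comments` should catch this abnormal case and keep crawling. Changing the corresponding line in `get_note_comments` resolves the problem.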
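The failure mode described above can be illustrated with a minimal sketch. This is not the MediaCrawler code: `build_comment_params` is a hypothetical helper, and the only names taken from the issue are the query parameters visible in the logged URL (`id`, `mid`, `max_id_type`, `max_id`). It assumes the error comes from comparing max_id with an int while max_id is None, and shows one way to guard that comparison.

```python
from typing import Any, Dict, Optional


def build_comment_params(note_id: str, max_id: Optional[int]) -> Dict[str, Any]:
    """Build query parameters for one page of comments (hypothetical helper)."""
    params: Dict[str, Any] = {"id": note_id, "mid": note_id, "max_id_type": 0}
    # `None > 0` raises the TypeError seen in the log, so check for None first.
    if max_id is not None and max_id > 0:
        params["max_id"] = max_id
    return params


# A missing max_id no longer raises; the parameter is simply left out.
print(build_comment_params("5117251109785706", None))
print(build_comment_params("5117251109785706", 5118541873545341))
```

Leaving max_id out of the request presumably asks for the first page again, so a caller built on this sketch would still need its own stopping condition to avoid looping forever.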
A screenshot of the final fix is shown below: