Duplicate Images #9
Comments
My first attempt at filtering out duplicates would be to subtract two possibly duplicated images and check whether the difference is close to zero.
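A minimal sketch of that subtraction check, assuming both images have already been decoded to raw pixel buffers of the same size and mode (for example with Pillow's `Image.tobytes()`, which is an assumption here and not something this downloader does). The names `mean_abs_diff`, `looks_duplicate`, and the tolerance `tol` are hypothetical.

```python
from typing import Sequence


def mean_abs_diff(a: Sequence[int], b: Sequence[int]) -> float:
    """Mean absolute per-value difference between two equally sized pixel buffers."""
    if len(a) != len(b):
        raise ValueError("buffers must have the same length")
    if not a:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def looks_duplicate(a: Sequence[int], b: Sequence[int], tol: float = 1.0) -> bool:
    """Treat two buffers as duplicates when their difference is close to zero."""
    # Buffers of different sizes cannot be subtracted element-wise,
    # so this sketch simply reports them as non-duplicates.
    if len(a) != len(b):
        return False
    return mean_abs_diff(a, b) <= tol
```

This only catches images with identical dimensions. Resized or recompressed copies would need resizing to a common shape first, or a different technique such as perceptual hashing.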
I'm getting the same. I downloaded 10000 pictures and 9789 of them were duplicates. Is this inherent to Bing image search, or particular to this downloader?
When I scrape 100 photos, the images start to repeat after the first 85 to 90, and the rest are all duplicates.
Yes, I also faced the same issue. It is caused by how the downloader is programmed: Bing has no "next page", so instead of `first=pagecounter`, set `first` to the total number of URLs visited so far.
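A rough sketch of that offset idea, not the downloader's actual code. `fetch_page` is a hypothetical callable that takes the `first` offset and returns the image URLs found at that offset. Real code would issue the Bing request there.

```python
from typing import Callable, List


def collect_urls(fetch_page: Callable[[int], List[str]], limit: int) -> List[str]:
    """Gather up to `limit` URLs, using the count visited so far as the next offset."""
    visited: List[str] = []
    while len(visited) < limit:
        # The offset is the number of URLs already visited, not a page counter,
        # so each request starts where the previous one ended.
        page = fetch_page(len(visited))
        if not page:
            break  # no more results at this offset
        visited.extend(page)
    return visited[:limit]
```

With a page counter as the offset, consecutive requests overlap heavily whenever a request returns more than one result, which would match the repeats reported in this thread.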
I successfully avoided duplicate images with the following code, but now it will search forever. So yeah, ` `
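The code from that comment is not preserved here, so the following is a separate sketch and not the commenter's fix. It assumes a hypothetical `fetch_page(offset)` callable that returns a list of URLs, and adds a hypothetical `max_stale_pages` guard so the search stops once several consecutive requests yield nothing new.

```python
from typing import Callable, List, Set


def collect_unique(
    fetch_page: Callable[[int], List[str]],
    limit: int,
    max_stale_pages: int = 3,
) -> List[str]:
    """Collect up to `limit` distinct URLs, giving up after repeated stale pages."""
    seen: Set[str] = set()
    ordered: List[str] = []
    offset = 0
    stale = 0  # consecutive pages that contributed no new URL
    while len(ordered) < limit and stale < max_stale_pages:
        page = fetch_page(offset)
        if not page:
            break  # the source has run out of results
        offset += len(page)
        added = 0
        for url in page:
            if url not in seen:
                seen.add(url)
                ordered.append(url)
                added += 1
        # Reset the counter on progress; otherwise count another stale page.
        stale = 0 if added else stale + 1
    return ordered[:limit]
```

The stale-page limit trades completeness for termination: if fewer unique images exist than were requested, the loop returns what it found instead of searching forever.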
Remove duplicates: PR #20
Bumping this as an issue. The fix above looks like it works and would be great if merged. Thanks!
Please close this issue.
I am trying to create a food dataset. However, when I try to scrape from Bing using this library, I am getting a lot of duplicate images. Please assist.
Thank you