Autosave triggered by single thread and not global. #56
Comments
@rivermont Can you please elaborate on this? I would like to understand all the cases where this results in errors.
This commit fixes errors that occur when a single thread triggers autosave. Specifically, it resolves discrepancies in the saved contents. Fixes rivermont#56
I found a few errors in autosaving and opened a PR for them. However, I couldn't find a way to fix the logging with only minimal changes. The logging logic may need to be reworked so that crawl logging is paused while saving.
fix Travis errors as on main branch
Fix Autosave errors
This commit fixes errors while autosaving by single thread. Specifically it resolves the discrepancies in the contents saved. Fixes rivermont#56
docs: update docker instructions to specify how users can pass custom config to spidy in docker
Checklist
Expected Behavior
All threads to stop as crawler prints info and saves files.
Actual Behavior
Once one thread reaches `SAVE_COUNT` links crawled, it saves while the other threads continue. This results in `[CRAWL]` logs in between `[INFO]` logs. It seems like this is inefficient and could result in some saving errors.
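One way the expected behavior could be approached is sketched below. This is not spidy's actual code: `state_lock`, `crawled_since_save`, `save_files`, `crawl_one`, and the in-memory `log` list are hypothetical names used only for illustration, and the `SAVE_COUNT` value is assumed. The idea is to keep a single global counter behind a lock, so that whichever thread reaches the threshold saves while holding the lock and the other threads cannot log or update state until it finishes.

```python
import threading

SAVE_COUNT = 5  # assumed value for the sketch; the real one comes from config

state_lock = threading.Lock()  # guards the counter, the log, and saving
crawled_since_save = 0         # global count shared by all threads
log = []                       # stands in for the crawler's log output


def save_files():
    # Placeholder for writing the crawler's state to disk.
    log.append("[INFO] saving")
    log.append("[INFO] saved")


def crawl_one(url):
    global crawled_since_save
    # Network fetching would happen here, outside the lock.
    with state_lock:
        log.append(f"[CRAWL] {url}")
        crawled_since_save += 1
        if crawled_since_save >= SAVE_COUNT:
            # Other threads block on state_lock until the save completes,
            # so no [CRAWL] line can land between the [INFO] lines.
            save_files()
            crawled_since_save = 0


def worker(urls):
    for url in urls:
        crawl_one(url)


def run(num_threads=4, per_thread=10):
    threads = [
        threading.Thread(
            target=worker,
            args=([f"t{i}-{j}" for j in range(per_thread)],),
        )
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log
```

Because the counter is global rather than per thread, a save is triggered every `SAVE_COUNT` links across all threads combined. The trade-off is that logging and saving are serialized behind one lock, which is acceptable only if the save itself is short.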
Steps to Reproduce the Problem
Specifications