Currently, dosage downloads comics in a very straightforward way:
1. Get page
2. Parse page
3. Get images
4. Continue with next page
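To make the discussion concrete, the loop above can be sketched roughly as follows. This is not dosage's actual code: `crawl_sequentially`, `fetch_page`, `parse_page`, `fetch_image` and `max_pages` are hypothetical names, and the network calls are passed in as plain callables so the sketch stays self-contained.

```python
def crawl_sequentially(start_url, fetch_page, parse_page, fetch_image, max_pages=None):
    """Walk one comic page by page, strictly one request at a time.

    All three callables are hypothetical stand-ins:
    fetch_page(url) -> html
    parse_page(html) -> (list of image URLs, next page URL or None)
    fetch_image(url) -> None (stores the image somewhere)
    """
    downloaded = []
    url = start_url
    pages = 0
    while url is not None and (max_pages is None or pages < max_pages):
        html = fetch_page(url)                   # 1. get page
        image_urls, next_url = parse_page(html)  # 2. parse page
        for image_url in image_urls:             # 3. get images
            fetch_image(image_url)
            downloaded.append(image_url)
        url = next_url                           # 4. continue with next page
        pages += 1
    return downloaded
```

Every step blocks the next one here, so the next page is not requested until all images of the current page have been fetched.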
For better performance, the user can download multiple comics in parallel (via the -p option), but that's more of a crutch: the threads aren't aware of each other, which can lead to multiple threads fetching comics from the same host at the same time.
We should evaluate a better scheduling system, satisfying at least the following requirements:
- Parallel downloads from multiple hosts
- Throttling per host (we don't want to overload a host)
- Image downloads can be handled separately from page parsing
It might be worthwhile to look at things like asyncio, async/await or something like that...
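As a starting point for that evaluation, here is a minimal asyncio sketch of the three requirements. It is only an illustration under assumptions: `HostThrottle`, `crawl_comic`, `crawl_all` and the `fetch_page` / `parse_page` / `fetch_image` callables are hypothetical names, not part of dosage, and the actual HTTP client is left out on purpose.

```python
import asyncio
from collections import defaultdict
from urllib.parse import urlsplit


class HostThrottle:
    """Per-host limiter (hypothetical helper, not part of dosage).

    Caps concurrent requests per host and optionally enforces a minimum
    delay between request starts to the same host.
    """

    def __init__(self, max_per_host=2, min_delay=0.0):
        self._max = max_per_host
        self._delay = min_delay
        self._semaphores = {}
        self._locks = {}
        self._next_start = defaultdict(float)
        self.active = defaultdict(int)  # requests currently in flight per host
        self.peak = defaultdict(int)    # highest concurrency seen per host

    async def run(self, url, fetch):
        host = urlsplit(url).netloc
        sem = self._semaphores.setdefault(host, asyncio.Semaphore(self._max))
        lock = self._locks.setdefault(host, asyncio.Lock())
        async with sem:
            async with lock:
                # Space out request starts to the same host.
                loop = asyncio.get_running_loop()
                wait = self._next_start[host] - loop.time()
                if wait > 0:
                    await asyncio.sleep(wait)
                self._next_start[host] = loop.time() + self._delay
            self.active[host] += 1
            self.peak[host] = max(self.peak[host], self.active[host])
            try:
                return await fetch(url)
            finally:
                self.active[host] -= 1


async def crawl_comic(start_url, throttle, fetch_page, parse_page, fetch_image):
    """Follow one comic; image downloads run as separate tasks."""
    image_tasks = []
    url = start_url
    while url is not None:
        html = await throttle.run(url, fetch_page)
        image_urls, url = parse_page(html)
        # Schedule the images and move on to the next page right away.
        image_tasks += [
            asyncio.create_task(throttle.run(image_url, fetch_image))
            for image_url in image_urls
        ]
    return await asyncio.gather(*image_tasks)


async def crawl_all(start_urls, fetch_page, parse_page, fetch_image,
                    max_per_host=2, min_delay=0.0):
    """Crawl several comics in parallel, sharing one throttle across all."""
    throttle = HostThrottle(max_per_host, min_delay)
    results = await asyncio.gather(*(
        crawl_comic(url, throttle, fetch_page, parse_page, fetch_image)
        for url in start_urls
    ))
    return results, throttle
```

Because the throttle is shared, two comics on the same host compete for the same per-host slots, while comics on different hosts proceed independently. Whether a semaphore plus a delay is the right throttling policy is an open design question; it is used here only because it is the simplest thing that covers the listed requirements.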
We may want to evaluate Scrapy; it seems perfect for the job.
Yes, that might be one solution, but probably a pretty hefty one... I'm certainly not a fan of reinventing the wheel, but this particular wheel seems to bring the whole caravan with it 😉