A rust tool to web scrape wikipedia categories and subcategories.
maniyar1/wikipedia-category-downloader
Simple tool to download a Wikipedia category. Example usage:

Download pages from this category and (by default) one level of sub-categories:

    ./wikipedia-category-downloader https://en.wikipedia.org/wiki/Category:Marxism

Download pages from just this category, with no sub-categories:

    ./wikipedia-category-downloader https://en.wikipedia.org/wiki/Category:Marxism -l 0

Everything is stored in ./wiki/. Because Wikipedia uses absolute link paths, I recommend serving the files with something like Twisted (twistd). You can use the Python web server (python3 -m http.server), but it doesn't correctly identify the files as HTML. You can use the -a flag to output files with .html endings, but then links won't work.
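If you would rather stay with the Python standard library, the content-type problem can be worked around with a small custom handler. The following is only a sketch and not part of this tool: the names `HtmlDefaultHandler` and `serve` are made up for illustration, and it assumes the pages were saved without the -a flag, so files under ./wiki/ have no recognised extension. Any file whose type the standard handler cannot guess is served as `text/html`.

```python
import functools
import http.server


class HtmlDefaultHandler(http.server.SimpleHTTPRequestHandler):
    """Serve files of unknown type as HTML (hypothetical helper, not part of the tool)."""

    def guess_type(self, path):
        guessed = super().guess_type(path)
        # The stock handler falls back to application/octet-stream for
        # extensionless files, which makes browsers download them.
        # Assumption: every such file in the download directory is a saved page.
        if guessed == "application/octet-stream":
            return "text/html"
        return guessed


def serve(directory=".", port=8000):
    """Serve `directory` on localhost so that /wiki/... links resolve."""
    handler = functools.partial(HtmlDefaultHandler, directory=directory)
    with http.server.ThreadingHTTPServer(("127.0.0.1", port), handler) as httpd:
        httpd.serve_forever()
```

Calling `serve()` from the directory that contains wiki/ (not from inside wiki/ itself) should keep the absolute /wiki/... links working. One caveat: a page whose title ends in a known extension would still be served with that extension's type.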