scraper

Spider

To run the scraper, run scrapy runspider scrape.py from the spiders directory. By default the spider downloads all files, but it can also be configured to download only MIFR files. All reports are saved to the reports folder.
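The all-files versus MIFR-only choice can be pictured as a filter over the links the spider finds. This is a minimal sketch, not the code in scrape.py: the function name select_report_links, the mifr_only flag, and the assumption that MIFR file names contain the string "MIFR" are all hypothetical.

```python
from posixpath import basename
from urllib.parse import urlparse


def select_report_links(links, mifr_only=False):
    """Return the links to download, optionally keeping only MIFR files."""
    if not mifr_only:
        # Default behaviour: keep every file.
        return list(links)
    # Assumption: MIFR report file names contain the string "MIFR".
    return [
        link for link in links
        if "mifr" in basename(urlparse(link).path).lower()
    ]
```

Keeping the filter as a plain function means the selection rule can be checked without starting a crawl.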

XML Parsing

To run the XML parser, run ./xmlformat.py <file-path>. This outputs the domains file for the given XML file.
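As an illustration of what extracting domains from a report might involve, here is a minimal sketch using only the standard library. It does not reproduce xmlformat.py: the function name extract_domains is hypothetical, the report schema is unknown, and the code simply assumes that domains appear as plain text inside XML elements.

```python
import re
import xml.etree.ElementTree as ET

# Loose pattern for dotted host names. This is an assumption for the sketch,
# not a full validator; it will also match things like file names.
DOMAIN_RE = re.compile(
    r"\b(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,}\b",
    re.IGNORECASE,
)


def extract_domains(xml_text):
    """Return a sorted list of unique, lowercased domains found in element text."""
    root = ET.fromstring(xml_text)
    found = set()
    for element in root.iter():
        if element.text:
            found.update(match.lower() for match in DOMAIN_RE.findall(element.text))
    return sorted(found)
```

The result could then be written one domain per line to produce a domains file. A real report format may need schema-aware parsing, with specific element names and namespaces, instead of a text scan.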
