Conservancy

OVERVIEW

A utility for preserving websites on mirrors.UNNA.org. It is primarily a wrapper around wget that also performs additional verification tasks.

PREREQUISITES

USAGE

Running conserve <url> uses wget to slowly and recursively mirror the page along with its sibling and child pages (but not parent pages).

The site will be archived into a directory named for the URL's hostname. A wget log file will also be generated.
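To make the mirroring step more concrete, here is a minimal Python sketch of how such a wget invocation could be assembled. It is not the actual conserve implementation: the function name `build_wget_command`, the 2-second default wait, and the log file name are assumptions made for illustration. It relies on wget's default behaviour of saving a recursive download into a directory named after the host.

```python
from urllib.parse import urlsplit


def build_wget_command(url: str, wait_seconds: int = 2) -> list[str]:
    """Build a polite, recursive wget command line for `url` (hypothetical helper)."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no hostname in URL: {url!r}")
    return [
        "wget",
        "--recursive",                      # follow links from the start page
        "--level=inf",                      # no depth limit
        "--no-parent",                      # never ascend above the start directory
        "--page-requisites",                # also fetch images, CSS, and scripts
        f"--wait={wait_seconds}",           # pause between requests (assumed value)
        "--random-wait",                    # vary the pause to be gentler on the server
        f"--output-file={host}.wget.log",   # assumed log file name
        url,
    ]
```

The resulting list can be passed to `subprocess.run(...)`. With `--recursive`, wget places the files under a directory named for the hostname unless told otherwise.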

Upon completion, it'll output the following to STDOUT:

  • Any missing files (linking to Internet Archive's Wayback Machine if the files exist there, plus listing any similarly named files that were downloaded)
  • Any files still containing links to the URL
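
The missing-file report could be approximated as follows. This is a sketch under assumptions, not the tool's own code: it uses the Internet Archive's public availability endpoint (`https://archive.org/wayback/available`) to look for a snapshot and Python's `difflib` to suggest similarly named downloaded files. The helper names and the 0.6 similarity cutoff are hypothetical.

```python
import difflib
import json
import urllib.parse
import urllib.request
from typing import Optional

WAYBACK_API = "https://archive.org/wayback/available"


def closest_snapshot(payload: dict) -> Optional[str]:
    """Extract the snapshot URL from an availability response, if one exists."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if closest and closest.get("available"):
        return closest.get("url")
    return None


def wayback_lookup(url: str, timeout: float = 10.0) -> Optional[str]:
    """Ask the Wayback Machine whether it holds a snapshot of `url`."""
    query = urllib.parse.urlencode({"url": url})
    with urllib.request.urlopen(f"{WAYBACK_API}?{query}", timeout=timeout) as resp:
        return closest_snapshot(json.load(resp))


def similar_names(missing_name: str, downloaded_names: list, limit: int = 3) -> list:
    """List downloaded file names that closely resemble a missing one."""
    return difflib.get_close_matches(
        missing_name, downloaded_names, n=limit, cutoff=0.6  # assumed cutoff
    )
```

Keeping the response parsing in `closest_snapshot` separate from the network call makes it possible to check the logic without contacting archive.org.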

One can then manually try to find & replace missing files, clean up links, etc.
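
As a starting point for that manual link cleanup, the following sketch lists mirrored files that still mention the original host. The function name `files_mentioning` and the set of file extensions scanned are assumptions for illustration. It uses a plain substring match, so it may over-report.

```python
from pathlib import Path


def files_mentioning(root, needle, suffixes=(".html", ".htm", ".css", ".js")):
    """Return files under `root` whose text still contains `needle`."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            # Undecodable bytes are replaced so odd encodings do not abort the scan.
            text = path.read_text(encoding="utf-8", errors="replace")
            if needle in text:
                hits.append(path)
    return hits
```

For example, `files_mentioning("example.org", "example.org")` would scan a mirror directory named for the host and return the paths that still reference it.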

REFERENCE