web-archiving

Here are 8 public repositories matching this topic...

oduwsdl / warrick

Recover lost websites from the Web Infrastructure

memento recovery web-archiving memento-rfc

Updated Feb 10, 2021
HTML

internetarchive / sandcrawler

Backend, IA-specific tools for crawling and processing the scholarly web. Content ends up in https://fatcat.wiki

web-archiving

Updated Jul 31, 2024
HTML

ArchiveBox / DigestBox

DigestBox takes any webpage URL (news article, video link, comment thread, etc.) and gives you just the raw content. It's powered by ArchiveBox.io under the hood.

backups warc web-archiving digipres headless-browser internet-archiving archivebox

Updated Feb 2, 2024
HTML

oduwsdl / oduwsdl.github.io

ODU Web Science and Digital Libraries Research Group (WS-DL) home page.

machine-learning natural-language-processing information-retrieval web-science web-archiving digital-preservation digital-libraries

Updated Oct 15, 2024
HTML

nla / nla-pywb

pywb config overlay for the Australian Web Archive

web-archiving

Updated Nov 12, 2024
HTML

ArchivingToolsForWBM / AdvancedInternetArchiving

Makes saving pages in bulk to the Wayback Machine much easier.

web-archiving webarchiving

Updated Nov 18, 2024
HTML

TarekJor / wpull

Wget-compatible web downloader and crawler.

crawler backup bookmarks wget web-archiving browsers preservation web-page webarchiving wpull web-downloader web-pages web-browsers

Updated Dec 20, 2017
HTML

httpreserve / mementoqa

QA Mementos using Screenshots

memento web-archiving wayback-machine code4lib digital-preservation

Updated Jun 16, 2021
HTML
