SharePoint Ripper

Screen scrape a SharePoint wiki to Markdown

This is largely intended for importing to Confluence with the markdown-to-confluence-uploader or direct entry

This script requires a manual patch to Mechanize described here

Copy config.yml.dist to config.yml and enter your details

Option	Meaning
username	Your SharePoint username
password	Your SharePoint password
sharepoint_url	The base domain of your SharePoint instance
wiki_base_url	The subdirectory of your particular wiki. This ripper will only process pages under this base_url
wiki_index	The index file of the wiki to start with. Assumes a big list of links to other pages under `wiki_base_url`
scrape_recursively	Whether to follow links and continue scraping or just the index page
content_div_id	The id of the innermost div that your pages all share
confluence_space_key	The key of your Confluence space
direct_confluence_entry	True if you'll copy and paste the resultant markdown straight into the wiki entry option of Confluence. False if you're using the markdown-to-confluence-uploader
add_legacy_link	Whether to add a link to the bottom of each page linking back to the legacy SharePoint source page

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
.github/workflows		.github/workflows
.gitignore		.gitignore
README.md		README.md
config.yml.dist		config.yml.dist
ripper.py		ripper.py

Provide feedback