tsheiner/medium-archive-image-downloader

Medium Archive Image Downloader

Overview

The backup archive you can request from Medium does not include the images in the posts. This script addresses that limitation.

The script creates an export subdirectory inside the posts directory of the archive and then, for every original HTML file in the posts directory, it:

  • creates an appropriately named subdirectory
  • creates in that subdirectory a copy of the original HTML file and an img directory
  • downloads all images referenced in that HTML file
  • updates the src attributes in the copied HTML file to point to the downloaded images
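The steps above can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the function names (`download_image`, `export_post`, `export_all`), the `image_<n>` file naming, and the location of errors.txt inside the export directory are all assumptions made for the example.

```python
from pathlib import Path
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup


def download_image(url: str, dest: Path) -> None:
    """Fetch one image over HTTP and write it to dest."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    dest.write_bytes(response.content)


def export_post(html_path: Path, export_root: Path, fetch=download_image) -> list:
    """Copy one post into export_root/<post name>/ and localize its images.

    Returns a list of error messages for images that could not be downloaded.
    """
    post_dir = export_root / html_path.stem
    img_dir = post_dir / "img"
    img_dir.mkdir(parents=True, exist_ok=True)

    soup = BeautifulSoup(html_path.read_text(encoding="utf-8"), "html.parser")
    errors = []
    for index, img in enumerate(soup.find_all("img", src=True)):
        url = img["src"]
        # Hypothetical naming scheme: position in the file plus the URL's extension.
        name = f"image_{index}{Path(urlparse(url).path).suffix}"
        try:
            fetch(url, img_dir / name)
        except Exception as exc:
            # Keep going; the original src is left untouched for failed downloads.
            errors.append(f"{html_path.name}: {url}: {exc}")
            continue
        img["src"] = f"img/{name}"

    (post_dir / html_path.name).write_text(str(soup), encoding="utf-8")
    return errors


def export_all(posts_dir: Path) -> None:
    """Process every HTML file directly inside posts_dir."""
    export_root = posts_dir / "export"
    errors = []
    for html_path in sorted(posts_dir.glob("*.html")):
        errors.extend(export_post(html_path, export_root))
    if errors:
        # Assumed location for the error report.
        (export_root / "errors.txt").write_text("\n".join(errors) + "\n", encoding="utf-8")
```

Passing the downloader in as the `fetch` argument is a design choice for the sketch only: it lets the rewriting logic be tried out without network access.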

Installation

Prerequisites

  • Python 3.x
  • pip (Python package installer)

Dependencies

The script requires the following Python libraries:

  • requests
  • beautifulsoup4

To install these dependencies, run the following commands in your terminal:

pip install beautifulsoup4
pip install requests
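
To confirm that both libraries are importable before running the script, a small check such as the following may help. It is a sketch using only the standard library; the `missing_packages` helper and the `REQUIRED` mapping are names invented for this example.

```python
import importlib.util

# Package names on PyPI mapped to the module names used in import statements.
# Note that beautifulsoup4 is imported as "bs4".
REQUIRED = {"requests": "requests", "beautifulsoup4": "bs4"}


def missing_packages(required=REQUIRED):
    """Return the package names whose modules cannot be found."""
    return [
        package
        for package, module in required.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages: " + ", ".join(missing))
    else:
        print("All dependencies are installed.")
```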

Usage

  1. Request a backup archive from Medium

    1. Access your Medium account by going to Medium's website and logging in with your credentials.
    2. Once logged in, navigate to your profile picture in the upper right corner and select it to reveal a dropdown menu. From this menu, choose "Settings."
    3. Go to the "Security and apps" tab and click on "Download Your Information."
    4. After requesting your data, Medium will process this request, which may take some time. They will send you an email with a link to download your data once it is ready.
    5. Follow the link in the email to download your writings. The data will typically be in a ZIP file containing your posts and other information associated with your Medium account.
    6. Once you have downloaded the ZIP file, extract it and review the contents to find your writings.
  2. Place medium-archive-image-downloader.py in the posts directory of the archive.

  3. Execute the script from the posts directory by running:

    python medium-archive-image-downloader.py
    
  4. Check the output:

    • After running the script, you'll find each HTML file processed into its own directory within an 'export' directory.
    • If any download errors occurred, they will be listed in a file named errors.txt.
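
For a quick overview of the result, a short helper along these lines could count the downloaded images per post and print any recorded errors. It is only a sketch: `summarize_export` and `read_errors` are hypothetical names, and because the location of errors.txt is not specified here, its path is passed in explicitly.

```python
from pathlib import Path


def summarize_export(export_dir: Path) -> dict:
    """Return {post directory name: number of files in its img directory}."""
    summary = {}
    for post_dir in sorted(p for p in export_dir.iterdir() if p.is_dir()):
        img_dir = post_dir / "img"
        if img_dir.is_dir():
            summary[post_dir.name] = sum(1 for f in img_dir.iterdir() if f.is_file())
        else:
            summary[post_dir.name] = 0
    return summary


def read_errors(errors_file: Path) -> list:
    """Return the non-empty lines of the error report, or [] if it does not exist."""
    if not errors_file.is_file():
        return []
    lines = errors_file.read_text(encoding="utf-8").splitlines()
    return [line for line in lines if line.strip()]


if __name__ == "__main__":
    export_dir = Path("export")  # run from the posts directory
    if export_dir.is_dir():
        for post, count in summarize_export(export_dir).items():
            print(f"{post}: {count} image(s)")
        # Assumed path; adjust to wherever errors.txt was written.
        for line in read_errors(Path("errors.txt")):
            print("error:", line)
```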

Background

This script was created as a collaborative effort between Tim Sheiner (who provided the requirements) and ChatGPT4, developed by OpenAI (which wrote the script). It was inspired by two pre-existing scripts that no longer function correctly, probably because Medium has changed the archive format since those scripts were written.

License

No restrictions
