Export any Kindle book you own as text, PDF, EPUB, or as a custom, AI-narrated audiobook. 🔥
This project makes it easy to export the contents of any ebook in your Kindle library as text, PDF, EPUB, or as a custom, AI-narrated audiobook. It only requires a valid Amazon Kindle account and an OpenAI API key.
You must own the ebook on Kindle for this project to work.
It works by logging into your Kindle web reader account using Playwright, exporting each page of a book as a PNG image, and then using a vision LLM (vLLM), either gpt-4o or gpt-4o-mini, to transcribe each page image to text. Once we have the raw book contents and metadata, it's easy to convert them to PDF, EPUB, etc. 🔥
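As a rough illustration of the image-to-text step, the sketch below builds an OpenAI Chat Completions request body that sends one page screenshot as a base64 data URL. The helper name, the prompt wording, and the `temperature` value are assumptions made for illustration; they are not taken from this project's source.

```typescript
// Hypothetical helper: builds the JSON body for one page-transcription request.
// The prompt text and temperature are illustrative assumptions.
type VisionModel = 'gpt-4o' | 'gpt-4o-mini'

type MessagePart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } }

interface TranscriptionRequest {
  model: VisionModel
  temperature: number
  messages: Array<{ role: 'system' | 'user'; content: string | MessagePart[] }>
}

function buildTranscriptionRequest(
  pngBase64: string,
  model: VisionModel = 'gpt-4o-mini'
): TranscriptionRequest {
  return {
    model,
    // A low temperature reduces run-to-run variation in the transcription.
    temperature: 0,
    messages: [
      {
        role: 'system',
        content: 'Transcribe the text on this book page exactly. Output only the text.'
      },
      {
        role: 'user',
        content: [
          // The screenshot is embedded inline as a base64-encoded PNG data URL.
          { type: 'image_url', image_url: { url: `data:image/png;base64,${pngBase64}` } }
        ]
      }
    ]
  }
}
```

The returned object could then be passed to an OpenAI client or sent as the JSON body of a POST request to the Chat Completions endpoint.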
This example uses the first page of the scifi book Revelation Space by Alastair Reynolds:
We can even use TTS to generate custom audiobooks.
Here are some auto-generated examples using a few different TTS providers & voices, containing only the first page of this book as a preview:
- OpenAI tts-1-hd "alloy" voice (female; solid quality but more expensive): openai-alloy-preview.mp4
- OpenAI tts-1-hd "onyx" voice (male; solid quality but more expensive): openai-onyx-preview.mp4
- Unreal Speech "Scarlett" voice (female; medium quality but cheaper): unrealspeech-scarlett-preview.mp4
Kindle uses a custom AZW3 format which includes heavy DRM, making it very difficult to access the contents of ebooks that you own. It is possible to strip the DRM using existing tools, but it's a serious pain in the ass, is very difficult to automate, and the "best" solution is expensive and not open source.
This project changes that.
Why? Because I love reading books on Kindle (especially scifi books!!), but none of the content is hackable. The official Kindle apps are also lagging behind in their AI features, so my goal with this project was to make it easy to build AI-powered experiments on top of my own Kindle library. In order to do that, I first needed a reliable way to export the contents of my Kindle books in a reasonable format.
I also created an OSS TypeScript client for the unofficial Kindle API, but I ended up only using some of the types and utils since Playwright + vLLMs allowed me to completely bypass their API and DRM. This approach should also be a lot less error-prone than using their unofficial API.
Make sure you have node >= 18 and pnpm installed.
- Clone this repo
- Run pnpm install
- Set up environment variables (details below)
- Run src/extract-kindle-book.ts (details below)
- Run src/transcribe-book-content.ts (details below)
- (Optional) Run src/export-book-pdf.ts (details below)
- (Optional) Export book as EPUB (details below)
- (Optional) Run src/export-book-markdown.ts (details below)
- (Optional) Run src/export-book-audio.ts (details below)
Set up these required environment variables in a local .env:
AMAZON_EMAIL=
AMAZON_PASSWORD=
ASIN=
OPENAI_API_KEY=
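It can help to fail fast when one of these variables is missing. The sketch below is a hypothetical helper (not part of this repo) that reports which of the four required names are unset or blank.

```typescript
// Variable names come from the .env listing above; the helper itself is hypothetical.
const REQUIRED_ENV_VARS = ['AMAZON_EMAIL', 'AMAZON_PASSWORD', 'ASIN', 'OPENAI_API_KEY'] as const

function findMissingEnvVars(env: Record<string, string | undefined>): string[] {
  // Treat unset and whitespace-only values as missing.
  return REQUIRED_ENV_VARS.filter((name) => !env[name]?.trim())
}

// Example usage at the top of a script:
// const missing = findMissingEnvVars(process.env)
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(', ')}`)
```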
You can find your book's ASIN (Amazon ID) by visiting read.amazon.com and clicking on the book you want to export. The resulting URL will look like https://read.amazon.com/?asin=B0819W19WD&ref_=kwl_kr_iv_rec_2, with B0819W19WD being the ASIN in this case.
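If you would rather not copy the ASIN by hand, it can be read from the `asin` query parameter of the reader URL. This is a small sketch using the standard `URL` class; the function name is made up for this example.

```typescript
// Hypothetical helper: pulls the ASIN out of a Kindle web reader URL.
function extractAsin(readerUrl: string): string | undefined {
  const asin = new URL(readerUrl).searchParams.get('asin')
  // searchParams.get returns null when the parameter is absent.
  return asin || undefined
}

const asin = extractAsin('https://read.amazon.com/?asin=B0819W19WD&ref_=kwl_kr_iv_rec_2')
console.log(asin) // B0819W19WD
```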
npx tsx src/extract-kindle-book.ts
- (This takes a few minutes to run)
- This logs into your Amazon Kindle web reader using headless Chrome (Playwright). It can be pretty fun to watch it run, so feel free to tweak the script to use headless: false to watch it do its thing.
- If your account requires 2FA, the terminal will request a code from you before proceeding.
- It uses a persistent browser session, so you should only have to auth once.
- Once logged in, it navigates to the web reader page for a specific book (https://read.amazon.com/?asin=${ASIN}).
- Then it changes the reader settings to use a single column and a sans-serif font.
- Then it extracts the book's table of contents.
- Then it goes through each page of the book's main contents and saves a PNG screenshot of the rendered content to out/${asin}/pages/${index}-${page}.png.
  - Example: examples/B0819W19WD/pages
- Lastly, it resets the reader to the original position so your reading progress isn't affected.
- It also records some JSON metadata with the TOC, book title, author, product image, etc. to out/${asin}/metadata.json.
  - Example: examples/B0819W19WD/metadata.json
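The output layout described above can be sketched as a pair of path helpers. The function names are hypothetical, and the zero-padding width is an assumption (it keeps files in reading order when sorted by name); the script's actual naming may differ.

```typescript
// Hypothetical helpers mirroring the out/${asin}/... layout described above.
// ASSUMPTION: index and page are zero-padded to 4 digits.
function pad(n: number, width = 4): string {
  return String(n).padStart(width, '0')
}

// index = position of the screenshot in capture order; page = Kindle page number.
function pageScreenshotPath(asin: string, index: number, page: number): string {
  return `out/${asin}/pages/${pad(index)}-${pad(page)}.png`
}

function metadataPath(asin: string): string {
  return `out/${asin}/metadata.json`
}
```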
Note
I'm pretty sure Kindle's web reader uses WebGL at least in part to render the page contents, because the content pages failed to generate when running this on a VM (Browserbase). So if you're getting blank or invalid page screenshots, that may be the reason.
npx tsx src/transcribe-book-content.ts
- (This takes a few minutes to run)
- This takes each of the page screenshots and runs them through a vLLM (gpt-4o or gpt-4o-mini) to extract the raw text content from each page of the book.
- It then stitches these text chunks together, taking into account chapter boundaries.
- The result is stored as JSON to out/${asin}/content.json.
  - Example: examples/B0819W19WD/content.json
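To make the stitching step concrete, here is a minimal sketch that groups per-page text into chapters using the table of contents. The `TocEntry`, `PageChunk`, and `Chapter` shapes are assumptions for illustration and are not the actual schema of metadata.json or content.json.

```typescript
// Hypothetical shapes; the real JSON files may be structured differently.
interface TocEntry { title: string; page: number }
interface PageChunk { page: number; text: string }
interface Chapter { title: string; text: string }

function stitchChapters(toc: TocEntry[], pages: PageChunk[]): Chapter[] {
  const sortedToc = [...toc].sort((a, b) => a.page - b.page)
  const sortedPages = [...pages].sort((a, b) => a.page - b.page)
  const parts: string[][] = sortedToc.map(() => [])

  let current = -1
  for (const chunk of sortedPages) {
    // Advance to the last TOC entry that starts at or before this page.
    while (current + 1 < sortedToc.length && sortedToc[current + 1].page <= chunk.page) {
      current++
    }
    // Pages before the first TOC entry (front matter) are skipped in this sketch.
    if (current >= 0) parts[current].push(chunk.text.trim())
  }

  // Simplification: pages are joined with a blank line, so a paragraph that
  // spans a page break will be split in two.
  return sortedToc.map((entry, i) => ({ title: entry.title, text: parts[i].join('\n\n') }))
}
```

A more careful version would detect whether a page ends mid-sentence before deciding how to join it with the next one.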
npx tsx src/export-book-pdf.ts
- (This should run instantly)
- It uses PDFKit under the hood.
- It includes a valid table of contents for easy navigation.
- The result is stored to out/${asin}/book.pdf.
  - Example: examples/B0819W19WD/book-preview.pdf
If you want, you can use Calibre to convert your book's PDF to the EPUB ebook format. On a Mac, you can install calibre using Homebrew (brew install --cask calibre).
# replace B0819W19WD with your book's ASIN
ebook-convert out/B0819W19WD/book.pdf out/B0819W19WD/book.epub --enable-heuristics
npx tsx src/export-book-markdown.ts
- (This should run instantly)
- The result is stored to out/${asin}/book.md.
  - Example: examples/B0819W19WD/book-preview.md
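For a sense of what a Markdown export involves, the sketch below renders a title, an author line, and one heading per chapter. The `ChapterContent` shape and the heading levels are assumptions; the real script's output format may differ.

```typescript
// Hypothetical chapter shape; not the actual content.json schema.
interface ChapterContent { title: string; text: string }

function renderMarkdown(bookTitle: string, author: string, chapters: ChapterContent[]): string {
  const header = `# ${bookTitle}\n\nBy ${author}`
  // One second-level heading per chapter, followed by its text.
  const body = chapters.map((c) => `## ${c.title}\n\n${c.text.trim()}`)
  // Blocks are separated by blank lines, with a single trailing newline.
  return [header, ...body].join('\n\n') + '\n'
}
```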
npx tsx src/export-book-audio.ts
- This takes a few minutes to run.
- We support two TTS engines: OpenAI TTS and Unreal Speech TTS.
  - To use OpenAI, set TTS_ENGINE=openai (the default)
  - To use Unreal Speech, set TTS_ENGINE=unrealspeech and UNREAL_SPEECH_API_KEY=(your-api-key)
  - OpenAI is higher quality but more expensive; Unreal Speech is medium quality and cheaper
  - To set the OpenAI voice, use OPENAI_TTS_VOICE=onyx (defaults to alloy)
  - To set the Unreal Speech voice, use UNREAL_SPEECH_VOICE='Scarlett' (defaults to Scarlett)
  - OpenAI TTS for a full novel (~1M tokens) is approximately $30 (1.5GB MP3 ~21 hours long)
  - Unreal Speech TTS for a full novel (~1M tokens) is approximately $2 (1.7GB MP3 ~23 hours long)
  - It should be pretty easy to support other TTS providers in the future.
- The TTS output is broken up into reasonably sized chunks and stored in mp3 files under out/${asin}/audio/<tts-engine-hash>/.
  - The <tts-engine-hash> directory is based on the TTS engine settings and book contents
- After generating audio for each chunk, we use ffmpeg to concat them together.
  - You need to have ffmpeg installed locally for this to work
  - On Mac, brew install ffmpeg (or install with more options)
- The resulting audiobook is stored to out/${asin}/audio/<tts-engine-hash>/audiobook.mp3.
  - Examples: examples/B0819W19WD/audio-previews
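The chunking step could look roughly like the sketch below, which packs whole sentences into chunks up to a character limit. The function name and the default limit of 4096 characters are assumptions; check your TTS provider's input limit, and note that the script's actual chunking logic may differ.

```typescript
// Hypothetical helper: splits text into chunks of at most maxChars characters,
// preferring to break at sentence boundaries.
function chunkText(text: string, maxChars = 4096): string[] {
  if (maxChars < 1) throw new RangeError('maxChars must be at least 1')

  // A sentence ends at . ! or ? followed by whitespace, or at the end of the text.
  const sentences = text.match(/\S[\s\S]*?(?:[.!?](?=\s)|$)/g) ?? []
  const chunks: string[] = []
  let current = ''

  for (const sentence of sentences) {
    const candidate = current ? `${current} ${sentence}` : sentence
    if (candidate.length <= maxChars) {
      current = candidate
      continue
    }
    if (current) chunks.push(current)
    // A single sentence longer than maxChars is hard-split as a last resort.
    let rest = sentence
    while (rest.length > maxChars) {
      chunks.push(rest.slice(0, maxChars))
      rest = rest.slice(maxChars)
    }
    current = rest
  }
  if (current) chunks.push(current)
  return chunks
}
```

Each chunk would then be sent to the TTS engine separately, and the resulting mp3 files concatenated in order.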
This project is intended for personal and educational use only. It is not endorsed or supported by Amazon / Kindle. By using this project, you agree to not hold the author or contributors responsible for any consequences resulting from its usage.
This project will only work on Kindle books which you have access to in your personal library. Please do not share the resulting exports publicly – we need to make sure that our authors and artists get paid fairly for their work!
With that being said, I also feel strongly that we should individually be able to use content that we own in whatever format best suits our personal needs, especially if that involves building cool, open source experiments for LLM-powered book augmentation, realtime narration, and other unique AI-powered UX ideas.
I expect that Amazon Kindle will eventually support some modern LLM-based features, but ain't nobody got time to wait around for that.
If you want to explore other ways of exporting your personal ebooks from Kindle, this article gives a great breakdown of the options available, including Calibre (FOSS) and Epubor Ultimate (paid). Trying to use the most popular free online converter will throw a DRM error.
Compared with these approaches, the approach used by this project is much easier to automate. It also retains metadata about Kindle's original sync positions, which is very useful for cases where you'd like to interoperate with Kindle. For example, you could jump from reading a Kindle book to listening to an AI-generated narration on a walk, then jump back to reading the Kindle book and have the sync positions "just work".
The main downside is that it's possible for some transcription errors to occur during the image ⇒ text step, which uses a multimodal LLM and is not 100% deterministic. In my testing, I've been pleasantly surprised by how accurate the results are, but there are occasional issues, mostly with differentiating whitespace between paragraphs versus soft section breaks. Note that both Calibre and Epubor also use heuristics to deal with things like spacing and dashes used by wordwrap, so the fidelity of the conversions will not be 100% one-to-one with the original Kindle version in any case.
The other downside is that the LLM costs add up to a few dollars per book using gpt-4o or around 30 cents per book using gpt-4o-mini. With LLM costs constantly decreasing and local vLLMs becoming viable, this cost per book should be free or almost free soon. The screenshots are also really good quality with no extra content, so you could swap in any other OCR solution for the vLLM-based image ⇒ text step quite easily.
The accuracy / fidelity has been very close to perfect in my testing, with the only discrepancies being occasional whitespace issues.
I'm sure there will be edge cases and ebook features that are missing (like embedded images), but it shouldn't be too hard to add those if there's enough interest.
MIT © Travis Fischer
If you found this project interesting, consider following me on Twitter.