📧 Mail reply parser library for Python with multi-language support

Didza/mail-parser-reply

 
 


Mail Reply Parser 📧🐍

Multi-language email reply parsing for international environments 🌍

Mail clients format replies differently, which makes reliable parsing difficult. Thank god we have standards. This library splits text-based emails into separate replies based on the headers that clients in different languages commonly insert to mark where one message ends and the next begins.

Each reply can be returned either as the whole message body or with headers, signatures and common disclaimers stripped. Currently supported languages are English (en), German (de) and French (fr); adding more languages is quite easy.

This is an improved Python implementation of GitHub's Ruby-based email_reply_parser and an adaptation of Zapier's email-reply-parser, both of which split mails into fragments instead of distinct replies and support only English.

⭐ Features

⭐ Easy to implement
⭐ Multi-language support
⭐ Text-based mail parsing
⭐ Detect headers, signatures and disclaimers
⭐ Fully type annotated
⭐ Easy-to-read code and well-tested

Overview 🔭

This library splits an incoming mail into its individual replies and provides the text content of each one, with or without signatures, disclaimers and headers, which makes emails much easier to work with.

For example, it can turn the following email:

Awesome! I haven't had another problem with it.

Thanks,
alfonsrv

On Wed, Dec 20, 2023 at 13:37, RAUSYS <[email protected]> wrote:

> The good news is that I've found a much better query for lastLocation.
> It should run much faster now. Can you double-check?

Into just the replied text content:

Awesome! I haven't had another problem with it.
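To illustrate the idea of splitting on reply headers, here is a minimal standard-library sketch. It is not the library's implementation: the function name `split_replies`, the `HEADER_PATTERNS` table and its two regular expressions are assumptions made for this example, and the sender address is a placeholder. Unlike the library, this sketch does not strip signatures.

```python
import re

# Hypothetical, simplified header patterns per language. The library's real
# patterns are not reproduced here; these only cover one common header style.
HEADER_PATTERNS = {
    "en": r"^On .+ wrote:$",
    "de": r"^Am .+ schrieb .+:$",
}


def split_replies(text: str, languages: list[str]) -> list[str]:
    """Split a plain-text mail body into replies, newest first."""
    combined = "|".join(f"(?:{HEADER_PATTERNS[lang]})" for lang in languages)
    header_re = re.compile(combined, flags=re.MULTILINE)

    # Every header line starts a new reply; any text before the first
    # header is treated as the latest reply.
    starts = [0] + [match.start() for match in header_re.finditer(text)]
    ends = starts[1:] + [len(text)]
    parts = [text[start:end].strip() for start, end in zip(starts, ends)]
    return [part for part in parts if part]


mail = (
    "Awesome! I haven't had another problem with it.\n"
    "\n"
    "Thanks,\n"
    "alfonsrv\n"
    "\n"
    # Placeholder address, used only for this example.
    "On Wed, Dec 20, 2023 at 13:37, RAUSYS <sender@example.com> wrote:\n"
    "\n"
    "> The good news is that I've found a much better query for lastLocation.\n"
    "> It should run much faster now. Can you double-check?\n"
)

replies = split_replies(mail, ["en", "de"])
print(replies[0])
```

Here `replies[0]` holds the newest message (still including the "Thanks, alfonsrv" sign-off), and `replies[1]` holds the quoted message together with its header line.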

Get started 👾

Installation

pip install mail-parser-reply

Parse Replies

from mailparser_reply import EmailReplyParser

mail_body = 'foobar'
languages = ['en', 'de']
mail_message = EmailReplyParser.read(text=mail_body, languages=languages)
print(mail_message.replies)

Or get only the latest reply using:

latest_reply = EmailReplyParser.parse_reply(text=mail_body, languages=languages)

Parser API

EmailMessage.text: Mail body
EmailMessage.languages: Languages to use for parsing headers
EmailMessage.replies: List of EmailReply objects, one per parsed reply
EmailMessage.include_english: Always include English language for parsing
EmailMessage.default_language: Default language to fall back to if the language 
                               dictionary doesn't include a requested language

EmailMessage.HEADER_REGEX: RegEx for identifying headers, separating mails
EmailMessage.SIGNATURE_REGEX: RegEx for identifying signatures
EmailMessage.DISCLAIMERS_REGEX: RegEx for identifying disclaimers

EmailMessage.read(): Parse EmailMessage.text into EmailReply objects, which are 
                     then stored in EmailMessage.replies


EmailReply.content: Unprocessed mail body with headers, signatures, disclaimers
EmailReply.body: Mail body without headers, signatures, disclaimers
EmailReply.full_body: Mail body without headers only; signatures and disclaimers are kept

EmailReply.headers: Identified headers
EmailReply.signatures: Identified signatures
EmailReply.disclaimers: Identified disclaimers
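
The relationship between the three text views listed above (content, full_body, body) can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the class name `ReplyViews` and the three regular expressions are hypothetical and far simpler than anything a real parser would need.

```python
import re
from dataclasses import dataclass

# Hypothetical, deliberately simple patterns for illustration only.
HEADER_RE = re.compile(r"^On .+ wrote:$\n?", re.MULTILINE)
SIGNATURE_RE = re.compile(
    r"^(?:Thanks|Best regards|Cheers),?\n.*", re.MULTILINE | re.DOTALL
)
DISCLAIMER_RE = re.compile(
    r"^This e-?mail (?:is|may be) confidential.*",
    re.MULTILINE | re.DOTALL | re.IGNORECASE,
)


@dataclass
class ReplyViews:
    """Hypothetical stand-in showing three views of one reply."""

    content: str  # raw reply text: header, message, signature, disclaimer

    @property
    def full_body(self) -> str:
        # Header removed; signature and disclaimer are kept.
        return HEADER_RE.sub("", self.content).strip()

    @property
    def body(self) -> str:
        # Remove the disclaimer first, then the signature, so the greedy
        # signature pattern only has the sign-off left to consume.
        text = DISCLAIMER_RE.sub("", self.full_body)
        text = SIGNATURE_RE.sub("", text)
        return text.strip()


reply = ReplyViews(
    content=(
        "On Wed, Dec 20, 2023 at 13:37, RAUSYS wrote:\n"
        "It should run much faster now.\n"
        "\n"
        "Best regards,\n"
        "RAUSYS\n"
        "\n"
        "This email is confidential and intended only for the addressee.\n"
    )
)

print(reply.full_body)
print(reply.body)
```

In this sketch `full_body` still contains the sign-off and the disclaimer, while `body` is reduced to the message text itself.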
