Filtering mechanism for sender/recipient emails #28

Open
4 of 9 tasks
gwiedeman opened this issue Sep 7, 2021 · 4 comments
Labels
Core This is part of the main process for creating mailbags Input Parsing input data, such as MBOX, IMAP, PST, EML, etc.

Comments

@gwiedeman
Collaborator

The problem the component solves

Requirement #35: "Provide a method of keeping or excluding specific email folders while creating Mailbags."

A user may want to exclude a list of email folders, like "Drafts," "Trash," etc., or potentially a list of native Message-IDs to exclude individual emails from a mailbag. I imagine this would be a command line option that takes a path to a text file or something similar. If this option is used, Mailbag must exclude these folders or messages when creating derivatives and documenting metadata in mailbag.csv, bag-info.txt and other tag files. The hard part is that these would also have to be excluded from the source file, like the MBOX or PST file that gets saved in a mailbag as well.

The specification also has a space for documenting these exclusions in folders_not_retained.txt or messages_not_retained.txt.
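To make the text-file idea concrete, here is a minimal sketch of reading an exclusion list and recording what was dropped. The function names are hypothetical, and the one-entry-per-line layout is an assumption on my part for both the input file and the not-retained file; the spec section referenced below may call for something different.

```python
from pathlib import Path


def load_exclusions(path):
    """Read an exclusion list with one folder name or Message-ID per line.

    Assumed format (not taken from the spec): plain UTF-8 text, blank lines
    ignored, surrounding whitespace stripped.
    """
    entries = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if line:
            entries.add(line)
    return entries


def write_not_retained(path, entries):
    """Write the excluded folders or Message-IDs, sorted, one per line."""
    Path(path).write_text("\n".join(sorted(entries)) + "\n", encoding="utf-8")
```

The same loader would work for both a folder list and a Message-ID list, since each is just a set of strings to match against.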

Relevant part of mailbag spec?

5.4 folders_not_retained.txt and messages_not_retained.txt

Type of component

  • Core
  • Input
  • Attachments
  • Derivatives conversion
  • Reporting/Exporting
  • GUI
  • Distribution

Expected contribution

  • Pull Request
  • Comment with proposed solution

Major challenges or things to keep in mind

We'll have to exclude these folders or messages from the source files as well if we're keeping those source files in the output mailbag. The idea is that email often has to be excluded for legal or privacy reasons, so we don't want this data in the mailbag at all. Editing an MBOX or removing EML/MSG files should be feasible, but I expect editing a PST to be a problem.
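For the MBOX case, the standard library `mailbox` module looks sufficient. A rough sketch, assuming exclusions are matched on the exact Message-ID header value; `copy_mbox_without` is a hypothetical helper, not something in the codebase. It edits a copy so the original input is never touched.

```python
import mailbox
import shutil


def copy_mbox_without(source, destination, excluded_ids):
    """Copy an MBOX file, then remove excluded messages from the copy.

    Returns the Message-IDs that were removed. No locking is done because
    the copy is assumed to be private to this process.
    """
    shutil.copy2(source, destination)
    box = mailbox.mbox(destination)
    removed = []
    try:
        # Snapshot the keys first, since we delete while looping.
        for key in list(box.keys()):
            message_id = (box[key].get("Message-ID") or "").strip()
            if message_id in excluded_ids:
                box.remove(key)
                removed.append(message_id)
        box.flush()  # rewrite the file without the removed messages
    finally:
        box.close()
    return removed
```

Nothing comparable exists in the standard library for PST, which is where I expect the difficulty to be.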

@gwiedeman gwiedeman added Core This is part of the main process for creating mailbags Input Parsing input data, such as MBOX, IMAP, PST, EML, etc. labels Sep 7, 2021
@gwiedeman
Collaborator Author

@haritgarg Next you should look into editing MBOX and PST files, or potentially creating them from scratch, to see how feasible this would be.

@gwiedeman
Collaborator Author

Exclusions should be done in the controller and affect writing to CSV and derivatives creation.
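Roughly what I have in mind, as a sketch only: the controller checks each message once, and an excluded message never reaches the CSV writer or any derivative writer. The message dictionaries, the two CSV columns and the `derivative_writers` callables are all placeholders for illustration, not the real controller interfaces or the real mailbag.csv layout.

```python
import csv


def package(messages, excluded_folders, excluded_ids, csv_file, derivative_writers):
    """Write CSV rows and derivatives for retained messages only.

    `messages` is an iterable of dicts with hypothetical "id" and "folder"
    keys. Returns the Message-IDs that were skipped.
    """
    writer = csv.writer(csv_file)
    writer.writerow(["Message-ID", "Folder"])  # placeholder columns
    skipped = []
    for message in messages:
        if message["folder"] in excluded_folders or message["id"] in excluded_ids:
            skipped.append(message["id"])
            continue  # excluded: no CSV row, no derivatives
        writer.writerow([message["id"], message["folder"]])
        for write_derivative in derivative_writers:
            write_derivative(message)
    return skipped
```

Keeping the check in one place means the CSV and every derivative format stay consistent with each other.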

@gwiedeman
Collaborator Author

This is on hold since excluding messages for PST outputs (either derivatives or from the original file) isn't feasible. We have to reexamine what the purpose and use cases of the exclusion functionality are.

@gwiedeman
Collaborator Author

gwiedeman commented Apr 15, 2022

Since editing certain types of input files (PST, MSG) isn't feasible (#25), I think the path for this is to exclude the entire source file during packaging. Say I have a PST file, and I only want to package a certain folder. Any folder or message exclusions would also exclude the PST file itself, and you would have to rely only on derivatives such as MBOX, EML, etc. There would be lossiness in these derivatives, so while I think filtering is a common use case, it would have to be worth the lossiness risk. It raises other issues too: we basically run a shutil.move on the input files, so if we're excluding them we should leave them as-is. I think this path is reasonable, but since the usefulness of this is in doubt, we'll de-prioritize this and comment out the argparse options for now. We'll probably wait until we get community feedback asking for this feature before it gets implemented.
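A sketch of how that rule could look if we pick this back up. The option names `--exclude-folders` and `--exclude-messages` and the `stage_source` helper are made up for illustration; they are not the options currently commented out.

```python
import argparse
import shutil
from pathlib import Path


def build_parser():
    """Parser with two hypothetical exclusion options."""
    parser = argparse.ArgumentParser(description="Exclusion options sketch")
    parser.add_argument("source", help="input file, e.g. a PST or MBOX")
    parser.add_argument("--exclude-folders", metavar="FILE",
                        help="text file listing folders to exclude")
    parser.add_argument("--exclude-messages", metavar="FILE",
                        help="text file listing Message-IDs to exclude")
    return parser


def stage_source(args, data_dir):
    """Move the source into the bag unless any exclusion is requested.

    With exclusions, the source is left in place and None is returned, so
    the bag would carry derivatives only.
    """
    if args.exclude_folders or args.exclude_messages:
        return None
    data_dir = Path(data_dir)
    data_dir.mkdir(parents=True, exist_ok=True)
    target = data_dir / Path(args.source).name
    return Path(shutil.move(args.source, str(target)))
```

With this shape, a run such as `build_parser().parse_args(["account.pst", "--exclude-folders", "skip.txt"])` would leave `account.pst` where it is.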


2 participants