Added on-disk MongoDB compatible MontyStore #514

Merged: munrojm merged 5 commits into materialsproject:main on Dec 6, 2021

Member

@utf utf commented Dec 1, 2021

In my opinion, one missing component of the maggma stores is a persistent on-disk store that can be accessed simultaneously from multiple Python processes. Think of an almost fully formed MongoDB instance, but hosted locally on the file system. I think this is what #507 is hinting at.

In this PR I've implemented a store based on the MontyDB backend. MontyDB provides a MongoDB-style interface to several on-disk storage engines, including:

  • sqlite: Uses a SQLite database to store documents.
  • lightning: Uses the Lightning Memory-Mapped Database (LMDB) for storage. This can provide fast read and write times but requires lmdb to be installed (in most cases, pip install lmdb is enough).
  • flatfile: Uses a system of flat JSON files. This is not recommended, as multiple simultaneous connections to the store will not work correctly.

The data is stored in a folder on disk, whose location is set by the database_path option. Multiple databases and collections are stored as separate files.
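To make the idea concrete, here is a minimal standard-library sketch of a document store that lives in a folder on disk and can be opened from more than one connection. It is not the MontyStore or MontyDB implementation: the class name TinyDiskStore, the one-file-per-collection layout, the task_id key, and the update/query method names are all assumptions made for illustration.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


class TinyDiskStore:
    """Hypothetical illustration only; not MontyDB's or maggma's code."""

    def __init__(self, database_path, collection_name, key="task_id"):
        folder = Path(database_path)
        folder.mkdir(parents=True, exist_ok=True)
        self.key = key
        # Assumption: one SQLite file per collection inside database_path.
        self.conn = sqlite3.connect(str(folder / f"{collection_name}.sqlite"))
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS docs (key TEXT PRIMARY KEY, body TEXT NOT NULL)"
        )
        self.conn.commit()

    def update(self, docs):
        # Upsert each document, serialized as JSON, in a single transaction.
        with self.conn:
            self.conn.executemany(
                "INSERT OR REPLACE INTO docs (key, body) VALUES (?, ?)",
                [(str(d[self.key]), json.dumps(d)) for d in docs],
            )

    def query(self, criteria=None):
        # Only exact-match criteria are supported in this sketch.
        criteria = criteria or {}
        for (body,) in self.conn.execute("SELECT body FROM docs ORDER BY key"):
            doc = json.loads(body)
            if all(doc.get(k) == v for k, v in criteria.items()):
                yield doc

    def close(self):
        self.conn.close()


with tempfile.TemporaryDirectory() as tmp:
    # Two independent connections to the same on-disk collection.
    writer = TinyDiskStore(tmp, "tasks")
    reader = TinyDiskStore(tmp, "tasks")
    writer.update([
        {"task_id": "a", "energy": -1.5},
        {"task_id": "b", "energy": -2.0},
    ])
    found = list(reader.query({"energy": -2.0}))
    writer.close()
    reader.close()
```

The second connection sees the documents written by the first because both go through the same file on disk, which is the property the in-memory store lacks. A real backend additionally handles MongoDB query operators, indexes, and locking across processes.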

I've added full tests and MontyDB as an optional requirement. @davidwaroquiers you may be interested in this.


codecov bot commented Dec 1, 2021

Codecov Report

Merging #514 (a0c5e59) into main (6de8739) will increase coverage by 0.04%.
The diff coverage is 91.66%.


@@            Coverage Diff             @@
##             main     #514      +/-   ##
==========================================
+ Coverage   88.79%   88.84%   +0.04%     
==========================================
  Files          40       40              
  Lines        2696     2743      +47     
==========================================
+ Hits         2394     2437      +43     
- Misses        302      306       +4     
Impacted Files                    Coverage Δ
src/maggma/stores/mongolike.py    85.75% <91.66%> (+0.92%) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 5a0f33b...a0c5e59.

Contributor

shyamd commented Dec 1, 2021

Is there any reason not to just replace the mongomock-based MemoryStore implementations with this MontyDB variant?

Member Author

utf commented Dec 1, 2021

Maybe. The existing memory store is based on mongomock, which I think is a more feature-complete replica of pymongo. For example, it implements bulk_write, whereas MontyDB does not.
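For context, one way a store could cope with a backend that lacks bulk_write is to fall back to one upsert per document. The sketch below assumes only that the collection exposes a pymongo-style replace_one(filter, replacement, upsert=...) method; the helper name bulk_replace, the task_id key, and the FakeCollection stand-in are hypothetical and not taken from this PR.

```python
def bulk_replace(collection, docs, key="task_id"):
    """Upsert docs one at a time; a fallback when bulk_write is unavailable."""
    count = 0
    for doc in docs:
        # Same effect as a ReplaceOne(..., upsert=True) operation,
        # but issued as an individual call per document.
        collection.replace_one({key: doc[key]}, doc, upsert=True)
        count += 1
    return count


class FakeCollection:
    """Minimal in-memory stand-in used only to exercise the sketch."""

    def __init__(self):
        self.docs = {}

    def replace_one(self, filter, replacement, upsert=False):
        # This stand-in only understands single-field equality filters.
        ((_field, value),) = filter.items()
        if value in self.docs or upsert:
            self.docs[value] = dict(replacement)


coll = FakeCollection()
n = bulk_replace(
    coll,
    [{"task_id": 1, "x": 1}, {"task_id": 1, "x": 2}, {"task_id": 2, "x": 3}],
)
```

The trade-off is one round trip per document instead of a single batched call, which matters less for a local on-disk backend than for a remote server.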

Member

munrojm commented Dec 6, 2021

@utf I agree with your sentiment, and like this implementation. Are you good for me to merge?

Member Author

utf commented Dec 6, 2021

Yeah, all good from me. I've been using this and it seems to work well.

Member

munrojm commented Dec 6, 2021

Sounds good!

@munrojm munrojm merged commit c076036 into materialsproject:main Dec 6, 2021