JSONStore: enabled reading of MongoDB extended JSON files #847

Merged
rkingsbury merged 1 commit into materialsproject:main from ext_json
Aug 29, 2023

rkingsbury (Collaborator) commented Aug 28, 2023

Summary

When a MongoDB collection is exported to JSON, the resulting file is in extended JSON format. Trying to load such a file into JSONStore currently fails because of the way extended JSON serializes ObjectId values, with an error similar to:

bson.errors.InvalidDocument: key '$oid' must not start with '$'
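For context, here is a minimal sketch of where the `$oid` key comes from. It uses the standard library's `json` module as a stand-in for any plain JSON parser, and the exported document and ObjectId string are made up for illustration:

```python
import json

# A made-up one-document export in MongoDB extended JSON.
exported = '[{"_id": {"$oid": "64ec0b1f2f8b9a0001a1b2c3"}, "task_id": "mp-1"}]'

docs = json.loads(exported)

# A plain JSON parser has no notion of BSON types, so the ObjectId stays
# wrapped in a nested dict whose only key is "$oid".
wrapper = docs[0]["_id"]
print(wrapper)
```

Presumably it is this nested dict, with a key that starts with `$`, that trips the key check quoted above when the document is inserted into the store.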

This PR modifies JSONStore.read_json to call bson.json_util.loads when the string $oid is detected in the file contents, which allows extended JSON files to be loaded into a JSONStore. If $oid is absent, the existing orjson.loads() call is used instead.

Unfortunately, orjson.loads does not support BSON types. Support was requested before, although it isn't clear whether the maintainer understood the request.

Happy to hear opinions if there are better ways to accomplish this, e.g. by leveraging serialization_helper, MontyDecoder, etc. and/or opening another orjson issue with a more specific request about how to handle these types.

@rkingsbury rkingsbury requested a review from munrojm August 28, 2023 01:48
codecov bot commented Aug 28, 2023

Codecov Report

Patch coverage: 100.00% and no project coverage change.

Comparison is base (7d04142) 88.14% compared to head (fced54c) 88.14%.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #847   +/-   ##
=======================================
  Coverage   88.14%   88.14%           
=======================================
  Files          44       44           
  Lines        3592     3593    +1     
=======================================
+ Hits         3166     3167    +1     
  Misses        426      426           
Files Changed Coverage Δ
src/maggma/stores/mongolike.py 88.42% <100.00%> (+0.03%) ⬆️


rkingsbury (Collaborator, Author) commented

Flagging @arosen93 as well for any comments

Andrew-S-Rosen (Member) commented Aug 28, 2023

No comments from me here. Looks great! I'm not terribly knowledgeable about the (de)serialization aspect of this.

rkingsbury (Collaborator, Author) commented

@munrojm, any thoughts before I merge?

munrojm (Member) commented Aug 29, 2023

@rkingsbury Nope! Looks good to me.

munrojm (Member) left a comment

LGTM!

@rkingsbury rkingsbury merged commit 7353e57 into materialsproject:main Aug 29, 2023
@rkingsbury rkingsbury deleted the ext_json branch August 29, 2023 20:53