Massive refactoring to get ready for v1.0 #62

Merged
merged 100 commits into from
Dec 28, 2019


shyamd
Contributor

@shyamd shyamd commented Nov 7, 2019

This is a structural refactoring and doesn't complete the process of getting to 1.0. There are additional goals in the documentation and some additional features necessary for that, but those will come in their own PRs.

The big goals are to:

  • make the core structure better organized
  • add type hints for everything
  • get to 85% test coverage
  • replace MPI with ZeroMQ

Edit:
Reducing the coverage goal to 85%, as 100% is not realistic for now.
Scrapping plans for a ZeroMQ multi-node implementation, as this will likely require some async magic.

@shyamd shyamd changed the title from "WIP: Massive refactoring to get ready for v1.0" to "Massive refactoring to get ready for v1.0" Dec 26, 2019
@shyamd shyamd requested a review from mkhorton December 26, 2019 06:18
@shyamd shyamd merged commit e8311c7 into master Dec 28, 2019
@mkhorton
Member

mkhorton commented Jan 1, 2020

Just coming back online so only just seen this -- this is one big PR! It's difficult to read the diff since it seems like a lot of code has moved. What would you say are the biggest changes for someone wanting to write or run a builder? I also saw that mrun is gone and there's a new CLI.

Other questions:

  • Which Stores would you say are "production-ready" at this point, i.e. have seen real-world use?
  • Any progress or plans on improving reporting of builds started/finished/failed? It'd be really useful for us to be able to query a reporting collection and know that Collection A is currently being written to by Builder X for example.

@shyamd
Contributor Author

shyamd commented Jan 2, 2020

No worries. Realized this was such a large change that it wasn't appropriate for a review. This is mostly moving stuff around to have a coherent structure and adding in all the things we've been missing: tests, docstrings, and type annotations.

Stores that have seen real-world use:

  • MongoStore
  • MemoryStore
  • JSONStore
  • GridFSStore
  • MongograntStore
  • ConcatStore

Yes, I still want to add in build indicators, but I'm not yet sure how to do that. The big issue is how to make this generic so everyone doesn't have to implement something.

@shyamd
Contributor Author

shyamd commented Jan 2, 2020

mrun isn't gone. It's just been rewritten. It still gets installed via setup.py.

@mkhorton
Member

mkhorton commented Jan 2, 2020

Ok, got it, thanks. What's the distinction between JointStore and ConcatStore? Feels like both might be prone to bugs. I remember JointStore came about for specific performance reasons, can't remember where it's used though.

Re. build indicators/reporting, my idea was to add an extra kwarg to the base Builder class, say reporting_store. If set, when that Builder is run it would add a new document to the reporting store when it starts and when it ends, and that document would just contain the name and configuration of that builder. Then the "reporting_store" could be read as a simple log file. If the user doesn't specify the reporting_store, then the build just isn't logged -- by consensus, we can run all production builds to the same reporting_store.

I don't know that that's the best idea, but it does avoid requiring the user to write any custom code in a specific builder.
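
A minimal sketch of the reporting_store idea described above, for illustration only. It does not use the real maggma classes: `Builder` here is a toy base class, and `ListReportingStore`, `_report`, `process`, and the `"failed"` event are hypothetical names chosen for this example.

```python
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional


class ListReportingStore:
    """Hypothetical stand-in for a store: it only appends documents to a list."""

    def __init__(self) -> None:
        self.docs: List[Dict[str, Any]] = []

    def insert(self, doc: Dict[str, Any]) -> None:
        self.docs.append(doc)


class Builder:
    """Toy base class (an assumption for this sketch, not the maggma Builder)."""

    def __init__(
        self, reporting_store: Optional[ListReportingStore] = None, **config: Any
    ) -> None:
        self.reporting_store = reporting_store
        self.config = config

    def process(self) -> None:
        """Subclasses override this with the actual build work."""

    def _report(self, event: str) -> None:
        # Without a reporting store the build simply isn't logged.
        if self.reporting_store is None:
            return
        self.reporting_store.insert(
            {
                "builder": type(self).__name__,
                "config": dict(self.config),
                "event": event,
                "time": datetime.now(timezone.utc),
            }
        )

    def run(self) -> None:
        self._report("started")
        try:
            self.process()
        except Exception:
            self._report("failed")
            raise
        self._report("finished")


class MaterialsBuilder(Builder):
    """Example subclass; it needs no reporting code of its own."""

    def process(self) -> None:
        pass


log = ListReportingStore()
MaterialsBuilder(reporting_store=log, chunk_size=10).run()
events = [doc["event"] for doc in log.docs]
```

Because the logging lives in the base class `run` method, individual builders stay unchanged, which is the property the proposal is after.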

@shyamd
Contributor Author

shyamd commented Jan 2, 2020

JointStore really should be called JoinedStore. It basically performs a join on the fly across multiple collections, so you can distribute the data belonging to a specific key across them. For instance, you could break up a material into a core material collection plus separate collections for electronic_structure, XRD, etc.

ConcatStore concatenates stores much like list concatenation. So you could have two tasks collections and treat them together as one.
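
The difference between the two can be illustrated with plain Python lists standing in for collections. This is only a sketch of the two behaviors, not the maggma implementation: `concat_query`, `joined_query`, and the choice to nest joined data under the collection name are assumptions made for the example.

```python
from typing import Any, Dict, Iterable, Iterator, List

Doc = Dict[str, Any]


def concat_query(stores: Iterable[List[Doc]]) -> Iterator[Doc]:
    """ConcatStore-like: yield the documents of each collection in turn."""
    for store in stores:
        yield from store


def joined_query(
    main: List[Doc], others: Dict[str, List[Doc]], key: str
) -> Iterator[Doc]:
    """JointStore-like: merge documents that share `key` into one document."""
    for doc in main:
        merged = dict(doc)
        for name, collection in others.items():
            # Find the document in the other collection with the same key value.
            match = next((d for d in collection if d[key] == doc[key]), None)
            if match is not None:
                merged[name] = {k: v for k, v in match.items() if k != key}
        yield merged


# Two tasks collections treated as one.
tasks_a = [{"task_id": 1}, {"task_id": 2}]
tasks_b = [{"task_id": 3}]
all_tasks = list(concat_query([tasks_a, tasks_b]))

# One material split across a core collection and an XRD collection.
materials = [{"material_id": "m-1", "formula": "Si"}]
xrd = [{"material_id": "m-1", "peaks": [28.4, 47.3]}]
joined = list(joined_query(materials, {"xrd": xrd}, key="material_id"))
```

Concatenation increases the number of documents, while the join keeps one document per key and widens it.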

@shyamd
Contributor Author

shyamd commented Jan 2, 2020

It might make sense to drop JointStore since MongoDB has Views which basically do the same thing: https://docs.mongodb.com/manual/core/views/
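
As a rough sketch of what such a view could look like, the snippet below builds the document for MongoDB's `create` command with `viewOn` and a `$lookup` pipeline. The helper name `join_view_command` and the collection names are hypothetical, and the snippet only constructs the command so that it runs without a database.

```python
from typing import Any, Dict, List


def join_view_command(view: str, source: str, other: str, key: str) -> Dict[str, Any]:
    """Build a `create` command for a read-only view joining two collections."""
    pipeline: List[Dict[str, Any]] = [
        # Pull matching documents from `other` into an array field named after it.
        {
            "$lookup": {
                "from": other,
                "localField": key,
                "foreignField": key,
                "as": other,
            }
        },
        # Flatten the array, keeping source documents that have no match.
        {"$unwind": {"path": f"${other}", "preserveNullAndEmptyArrays": True}},
    ]
    return {"create": view, "viewOn": source, "pipeline": pipeline}


command = join_view_command("materials_joined", "materials", "xrd", "material_id")
# With pymongo and a running server, this document could be sent with
# `db.command(command)`; that step is omitted so the sketch runs offline.
```

The resulting view is computed on read, so it would replace the query side of a JointStore but not any write path.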

@mkhorton
Member

mkhorton commented Jan 2, 2020 via email

@shyamd
Contributor Author

shyamd commented Jan 2, 2020

I think the View can be used as a normal Mongo Collection without write access.

@mkhorton
Member

mkhorton commented Jan 2, 2020 via email
