Pacifica Software Inquiry #2

Closed
1 of 9 tasks
dmlb2000 opened this issue May 3, 2019 · 9 comments

@dmlb2000

dmlb2000 commented May 3, 2019

Submitting Author: David Brown (@dmlb2000)
Repository Link (if existing): https://github.com/pacifica


  • Paste the full DESCRIPTION file inside a code block below:
Pacifica is an open source scientific data management platform for harvesting, validating, and distributing data and metadata. It is architected as a flexible set of inter-changeable tools used to build custom scientific data management solutions to meet the diverse changing demands of research at different institutions.

Scope

  • Please indicate which category or categories this package falls under:

    • Data retrieval
    • Data extraction
    • Data munging
    • Data deposition
    • Data visualization
    • Reproducibility
    • Geospatial
    • Education
    • Unsure/Other (explain below)
  • Explain how and why the package falls under these categories (briefly, 1-2 sentences). Please note any areas you are unsure of:

Pacifica is geared toward scientific data management at the institutional level. Pacifica integrates a distributed set of micro-services that ingest, validate, archive and disseminate scientific data for projects within an institution. These goals seem to directly link up with the Data retrieval category.

  • Who is the target audience and what are scientific applications of this package?

Institutions that produce scientific data.

  • Are there other Python packages that accomplish the same thing? If so, how does yours differ?

Not that I'm aware of.

  • Any other questions or issues we should be aware of?:

We don't have a great web presence yet, though we are working on it.

P.S. Have feedback/comments about our review process? Leave a comment here

@kysolvik
Contributor

kysolvik commented May 3, 2019

Hi @dmlb2000! Thanks for your presubmission inquiry. We'll discuss and get back to you.

@kysolvik
Contributor

Hi @dmlb2000, just wanted to check in and let you know we haven't forgotten! Be in touch soon. Sorry for the delay!

@dmlb2000
Author

@kysolvik Saw the meeting notes where you brought up Pacifica. If you'd like a voice chat with me some time we can probably arrange that. Glad to see the chat on the meeting notes, it does give me some valuable input on web presence we need to work on.

I'm really happy you checked out the software and were able to have a good discussion about it.

@kysolvik
Contributor

@dmlb2000 The big thing from the conversation was that pyOpenSci needs to define our scope in terms of what a "package" is. Something like Pacifica has a lot of components, and we were concerned we can't give it a thorough, high-quality review. So your inquiry created a really important conversation! We just haven't settled on that definition yet. Once we do, we'll update you and we'll let you know if we have questions in the meantime. Thanks again!

@dmlb2000
Author

@kysolvik Yeah, technically from a Python perspective Pacifica is a namespace with many packages in it. Other large projects (Flask, Django, OpenStack) employ this same Python feature...
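
For readers unfamiliar with the namespace feature mentioned above, here is a minimal sketch of how separately installed distributions can share one top-level name. It uses Python's implicit namespace packages (PEP 420); whether Pacifica uses this exact mechanism is not stated in this thread, and the names `demo_ns`, `dist_a`, and `dist_b` are hypothetical.

```python
import importlib
import sys
import tempfile
from pathlib import Path
from typing import List


def build_distribution(root: Path, namespace: str, package: str, label: str) -> None:
    """Create <root>/<namespace>/<package>/__init__.py.

    The namespace directory itself gets no __init__.py, which is what lets
    Python merge it with same-named directories elsewhere on sys.path.
    """
    pkg_dir = root / namespace / package
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(f"NAME = {label!r}\n")


def load_namespace_demo() -> List[str]:
    """Import two packages that live in different directories but share a namespace."""
    with tempfile.TemporaryDirectory() as tmp:
        # Two hypothetical, independently "installed" distributions.
        roots = [Path(tmp) / "dist_a", Path(tmp) / "dist_b"]
        build_distribution(roots[0], "demo_ns", "metadata", "metadata service")
        build_distribution(roots[1], "demo_ns", "ingest", "ingest service")

        sys.path[:0] = [str(root) for root in roots]
        try:
            importlib.invalidate_caches()
            metadata = importlib.import_module("demo_ns.metadata")
            ingest = importlib.import_module("demo_ns.ingest")
            return [metadata.NAME, ingest.NAME]
        finally:
            # Undo the sys.path and sys.modules changes made for the demo.
            for root in roots:
                sys.path.remove(str(root))
            for name in [n for n in sys.modules
                         if n == "demo_ns" or n.startswith("demo_ns.")]:
                del sys.modules[name]


if __name__ == "__main__":
    print(load_namespace_demo())
```

Each subpackage can be versioned and released on its own, which is part of why a project laid out this way looks like many packages to a reviewer rather than one.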

@lwasser
Member

lwasser commented May 20, 2019

@dmlb2000 we wanted to follow up with you about this submission. It does seem like there are many submodules in this submission. Can you kindly help us understand how the submodules work together? We could see each being submitted individually, or is there a subset of these modules that makes sense for us to review first? It is challenging for a reviewer if there are too many components to a submission, and as @kysolvik mentioned above, we have yet to explicitly define what a package is. I'm pinging @leouieda and @luizirber on this, as they and some others on our team had some good thoughts on how to break down what we review. Any guidance that you can give us is appreciated.

We expect our reviewers to have expertise in Python but not Flask, Django, OpenStack, etc. Those would be out of scope given our current focus.

Also, speaking of reviewers, do you have any in mind who might be ideal for this submission?

@dmlb2000
Author

@lwasser Pacifica's architecture is a set of independent micro services that communicate with one another to perform tasks required by an institutional data management system for research organizations.

So, each module in the namespace is a separate micro service that operates on scientific data. The archiveinterface interacts with a hierarchical storage management (mixture of SSD/disk/tape) system to store tens of petabytes of scientific data. The cartd service pulls the data out of the archiveinterface and stages that data locally, allowing fast access by consumers. The metadata service collects and stores information about the scientific data (project, user, instrument, software links). The ingest service receives data sets and metadata in a single transaction and puts the data in the archiveinterface and the metadata in the metadata service. And there are many more...
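
The data flow described above can be sketched roughly as follows. This is an illustration only, not Pacifica's actual API: the real services are separate processes that communicate over the network, whereas these are in-memory stand-ins. Every class name, method name, and the "metadata must name a project" rule are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class ArchiveInterface:
    """Hypothetical stand-in for the archive that fronts long-term storage."""
    files: Dict[str, bytes] = field(default_factory=dict)

    def put(self, file_id: str, data: bytes) -> None:
        self.files[file_id] = data

    def get(self, file_id: str) -> bytes:
        return self.files[file_id]

    def delete(self, file_id: str) -> None:
        self.files.pop(file_id, None)


@dataclass
class MetadataService:
    """Hypothetical stand-in that stores descriptive records about each file."""
    records: Dict[str, Dict[str, str]] = field(default_factory=dict)

    def record(self, file_id: str, info: Dict[str, str]) -> None:
        # Assumed validation rule, for illustration only.
        if "project" not in info:
            raise ValueError("metadata must name a project")
        self.records[file_id] = dict(info)


@dataclass
class IngestService:
    """Accepts data and metadata together and stores both or neither."""
    archive: ArchiveInterface
    metadata: MetadataService

    def ingest(self, file_id: str, data: bytes, info: Dict[str, str]) -> None:
        self.archive.put(file_id, data)
        try:
            self.metadata.record(file_id, info)
        except Exception:
            # Roll back the archived data so the two stores stay consistent.
            self.archive.delete(file_id)
            raise


@dataclass
class CartService:
    """Copies data out of the archive into a local staging area on first request."""
    archive: ArchiveInterface
    staged: Dict[str, bytes] = field(default_factory=dict)

    def stage(self, file_id: str) -> bytes:
        if file_id not in self.staged:
            self.staged[file_id] = self.archive.get(file_id)
        return self.staged[file_id]


if __name__ == "__main__":
    archive, metadata = ArchiveInterface(), MetadataService()
    ingest = IngestService(archive, metadata)
    cart = CartService(archive)

    ingest.ingest("run-001", b"raw instrument data", {"project": "demo", "user": "alice"})
    print(cart.stage("run-001"))
    print(metadata.records["run-001"])
```

The point of the sketch is the separation of concerns: each service owns one job, and the ingest step is the only place where data and metadata have to succeed or fail together.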

The purpose behind each module (and Pacifica as a project) isn't to produce focused scientific results that provide new insights and expand human knowledge. Rather, it is used to enable and accelerate that discovery at the institutional level. For example, it allows EMSL to have a strict data policy about using their facility.

I'm guessing what this comes down to is a question: "Does software designed and intended to be used as a foundation on which any science can be built meet your criteria for inclusion?" That's kinda your way out if you aren't ready for this kind of application.

@lwasser
Member

lwasser commented May 30, 2019

Hi there @dmlb2000, thank you again for this submission. We discussed it today in our check-in and decided that while this might be in scope for pyOpenSci, we don't have the capacity to review it at this point, given that we are just getting started. Please ping us in the future, as we will likely have much more capacity as the organization grows.

In the meantime, please feel free to drop by a community meeting or to follow other discussions in our Discourse forum at any time!

@lwasser
Member

lwasser commented Aug 19, 2019

Closing this given no response since May! We can reopen in the future if needed.
