Design data storage #12

Open
4 tasks
jjjermiah opened this issue Apr 9, 2024 · 2 comments
Labels
feat New feature or request

Comments

@jjjermiah
Contributor

  • need to figure out what to do with private data
  • need to figure out what to do with processed data (which can now be public)
  • design of GCS storage
  • feat: programmatically access the GCS blobs used in each pipeline
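One possible starting point for the last task is a small manifest that records which blobs each pipeline reads. This is only a sketch: `BlobRef`, `parse_gs_uri`, `PIPELINE_BLOBS`, and `blobs_for` are hypothetical names, and the object path in the manifest is a made-up placeholder, not a real file.

```python
from typing import NamedTuple


class BlobRef(NamedTuple):
    """A bucket name plus the object name inside that bucket."""
    bucket: str
    name: str


def parse_gs_uri(uri: str) -> BlobRef:
    """Split 'gs://bucket/path/to/object' into bucket and object name."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"not a gs:// URI: {uri!r}")
    bucket, _, name = uri[len(prefix):].partition("/")
    if not bucket or not name:
        raise ValueError(f"URI needs a bucket and an object name: {uri!r}")
    return BlobRef(bucket, name)


# Hypothetical manifest: pipeline name -> gs:// URIs it reads.
# The object name below is a placeholder for illustration only.
PIPELINE_BLOBS = {
    "gCSI": ["gs://orcestra-rawdata/gCSI/example_input.tsv"],
}


def blobs_for(pipeline: str) -> list[BlobRef]:
    """Return the blobs registered for a pipeline (empty list if unknown)."""
    return [parse_gs_uri(uri) for uri in PIPELINE_BLOBS.get(pipeline, [])]
```

Keeping the manifest as plain data means it could later be generated from pipeline configs instead of being maintained by hand.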

Nice To Have:

  • being able to "containerize" or "code-ocean" an entire workflow once it is complete, bundled with the pipeline code, so it can be downloaded via Zenodo alongside the final object.
@jjjermiah jjjermiah added the feat New feature or request label Apr 9, 2024
@jjjermiah
Contributor Author

re: processed (public-use) data, creating a bucket at gs://orcestra-rawdata (can rename if we come up with something better)

uploading data processed on H4H for gCSI there under gs://orcestra-rawdata/gCSI

Will then download files from here in pipelines.

The bucket will have public read permissions but only authenticated write permissions, so anyone can download.
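With public read access, a pipeline would not need credentials to fetch a file: publicly readable objects are served over HTTPS at `https://storage.googleapis.com/<bucket>/<object>`. A minimal standard-library sketch follows; `public_url` and `download` are hypothetical helper names, and the object name in the usage comment is a placeholder rather than a real file.

```python
import shutil
import urllib.request
from urllib.parse import quote


def public_url(bucket: str, object_name: str) -> str:
    """Build the HTTPS URL for a publicly readable object."""
    # quote() percent-encodes spaces and similar characters but keeps '/'.
    return f"https://storage.googleapis.com/{bucket}/{quote(object_name)}"


def download(bucket: str, object_name: str, dest: str) -> None:
    """Stream a public object to a local file without authentication."""
    url = public_url(bucket, object_name)
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)


# Usage (placeholder object name, assumed for illustration):
# download("orcestra-rawdata", "gCSI/example_input.tsv", "example_input.tsv")
```

Uploads would still go through an authenticated client, since only writes are restricted.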

My only concern really is what if someone just downloads the same file like 1000 times and destroys our egress lol.
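On our side, one way to avoid paying for the same bytes repeatedly is to cache downloads locally so a pipeline fetches each file at most once per machine. This is a sketch under that assumption only: `cached_fetch` is a hypothetical helper, the `fetch` callable stands in for whatever download function the pipeline uses, and it does nothing about third parties downloading the file over and over.

```python
import hashlib
from pathlib import Path
from typing import Callable


def cached_fetch(url: str, cache_dir: Path, fetch: Callable[[str], bytes]) -> Path:
    """Return a local copy of `url`, calling `fetch` only on a cache miss."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Key the cache on a hash of the URL so any object name maps to a safe filename.
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    target = cache_dir / key
    if not target.exists():
        # Write to a temporary name first so an interrupted download
        # never leaves a partial file under the final name.
        tmp = target.with_suffix(".part")
        tmp.write_bytes(fetch(url))
        tmp.replace(target)
    return target
```

The cache is keyed on the URL alone, so if a file is overwritten in the bucket the stale copy would need to be cleared manually.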

@jjjermiah
Contributor Author

Image

Projects
Status: Todo