Add BDC dbGaP IDs #285

Merged
merged 8 commits into from
May 15, 2023

Conversation

Collaborator

@gaurav gaurav commented May 5, 2023

This PR adds the list of dbGaP identifiers that should be loaded into BDC Dug (https://renci.atlassian.net/browse/DUG-223?focusedCommentId=10809) to this repository, along with a Bash script that can be used to download the data dictionaries for all of them. I kept running into errors with the FTP download command, but luckily dbGaP is also available over HTTPS. The script therefore uses FTP only to list the data dictionaries in an FTP folder, and then uses HTTPS (via the Requests library) to download the files themselves.

I know this is pretty messy, so please don't be shy about pointing out specific pieces I should unmessify further! The output should be pretty similar to the previous download, but if there are any significant differences and Roger can't ingest the downloaded dbGaP files, let me know and I'll fix it here.
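
For readers who want a picture of the "list over FTP, download over HTTPS" approach described above, here is a minimal Python sketch. It is not the script from this PR. The host name, the `.data_dict.xml` suffix, and every function name below are assumptions made for illustration, and the real script may filter and name files differently.

```python
import ftplib
import posixpath
from pathlib import Path
from urllib.parse import quote

# Assumption: dbGaP study files are served from this host over both FTP and HTTPS.
DBGAP_HOST = "ftp.ncbi.nlm.nih.gov"


def is_data_dict(filename):
    """Assumption: data dictionary files end in '.data_dict.xml'."""
    return filename.endswith(".data_dict.xml")


def https_url_for(ftp_path):
    """Map an absolute FTP path onto the same path served over HTTPS."""
    return f"https://{DBGAP_HOST}{quote(ftp_path)}"


def list_data_dicts(ftp_dir):
    """Use FTP only to list the data dictionaries in one folder."""
    with ftplib.FTP(DBGAP_HOST, timeout=60) as ftp:
        ftp.login()  # anonymous login
        names = ftp.nlst(ftp_dir)
    # Some servers return bare names and others return full paths,
    # so normalise everything to an absolute path inside ftp_dir.
    paths = [posixpath.join(ftp_dir, posixpath.basename(name)) for name in names]
    return [path for path in paths if is_data_dict(path)]


def download(url, dest):
    """Download one file over HTTPS using Requests."""
    # Imported here so the pure helpers above work without Requests installed.
    import requests

    with requests.get(url, stream=True, timeout=60) as response:
        response.raise_for_status()
        with open(dest, "wb") as out:
            for chunk in response.iter_content(chunk_size=65536):
                out.write(chunk)


def fetch_data_dicts(ftp_dir, out_dir):
    """List one folder over FTP, then fetch each data dictionary over HTTPS."""
    out_path = Path(out_dir)
    out_path.mkdir(parents=True, exist_ok=True)
    for ftp_path in list_data_dicts(ftp_dir):
        download(https_url_for(ftp_path), out_path / posixpath.basename(ftp_path))
```

Calling `fetch_data_dicts(some_study_folder, "data_dicts")` would then download every matching file from that folder, where `some_study_folder` is whatever FTP directory holds the data dictionaries for one of the listed identifiers. Keeping the listing and the download as separate functions means that only the listing step depends on FTP being reachable.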

@gaurav gaurav requested a review from YaphetKG May 5, 2023 22:09
@gaurav gaurav merged commit f25e74d into develop May 15, 2023
@gaurav gaurav deleted the add-bdc-dbgap-ids branch May 15, 2023 15:26
2 participants