Automatic check of links in download files #1

Open
fbastian opened this issue Jun 11, 2018 · 3 comments

@fbastian
Member

Implement automatic verification of all links provided in the download files. We have had problems with outdated URLs and missing files that we only discovered after the files were released.
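
A minimal sketch of what such a check could look like, in Python with only the standard library. The function names `extract_urls` and `check_url` are hypothetical, and the sketch assumes the download files have already been decompressed to plain text; it is not the project's actual pipeline code.

```python
import re
import urllib.error
import urllib.request

# Assumption: links in the download files are plain http(s) or ftp URLs
# separated from other content by whitespace (e.g. tabs in a TSV file).
URL_PATTERN = re.compile(r"(?:https?|ftp)://[^\s\"'<>]+")


def extract_urls(text):
    """Return the unique URLs found in text, in order of first appearance."""
    seen = []
    for match in URL_PATTERN.findall(text):
        # Drop punctuation that commonly trails a URL in free text.
        url = match.rstrip(".,;)")
        if url not in seen:
            seen.append(url)
    return seen


def check_url(url, timeout=10):
    """Try to open the URL; return (ok, detail).

    ok is False when the request raises an error (including HTTP 4xx/5xx,
    which urllib reports as exceptions) or the URL is malformed.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            # FTP responses may not carry an HTTP status code.
            status = getattr(response, "status", None)
            return (status is None or status < 400, str(status))
    except (urllib.error.URLError, OSError, ValueError) as exc:
        return (False, str(exc))
```

A release script could call `extract_urls` on each generated file and fail the build if `check_url` returns `False` for any link. Note that a successful response only shows that the URL resolves, not that it points to the intended record.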

@fbastian
Member Author

Also, I see that we use SRA IDs to link to GEO in the download files, but this doesn't work. For example, the link http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=ERX012344 in ftp://ftp.bgee.org/bgee_v14_0/download/processed_expr_values/rna_seq/Mus_musculus/Mus_musculus_RNA-Seq_experiments_libraries.tar.gz should, from what I understand, actually be https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE30617.

Do we have the information necessary during download file generation to fix this problem, @smoretti?

@smoretti
Member

We use SRX, ERX and DRX identifiers to make downloads with the sra_toolkit easier.
So the direct link for those identifiers should be https://www.ncbi.nlm.nih.gov/sra/?term=ERX012344
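
As an illustration, the link could be built from the identifier roughly as follows. This is only a sketch: the helper name `sra_link` is hypothetical, and the identifier pattern (an SRX, ERX or DRX prefix followed by digits) is an assumption based on the example above.

```python
import re

# Assumption: experiment identifiers are SRX, ERX or DRX followed by digits.
SRA_EXPERIMENT_ID = re.compile(r"^[SED]RX\d+$")

SRA_BASE_URL = "https://www.ncbi.nlm.nih.gov/sra/?term="


def sra_link(experiment_id):
    """Return the SRA search link for an SRX/ERX/DRX identifier."""
    if not SRA_EXPERIMENT_ID.match(experiment_id):
        raise ValueError("not an SRX/ERX/DRX identifier: " + experiment_id)
    return SRA_BASE_URL + experiment_id
```

Rejecting anything that is not an SRX, ERX or DRX identifier keeps other accession types, such as GEO series IDs, from ending up in an SRA URL.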

@smoretti
Member

The column with the GEO link should be removed to make the file simpler.

Only the SRA link (after URL correction) should remain.

@smoretti smoretti added this to the bgee_v15 milestone Jul 30, 2019