[BUG] Duplicated metadata when querying metadata for single run accession #89
Comments
Thanks for the bug report @kpj! I think the query returns two runs because the NCBI SRA website does the same when you search for this accession. For example, see: https://www.ncbi.nlm.nih.gov/sra/?term=SRR12169246
Thanks! I came across a similar issue when fetching metadata manually and ended up subsetting the dataframe. Maybe there's a better way of handling this.
For now, I would recommend the fix you have in place. It is slightly tricky to deal with this internally, given that the passed-in argument could be anything (SRP/SRR/SRX/GSM etc.). The origin of this is not on the pysradb end, but in what the NCBI search itself returns (see the comment above).
Is the main issue to figure out which column to detect duplicates in/which column to select the accessions from? This is certainly not very elegant and maybe there are other issues making this more difficult, so I am happy either way :)
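A minimal sketch of the downstream subsetting discussed above, assuming the metadata is a pandas DataFrame with a `run_accession` column (an assumption about the returned table; pass a different `column` for SRX/SRP/GSM queries). The helper name `keep_requested_runs`, the second run ID, and the experiment ID in the toy table are hypothetical, and no pysradb call is made here:

```python
import pandas as pd


def keep_requested_runs(metadata, accessions, column="run_accession"):
    """Return only the rows whose `column` value was explicitly requested.

    `column` is an assumed column name; change it to match the accession
    type that was queried.
    """
    mask = metadata[column].isin(list(accessions))
    # Drop repeated rows for the same accession and renumber the index.
    return metadata.loc[mask].drop_duplicates(subset=column).reset_index(drop=True)


# Toy table standing in for a search result that contains an extra run.
# "SRR0000001" and "SRX0000001" are placeholders, not real accessions.
toy = pd.DataFrame(
    {
        "experiment_accession": ["SRX0000001", "SRX0000001"],
        "run_accession": ["SRR12169246", "SRR0000001"],
    }
)
result = keep_requested_runs(toy, ["SRR12169246"])
```

With this kind of filter, the number of returned rows can be compared against the number of queried accessions to catch missing or extra runs.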
I ran into the same issue. I am also confused about the relationship between multiple SRR IDs within a single SRX ID. Are these SRR IDs technical replicates from a shared sequencing library?
Yes, SRRs for the same SRX are technical replicates. Here are some slides that might help: https://f1000research.com/slides/8-1183
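To see which runs share an experiment in a metadata table, the rows can be grouped by experiment. This is only a sketch: the column names `experiment_accession` and `run_accession` are assumptions about the table, the helper name `runs_per_experiment` is hypothetical, and the accessions in the toy data are placeholders:

```python
import pandas as pd


def runs_per_experiment(
    metadata,
    experiment_column="experiment_accession",
    run_column="run_accession",
):
    """Map each experiment (SRX) to the list of runs (SRR) listed under it."""
    grouped = metadata.groupby(experiment_column)[run_column].agg(list)
    return grouped.to_dict()


# Placeholder accessions: two runs under one experiment, one run under another.
runs_table = pd.DataFrame(
    {
        "experiment_accession": ["SRX0000001", "SRX0000001", "SRX0000002"],
        "run_accession": ["SRR0000001", "SRR0000002", "SRR0000003"],
    }
)
runs_by_srx = runs_per_experiment(runs_table)
```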
Many thanks for your quick reply!! In passing, I would like to raise here another problem that I encountered in the course of using. The metadata I prefetch by For example, I want to acquire antibody info of a ChIPseq ([SRX027872](https://www.ncbi.nlm.nih.gov/sra/SRX027872[accn])). On the web of NCBI, I can see the antibody info ( |
@sheep-liu thanks for bringing it to my attention. I have pushed 7da562f, which enables fetching the experiment protocol. It will be in the next release (you can install the develop version from GitHub for now). In the future, please create a new issue. I will close this for now, as I think the original issue is best handled downstream.
Roger! And thanks a lot.
Describe the bug
In some cases, when using `SRAweb.sra_metadata` with a single run accession, multiple metadata rows are returned. It would seem more sensible to only return the metadata for the requested run accession. This is problematic, for example, when retrieving metadata for a list of samples and expecting the number of rows to equal the number of queried samples.
To Reproduce
Execute the following code:
Desktop:
OS: Linux
Python version: 3.8.5
pysradb version: 0.11.2-dev0