Issues with using bc-geometry #670
Comments
Hi @curtisd0886, thanks for raising the issue; indeed it's weird and not expected. Is it possible to share a subset of the data to help replicate the issue on my end and propose a solution?
Hi @curtisd0886, Indeed; thanks for sharing! @k3yavi -- I think we should take a look here and at the resulting implications. We've thus far had limited access to data with barcode lengths > 16, so I think we should try to evaluate if there are any other places we make such assumptions.
Thanks for the quick reply! I have uploaded the first million reads as well as my zipped index folder. Let me know if you need anything else. https://drive.google.com/drive/folders/1asnEIn_J2WCjsxql3z5zfrfhxFpO-KGY?usp=sharing
Hi @curtisd0886, So, issues relevant to processing this data should be resolved in the new release (v1.5.1). However, for technical reasons in the way different modes are handled internally, we had to simplify the mixing and matching of certain different options. Specifically, one can no longer use the --Rob
Hi @rob-p,
sudo ~/salmon-1.5.1_linux_x86_64/bin/salmon alevin -l ISR -i ~/Data/salmon/cell_hash -1 R1.fq.gz -2 R2.fq.gz --read-geometry 2[1-end] --bc-geometry 1[3-8,24-29,45-50] --umi-geometry 1[51-56] -o /home/cndd3/Data/Multi_3/hash_1.5.1/ --citeseq --featureStart 0 --featureLength 15 --keepCBFraction 1
I made sure to get rid of the --citeseq flag, but I am not sure if I am missing something else to get it working. Thanks for your help with this!
Ok, I'm tagging @k3yavi since I believe he tested the hot fix with the data you shared. He may have some more insight on what's going on here. By the way, the command you quote above still contains the
Hey, try the following command. I double-checked on 1.5.1 and it seemed to give the 18-length CBs:
If the program is not exiting with error with the command you shared then probably there is some error on the update as it should throw error when you simultaneously provide with |
Sorry, I copied over the old command and modified it, forgetting to remove the --citeseq flag. When I actually used it with Salmon I made sure the --citeseq flag was not used. I am running it again using the command you recommended and will let you know how it works. Thanks again for your help.
@curtisd0886 -- I think this may be my fault. I think the pre-compiled binary I uploaded may be cut from the wrong tag. Let me fix it and report back here.
@rob-p -- not a problem at all. I can compile my own copy if you like, I was just in a rush and used the binary instead.
Ok @curtisd0886, it should be fixed now! Sorry for the mixup. Everything else (bioconda, docker, etc.) was cut from the tag, but the pre-compiled executable was mistakenly copied over from the master branch (before the changes were merged in) rather than the tag. I've updated the executable.
Thank you guys. The new software did the trick and now I am getting 18 nt barcodes; however, it appears that the mapping efficiency has gone down significantly. Previously it was about 8% of reads; now it is about 5.3e-5%. Any ideas where the issue might be?
Hi @curtisd0886, I noticed that as well, and my hunch is that it's because a lot of cellular barcodes are getting filtered based on their frequency now that the CB length has increased. It is probably worth providing externally the list of cellular barcodes to quantify using
Closing this for lack of activity, but feel free to re-open if new discussion arises.
We are currently using a protocol with a barcode strategy similar to Rhapsody, where you have three 8 nt barcodes separated by two constant regions. I am trying to use the --bc-geometry flag with the following command:
salmon alevin -l ISR -i ~/Data/salmon/cell_hash -1 R1.fq.gz -2 R2.fq.gz --umi-geometry 1[51-56] --bc-geometry 1[3-8,24-29,45-50] --read-geometry 2[1-end] -o outs/ --citeseq --featureStart 0 --featureLength 15
I am getting an output table that appears to be mapping correctly; however, the cell barcodes in the table are 16 nt long instead of the 18 nt specified in the --bc-geometry argument. Is there something I am missing?
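For readers following along, here is a minimal Python sketch of what the geometry strings in the command above describe. It is not salmon's implementation; the helper names `parse_geometry` and `extract` and the example read are hypothetical, and it assumes positions are 1-based and inclusive, with `end` meaning the end of the read. Under that assumption each of the three ranges in `1[3-8,24-29,45-50]` spans 6 positions, which is where the expected 18 nt barcode comes from.

```python
import re


def parse_geometry(spec):
    """Parse a geometry string such as '1[3-8,24-29,45-50]'.

    Returns (read_number, [(start, end), ...]) with 1-based inclusive
    positions; an upper bound written as 'end' is returned as None.
    Hypothetical helper for illustration, not part of salmon.
    """
    match = re.fullmatch(r"([12])\[([^\]]+)\]", spec)
    if match is None:
        raise ValueError(f"unrecognised geometry: {spec!r}")
    read_number = int(match.group(1))
    ranges = []
    for piece in match.group(2).split(","):
        start, end = piece.split("-")
        ranges.append((int(start), None if end == "end" else int(end)))
    return read_number, ranges


def extract(sequence, ranges):
    """Concatenate the sub-sequences selected by 1-based inclusive ranges."""
    # A 1-based inclusive range (s, e) is the 0-based slice [s - 1:e];
    # e = None slices through to the end of the read.
    return "".join(sequence[start - 1:end] for start, end in ranges)


# Made-up read 1: 2 nt prefix, three 6 nt barcode blocks separated by
# 15 nt constant regions, then a 6 nt UMI (56 nt in total).
read1 = "NN" + "AAAAAA" + "N" * 15 + "CCCCCC" + "N" * 15 + "GGGGGG" + "TTTTTT"

_, bc_ranges = parse_geometry("1[3-8,24-29,45-50]")
_, umi_ranges = parse_geometry("1[51-56]")

barcode = extract(read1, bc_ranges)
umi = extract(read1, umi_ranges)
print(barcode, len(barcode))  # AAAAAACCCCCCGGGGGG 18
print(umi)                    # TTTTTT
```

Running a check like this on a few real reads is a quick way to confirm that the ranges select the intended bases before comparing against the barcode lengths in the output table.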