Hello again, I was wondering if you had a more elegant way of handling barcode sequencing errors. Currently, my snap object reports I have 85,228 unique barcodes. However, I know the number of true barcodes is 670.
Since I thought this was the result of sequencing errors in the barcode, I wrote a custom Python script that adds the barcode to the front of the read header (because I'm starting from demultiplexed FASTQs) and renames any invalid barcodes (barcodes with >1 bp mismatch) to INVALID.
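For anyone reading along, here is a minimal sketch of the kind of barcode correction described above. It is not the original script: the function names, the `:` separator between barcode and read name, and the rule that ambiguous one-mismatch sequences are treated as invalid are all assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional

BASES = "ACGTN"


def build_correction_table(whitelist: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each sequence within Hamming distance 1 of a known barcode to that barcode.

    A sequence that is one mismatch away from two different known barcodes
    is ambiguous and maps to None (assumption: ambiguous means invalid).
    """
    known = list(whitelist)
    table: Dict[str, Optional[str]] = {}
    for bc in known:
        for i in range(len(bc)):
            for base in BASES:
                if base == bc[i]:
                    continue
                neighbour = bc[:i] + base + bc[i + 1:]
                if neighbour in table and table[neighbour] != bc:
                    table[neighbour] = None  # reachable from two barcodes
                else:
                    table[neighbour] = bc
    # Exact matches always resolve to themselves.
    for bc in known:
        table[bc] = bc
    return table


def correct_barcode(barcode: str, table: Dict[str, Optional[str]]) -> str:
    """Return the corrected barcode, or "INVALID" if it has >1 mismatch or is ambiguous."""
    return table.get(barcode) or "INVALID"


def tag_header(header: str, barcode: str, table: Dict[str, Optional[str]]) -> str:
    """Prefix a FASTQ header ("@readname ...") with the corrected barcode.

    The "@BARCODE:readname" layout is an assumption of this sketch.
    """
    return "@" + correct_barcode(barcode, table) + ":" + header[1:]


if __name__ == "__main__":
    table = build_correction_table(["AAAA", "CCCC"])  # toy whitelist, not real barcodes
    print(tag_header("@read1", "CCCG", table))
    print(tag_header("@read2", "AATT", table))
```

Precomputing the one-mismatch neighbours keeps the per-read lookup to a single dictionary access, which matters when there are millions of reads but only a few hundred true barcodes.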
When I perform simple filtering in SnapATAC (R) with a UMI count of 500 and a mit.ratio of <0.3, I lose almost all of these invalid barcodes (since there is a low chance of a sequencing-error barcode having 500+ unique reads) and end up with ~1500 barcodes.
Is there a better way to handle these errors? Should I perform more stringent filtering? Here is what my current distribution looks like.
Thanks!
Actually, I followed the tutorial a little further and found that the UMI cutoff should be much higher. After raising it, I get 580 barcodes, which is exactly in the range I expected. Thanks!
No problem. Another parameter for filtering barcodes is the fragments-in-promoter ratio (0.2-0.8), which is also quite important. Make sure to check that when analyzing your data.
--
Rongxin Fang
Ph.D. Student, Ren Lab
Ludwig Institute for Cancer Research
University of California, San Diego
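To summarize the filters discussed in this thread (UMI count, mitochondrial ratio, and fragments-in-promoter ratio), here is a rough Python sketch. It is not SnapATAC code and does not use the SnapATAC API: the `BarcodeStats` record and `passes_filters` helper are hypothetical, and the example metric values are made up. The UMI cutoff is left as a required argument because the thread only says it should be much higher than 500 without giving a number.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class BarcodeStats:
    """Per-barcode QC metrics (hypothetical record, not a SnapATAC type)."""
    barcode: str
    umi: int               # number of unique fragments
    mito_ratio: float      # fraction of fragments mapping to chrM
    promoter_ratio: float  # fraction of fragments overlapping promoters


def passes_filters(
    s: BarcodeStats,
    min_umi: int,
    max_mito_ratio: float = 0.3,
    promoter_range: Tuple[float, float] = (0.2, 0.8),
) -> bool:
    """Return True if a barcode passes all three thresholds from the thread."""
    low, high = promoter_range
    return (
        s.barcode != "INVALID"
        and s.umi >= min_umi
        and s.mito_ratio < max_mito_ratio
        and low <= s.promoter_ratio <= high
    )


def filter_barcodes(stats: List[BarcodeStats], min_umi: int) -> List[str]:
    """Keep the barcodes that pass every filter."""
    return [s.barcode for s in stats if passes_filters(s, min_umi=min_umi)]


if __name__ == "__main__":
    # Illustrative values only; not taken from any real dataset.
    stats = [
        BarcodeStats("AAAA", umi=5000, mito_ratio=0.05, promoter_ratio=0.35),
        BarcodeStats("CCCC", umi=120, mito_ratio=0.02, promoter_ratio=0.40),
        BarcodeStats("GGGG", umi=8000, mito_ratio=0.01, promoter_ratio=0.95),
        BarcodeStats("INVALID", umi=9000, mito_ratio=0.02, promoter_ratio=0.30),
    ]
    print(filter_barcodes(stats, min_umi=500))
```

In practice the UMI cutoff is best chosen by looking at the barcode distribution for your own library rather than by copying a fixed number.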