--keepDupes and -F 0x400 gives me different results #58

Closed
crazyhottommy opened this issue Apr 18, 2018 · 8 comments

@crazyhottommy

Hi,

I have some single-end RRBS-seq data. I aligned with bwa-meth and marked the duplicates with samblaster.

One expects to see a lot of duplicate reads in RRBS data, because the enzyme cut site is the same.

MethylDackel (0.3.0-3-g084d926) gives me quite different results depending on whether I specify --keepDupes or -F 0x400. Are these two options the same?

thanks,
Tommy

@dpryan79
Owner

-F 0x400 is exactly the opposite of --keepDupes. -F in MethylDackel is the same as -F in samtools view, so you're instructing it to ignore duplicates (but to include secondary, QC-failed, and supplementary alignments, since you've overridden the default value of 0xF00, or 3840).
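
To make the bitmask arithmetic concrete, here is a minimal Python sketch that decodes an -F style exclusion mask using the flag bits defined in the SAM specification. The helper names `describe_mask` and `is_excluded` are made up for illustration and are not part of MethylDackel or samtools.

```python
# SAM flag bits (from the SAM specification) that make up the 0xF00 default.
SAM_FLAGS = {
    0x100: "secondary alignment",
    0x200: "failed QC",
    0x400: "PCR or optical duplicate",
    0x800: "supplementary alignment",
}


def describe_mask(mask):
    """Return the names of the flag bits set in an -F style exclusion mask."""
    return [name for bit, name in SAM_FLAGS.items() if mask & bit]


def is_excluded(read_flag, mask):
    """samtools view -F semantics: drop the read if any masked bit is set."""
    return (read_flag & mask) != 0


print(0xF00)                 # 3840
print(describe_mask(0xF00))  # all four categories are excluded by default
print(describe_mask(0x400))  # only duplicates are excluded
```

With `-F 0x400` a read flagged only as secondary (0x100) is no longer excluded, which is why overriding the default changes more than duplicate handling.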

@crazyhottommy
Author

Thanks. So, if I need to include the duplicates, I can do:

MethylDackel --keepDupes

# or

MethylDackel -F 0x100 -F 0x200 -F 0x800

What if one uses both --keepDupes and -F together? The order of the flags also seems to matter.

E.g., these seem to give different results:

MethylDackel -F 0x400  --keepDupes
MethylDackel   --keepDupes -F 0x400

Thanks,
Tommy

@dpryan79
Owner

The -F option is processed first when reads are filtered, so if duplicates are filtered out by -F, then --keepDupes is ignored. This looks like a bug, actually, since I'm now not sure that --keepDupes is properly doing anything, given that it isn't changing the -F default. I think I should change the -F default to 0xB00; I'll have to test that.
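
The interaction described in this thread can be illustrated with a toy model. This is only a sketch: the function `read_is_used` and its two-stage structure are assumptions made for illustration, not MethylDackel's actual source code.

```python
DUPLICATE = 0x400


def read_is_used(flag, ignore_mask=0xF00, keep_dupes=False):
    """Toy model (hypothetical) of the filtering order discussed above."""
    # Stage 1: the -F mask is applied first, as in samtools view -F.
    if flag & ignore_mask:
        return False
    # Stage 2: a separate duplicate check that --keepDupes switches off.
    if not keep_dupes and (flag & DUPLICATE):
        return False
    return True


dup_read = 0x400

# Default mask 0xF00 already contains 0x400, so --keepDupes has no effect.
print(read_is_used(dup_read, ignore_mask=0xF00, keep_dupes=True))   # False

# Mask 0xB00 lets duplicates past stage 1, but stage 2 still drops them.
print(read_is_used(dup_read, ignore_mask=0xB00, keep_dupes=False))  # False

# Only 0xB00 together with --keepDupes retains the duplicate.
print(read_is_used(dup_read, ignore_mask=0xB00, keep_dupes=True))   # True
```

Under this model, a duplicate is retained only when the mask omits 0x400 and the duplicate check is disabled as well.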

dpryan79 self-assigned this Apr 19, 2018
dpryan79 added the bug label Apr 19, 2018

@crazyhottommy
Author

Thanks for the details. I just tested:

MethylDackel -F 0x100 -F 0x200 -F 0x800
MethylDackel   --keepDupes  -F 0x100 -F 0x200 -F 0x800

I thought these two would be the same. -F 0x100 -F 0x200 -F 0x800 lets the duplicates past the flag filter, but one also needs to specify --keepDupes for them to be counted.

For my purposes, I need to keep the duplicates but discard QC-failed, secondary, and supplementary reads:

MethylDackel   --keepDupes  -F 0x100 -F 0x200 -F 0x800

right?

Thanks,
Tommy

@dpryan79
Owner

-F 0xB00 rather than multiple -F values, but otherwise yes. I'll make that the default in the next bug fix version.
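
For reference, a small Python sketch of how the single 0xB00 mask relates to the three separate values and to the old 0xF00 default. The constant names and the `combine` helper are illustrative only.

```python
SECONDARY = 0x100
QC_FAIL = 0x200
DUPLICATE = 0x400
SUPPLEMENTARY = 0x800


def combine(*bits):
    """OR individual flag bits into one exclusion mask."""
    mask = 0
    for bit in bits:
        mask |= bit
    return mask


mask = combine(SECONDARY, QC_FAIL, SUPPLEMENTARY)
print(hex(mask))                 # 0xb00
print(mask)                      # 2816
print(hex(0xF00 & ~DUPLICATE))   # 0xb00, the old default minus the duplicate bit
```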

@crazyhottommy
Author

Great! Thanks for the explanation and the user support!

@dpryan79
Owner

Thanks for reporting the bug!

dpryan79 added a commit that referenced this issue Apr 13, 2019
@dpryan79
Owner

Sorry for the delay on this, it's now fixed in the 0.4.0 release.
