processBismarkAln output changed in recent versions #334

Open
hennion opened this issue Dec 10, 2024 · 2 comments


hennion commented Dec 10, 2024

The output of processBismarkAln changed in recent versions: it now contains unexpected rows, which produce duplicated lines in the corresponding bedgraph.

The bug can be reproduced with the attached minimal bam file. A visualization with IGV is also included.
bug_report_igv
reduced.sam.txt
The code is very simple:

file.sample <- "pathTo/reduced.bam"
myobj <- processBismarkAln(location=file.sample,
                        sample.id="test",
                        assembly="mm39",
                        save.folder=NULL,
                        save.context=NULL,
                        read.context="CpG",
                        nolap=FALSE,
                        mincov=1,
                        minqual=20,
                        phred64=FALSE,
                        treatment= 0)
write.csv(myobj, "methylkit1.32.csv",  row.names=FALSE)
bedgraph(myobj,
         file.name = "methylkit1.32.bedgraph",
         col.name="perc.meth",
         unmeth = FALSE,
         log.transform = FALSE,
         negative = FALSE,
         add.on = "",
         chunk.size = 1e+06)

Here is the output:

  • for version 1.32 (and 1.28)
[mag @ BI-platform 12:37]$ ~ : cat methylkit1.32.csv 
"chr","start","end","strand","coverage","numCs","numTs"
"chr19",3079413,3079413,"+",2,0,2
"chr19",3079414,3079414,"-",1,0,1
"chr19",3079444,3079444,"+",3,0,3
"chr19",3079445,3079445,"-",4,1,3
"chr19",3079524,3079524,"-",1,0,1
"chr19",3079525,3079525,"+",1,0,1
"chr19",3079526,3079526,"+",1,0,1
"chr19",3079526,3079526,"-",3,1,2
"chr19",3079663,3079663,"+",1,0,1
"chr19",3079664,3079664,"-",3,2,1
[mag @ BI-platform 12:37]$ ~ : cat methylkit1.32.bedgraph 
track type=bedGraph name='testperc.meth' description='testperc.meth' visibility=full color=255,0,0 maxHeightPixels=80:80:11  
chr19   3079412 3079413 0
chr19   3079413 3079414 0
chr19   3079443 3079444 0
chr19   3079444 3079445 25
chr19   3079523 3079524 0
chr19   3079524 3079525 0
chr19   3079525 3079526 0
chr19   3079525 3079526 33.3333333333333
chr19   3079662 3079663 0
chr19   3079663 3079664 66.6666666666667

In this bedgraph there are two lines for the coordinates "chr19 3079525 3079526", each with a different perc.meth value (0 and 33.33). This seems problematic, and it prevents building a bigwig.
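For anyone wanting to check their own files, here is a minimal base-R sketch that flags repeated intervals in a bedgraph. The rows are copied from the 1.32 bedgraph above; the column names (`chr`, `start`, `end`, `score`) are labels chosen for illustration, not something methylKit writes.

```r
# Rows copied from the 1.32 bedgraph in this report (track line omitted).
# Column names are illustrative; a bedgraph file has no header row.
bg <- read.table(text = "chr19 3079412 3079413 0
chr19 3079413 3079414 0
chr19 3079443 3079444 0
chr19 3079444 3079445 25
chr19 3079523 3079524 0
chr19 3079524 3079525 0
chr19 3079525 3079526 0
chr19 3079525 3079526 33.3333333333333
chr19 3079662 3079663 0
chr19 3079663 3079664 66.6666666666667",
  col.names = c("chr", "start", "end", "score"))

# Build one key per interval and keep every row whose key occurs more than once.
key <- paste(bg$chr, bg$start, bg$end, sep = ":")
dup_keys <- unique(key[duplicated(key)])
dup_rows <- bg[key %in% dup_keys, ]
print(dup_rows)
```

On this data it reports the two rows at chr19 3079525-3079526. To run the same check on a bedgraph file, reading it with `read.table(..., skip = 1)` should skip the track header line.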

  • for version 1.20
[mag @ BI-platform 12:41]$ ~ : cat methylkit1.20.csv 
"chr","start","end","strand","coverage","numCs","numTs"
"chr19",3079413,3079413,"+",2,0,2
"chr19",3079414,3079414,"-",1,0,1
"chr19",3079444,3079444,"+",3,0,3
"chr19",3079445,3079445,"-",4,1,3
"chr19",3079525,3079525,"+",2,0,2
"chr19",3079526,3079526,"-",4,1,3
"chr19",3079663,3079663,"+",1,0,1
"chr19",3079664,3079664,"-",3,2,1
[mag @ BI-platform 12:42]$ ~ : cat methylkit1.20.bedgraph 
track type=bedGraph name='testperc.meth' description='testperc.meth' visibility=full color=255,0,0 maxHeightPixels=80:80:11  
chr19   3079412 3079413 0
chr19   3079413 3079414 0
chr19   3079443 3079444 0
chr19   3079444 3079445 25
chr19   3079524 3079525 0
chr19   3079525 3079526 25
chr19   3079662 3079663 0
chr19   3079663 3079664 66.6666666666667

This output seems correct to me. Maybe the change was made on purpose, but I don't understand it, and I'm worried about the rest of my analyses with the most recent versions (also, I can't generate a bigwig for visualisation).
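As a rough way to see how the two outputs relate, the sketch below sums the counts per strand for the rows around 3079524-3079526 in both CSVs. Only that subset of rows is copied from the outputs above, and the variable names are arbitrary. The per-strand totals match between versions, which suggests (but does not prove) that the newer version splits the same reads across neighbouring positions rather than dropping them.

```r
cols   <- c("chr", "start", "end", "strand", "coverage", "numCs", "numTs")
counts <- c("coverage", "numCs", "numTs")

# Rows for positions 3079524-3079526, copied from the 1.32 CSV above.
v132 <- read.csv(text = '"chr19",3079524,3079524,"-",1,0,1
"chr19",3079525,3079525,"+",1,0,1
"chr19",3079526,3079526,"+",1,0,1
"chr19",3079526,3079526,"-",3,1,2', header = FALSE, col.names = cols)

# Rows for the same region, copied from the 1.20 CSV above.
v120 <- read.csv(text = '"chr19",3079525,3079525,"+",2,0,2
"chr19",3079526,3079526,"-",4,1,3', header = FALSE, col.names = cols)

# Sum coverage, numCs and numTs per strand; row names of the result are the strands.
tot132 <- rowsum(v132[, counts], group = v132$strand)
tot120 <- rowsum(v120[, counts], group = v120$strand)

# Index by strand name so the comparison does not depend on sort order.
same_totals <- isTRUE(all.equal(tot132[c("+", "-"), ], tot120[c("+", "-"), ]))
print(same_totals)
```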
Thank you for your help.

Magali

@alexg9010
Collaborator

Hi @hennion,

Thanks for reporting this issue! I will have a closer look at this using your test data.

Best,
Alex

@alexg9010
Collaborator

alexg9010 commented Dec 12, 2024

Hi @hennion,

I think I found the commit that introduced this bug, and I am working on a patch.
Unfortunately, this change did not lead to any difference in output for our test data, so it was not caught. But given your example, I will make sure to update the tests.

Best,
Alex
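One invariant that a test on this example could assert, sketched here in base R: a reference position is a cytosine on only one strand, so no position should be reported on both "+" and "-". The helper `sites_on_both_strands` is hypothetical (it is not part of methylKit), and the two small data frames only mimic the rows around 3079525-3079526 from the outputs above.

```r
# Hypothetical helper, not a methylKit function: return the positions
# ("chr:start") that are reported on more than one strand.
sites_on_both_strands <- function(calls) {
  key <- paste(calls$chr, calls$start, sep = ":")
  n_strands <- tapply(calls$strand, key, function(s) length(unique(s)))
  names(n_strands)[n_strands > 1]
}

# Shaped like the 1.32 output: position 3079526 appears on "+" and "-".
calls_132 <- data.frame(chr = "chr19",
                        start = c(3079525L, 3079526L, 3079526L),
                        strand = c("+", "+", "-"))

# Shaped like the 1.20 output: each position appears on one strand only.
calls_120 <- data.frame(chr = "chr19",
                        start = c(3079525L, 3079526L),
                        strand = c("+", "-"))

flagged_132 <- sites_on_both_strands(calls_132)
flagged_120 <- sites_on_both_strands(calls_120)
print(flagged_132)
print(flagged_120)
```

The first call flags chr19:3079526 and the second flags nothing. In a real test, the same check would be applied to the data frame extracted from the object returned by processBismarkAln.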
