Which barcode-specific bam are used? #33
Comments
Hi Juan,
Cool! I’m not very experienced with WDL/Cromwell, but I’d be curious to know how WDL works with AWS. I did hear that AWS was starting to support CWL, although I’m unsure in what capacity.
For paired-end eCLIP, you’re correct that the eclipdemux step will produce several files. From that point on you will want only the files associated with the expected barcodes (and you should check that most of the reads do end up binned there). For ENCODE, we did not assign barcodes to size-matched input samples, so all input reads are effectively unassigned (the designation we use is ‘NIL’), though this is experiment-specific.
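To make the file selection concrete, here is a minimal Python sketch (not part of the eclip pipeline itself) that keeps only the demultiplexed pairs for the expected barcodes. It assumes the `<prefix>.<BC>.r1.fq.gz` / `<prefix>.<BC>.r2.fq.gz` naming described in the question; the function name `select_barcode_pairs` and the barcode labels in the example are hypothetical and should be replaced with whatever your experiment actually uses.

```python
import re

# Assumed naming scheme (taken from the question): <prefix>.<BC>.r1.fq.gz and <prefix>.<BC>.r2.fq.gz
DEMUX_NAME = re.compile(r"^(?P<prefix>.+)\.(?P<barcode>[^.]+)\.r(?P<mate>[12])\.fq\.gz$")


def select_barcode_pairs(filenames, expected_barcodes):
    """Return {barcode: (r1_file, r2_file)} for expected barcodes that have both mates."""
    found = {}
    for name in filenames:
        match = DEMUX_NAME.match(name)
        if match is None or match.group("barcode") not in expected_barcodes:
            continue  # skip unexpected barcodes and unrelated files
        found.setdefault(match.group("barcode"), {})[match.group("mate")] = name
    # Keep only barcodes for which both R1 and R2 were produced
    return {
        barcode: (mates["1"], mates["2"])
        for barcode, mates in found.items()
        if "1" in mates and "2" in mates
    }


# Hypothetical demux output and barcode labels, for illustration only
demux_files = [
    "exp.A01.r1.fq.gz", "exp.A01.r2.fq.gz",
    "exp.B06.r1.fq.gz", "exp.B06.r2.fq.gz",
    "exp.unassigned.r1.fq.gz", "exp.unassigned.r2.fq.gz",
]
pairs = select_barcode_pairs(demux_files, {"A01", "B06"})
```

Each selected pair would then go through the remaining per-barcode steps on its own. For an input sample without inline barcodes, the set of expected labels would contain just the single designation your demux step emits for it.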
You’re also correct that these files will be merged after PCR collapsing/deduplication. Then, R2 of the merged bam files will be used for peak calling with CLIPper. If the size-matched inputs lack inline barcodes, they may not need to be merged.
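As a rough sketch of that merge-then-R2 step (an illustration under assumptions, not the pipeline's own implementation), the Python helper below only builds the `samtools` command lines. It relies on `samtools merge <out.bam> <in.bam>...` and on `samtools view -f 128`, which keeps reads flagged as second in pair. The function name `merge_and_r2_commands` and all file names are hypothetical.

```python
def merge_and_r2_commands(dedup_bams, merged_bam, r2_bam):
    """Build samtools commands to merge deduplicated BAMs and keep only read 2.

    With a single input BAM (e.g. an input sample without inline barcodes),
    the merge is skipped and R2 is extracted from that file directly.
    """
    if not dedup_bams:
        raise ValueError("at least one deduplicated BAM is required")

    commands = []
    source_bam = dedup_bams[0]
    if len(dedup_bams) > 1:
        # samtools merge <out.bam> <in1.bam> <in2.bam> ...
        commands.append(["samtools", "merge", merged_bam, *dedup_bams])
        source_bam = merged_bam
    # SAM flag 128 (0x80) marks the second read of a pair
    commands.append(["samtools", "view", "-b", "-f", "128", "-o", r2_bam, source_bam])
    return commands


# Hypothetical file names, for illustration only
for command in merge_and_r2_commands(
    ["exp.A01.dedup.bam", "exp.B06.dedup.bam"], "exp.merged.bam", "exp.merged.r2.bam"
):
    print(" ".join(command))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` or copied into the command section of a WDL task.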
SECURE: MESSAGE FROM Juan Felipe Ortiz ON 9/20/22, 12:36 AM
Hello. First, thank you very much for the pipeline.
I am in the process of implementing your pipeline in WDL (aiming to run it on our Cromwell server via AWS, with infrastructure that requires WDL files). So far, I understand most steps of the pipeline. However, it is not clear to me how the different FASTQ files from the demultiplexing step are used.
As I understand it, after the demultiplexing step (running eclipdemux), I get a list of files, one pair per barcode, of the form *.BC.r1.fq.gz and *.BC.r2.fq.gz, where BC is each of the barcodes.
From what I can gather in the SOP <https://raw.githubusercontent.com/YeoLab/eclip/master/documentation/eCLIP_analysisSOP_v2.2.docx>, the rest of the steps are done starting with each barcode-specific fastq (in the SOP, *CO1.r1.fq.gz).
My question is, should I merge these files at a particular point in the pipeline? Should I merge the files of all the barcodes, or only those using barcodeA and barcodeB?
Thank you
Juan Felipe Ortiz, Ph.D.
GeDaC. Cancer Sciences Institute
National University of Singapore