Internal barcodes #50

Open
apeltzer opened this issue Dec 14, 2020 · 5 comments

Comments

@apeltzer

Hi @MikkelSchubert !

Hope you're doing well. We're currently discussing whether, and how, AR2 can remove internal barcodes. Is this currently supported (I think not?), and if not, is it something that could be added in a future release?

Cross-reference to the issue where we started discussing this: nf-core/eager#632

@MikkelSchubert
Owner

Hi,

I'm afraid that I'll have to ask you to clarify what you mean by internal barcodes in this context, as I am a bit rusty on the terminology.

Cheers

@apeltzer
Author

Hi Mikkel!

I've asked the requester(s) to provide some insights for this :-)

@jfy133

jfy133 commented Dec 17, 2020

Hi Mikkel,

To be able to measure barcode hopping on some machines, people have started ligating very short (~6-7bp) 'barcodes' directly onto the extracted DNA molecules, prior to adapter+index ligation.

[Image: Figure 1 of https://www.biorxiv.org/content/10.1101/179028v3.full.pdf]

So in principle what this request would involve would be

  1. the initial removal of adapters,
  2. a new second pass of removal, to strip a second, user-specified sequence.
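
The two passes listed above can be sketched in Python as follows. This is only an illustration of the request, not how AdapterRemoval works internally: the function names `trim_adapter` and `strip_barcode` are hypothetical, matching is exact (no mismatches or quality handling), and the barcode is assumed to sit at the 5' start of the read.

```python
from __future__ import annotations


def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Pass 1: cut the read at the first full or 3'-truncated adapter match."""
    for start in range(len(read)):
        tail = read[start:]
        overlap = min(len(tail), len(adapter))
        if overlap < min_overlap:
            break  # remaining tail is too short to call an adapter
        if tail[:overlap] == adapter[:overlap]:
            return read[:start]
    return read


def strip_barcode(read: str, barcodes: list[str]) -> tuple[str, str | None]:
    """Pass 2: remove whichever known barcode starts the read, if any."""
    for barcode in barcodes:
        if read.startswith(barcode):
            return read[len(barcode):], barcode
    return read, None


if __name__ == "__main__":
    # Hypothetical read: barcode + insert + adapter (placeholder sequences).
    read = "ATGCGGA" + "TTGACCT" + "AGATCGGAAGAGC"
    trimmed = trim_adapter(read, "AGATCGGAAGAGC")
    insert, barcode = strip_barcode(trimmed, ["ATGGATT", "ATGCGGA"])
    print(barcode, insert)
```

Returning the matched barcode alongside the trimmed read lets a caller check that a sample only contains the barcode it was supposed to carry, which is the point of using these barcodes in the first place.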

As far as I know people typically only use a single barcode per sample out of a pool of maybe 12 barcodes. I guess if a user specifies these as a list (like with --adapter-list), this would be sufficient.

In principle one could use the --identify-adapters functionality, but it does not actually do the trimming. Also, the user should already know the actual barcode, so for precision it makes sense to let them specify it explicitly.

Let me know if this is not clear...

Edit: to clarify, as the barcodes are sample-specific, in pipeline contexts (such as eager) you would have to allow the user to specify them as a list of possible barcodes.

@MikkelSchubert
Owner

Thank you for the detailed explanation!

Unless I am misunderstanding something, barcodes of this type are already supported via the demultiplexing functionality. This is enabled when the user provides a table of sample names and barcodes with the --barcode-list option, for example:

sample_1 ATGCGGA TGAATCT
sample_2 ATGGATT ATAGTGA
sample_7 CAAAACT TCGCTGC

The first column is used in output filenames, the second specifies the P7 barcode, and the third (optional) column specifies the P5 barcode. AdapterRemoval uses the barcodes to map reads/read pairs to samples, at which point the barcodes are removed from the 5' of each read. After that, adapter trimming is carried out using per-sample query sequences generated by merging the opposing barcode with the adapter sequence, so that both are trimmed from the reads.
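
As a rough illustration of the steps described above, here is a minimal Python sketch. It is not AdapterRemoval's implementation: all function names are hypothetical, matching is exact, and two points are assumptions on my part, namely that the second column is found at the 5' end of mate 1 and the third at the 5' end of mate 2, and that "merging the opposing barcode with the adapter" means prepending the reverse complement of the other mate's barcode to the adapter.

```python
from __future__ import annotations

BARCODE_TABLE = """\
sample_1 ATGCGGA TGAATCT
sample_2 ATGGATT ATAGTGA
sample_7 CAAAACT TCGCTGC
"""


def parse_barcode_table(text: str) -> dict[str, tuple[str, str | None]]:
    """Return {sample: (barcode_1, barcode_2 or None)} from whitespace-separated rows."""
    samples = {}
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        second = fields[2] if len(fields) > 2 else None  # third column is optional
        samples[fields[0]] = (fields[1], second)
    return samples


def reverse_complement(seq: str) -> str:
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def demultiplex_pair(mate1: str, mate2: str, samples: dict):
    """Map a read pair to a sample by exact 5' barcode match and strip the barcodes."""
    for name, (bc1, bc2) in samples.items():
        if mate1.startswith(bc1) and (bc2 is None or mate2.startswith(bc2)):
            return name, mate1[len(bc1):], mate2[len(bc2 or ""):]
    return None, mate1, mate2  # unidentified pair, left untouched


def adapter_queries(bc1: str, bc2: str | None, adapter1: str, adapter2: str):
    """Build per-sample 3' query sequences: opposing barcode (assumed revcomp) + adapter."""
    query1 = (reverse_complement(bc2) if bc2 else "") + adapter1
    query2 = reverse_complement(bc1) + adapter2
    return query1, query2


if __name__ == "__main__":
    samples = parse_barcode_table(BARCODE_TABLE)
    # Hypothetical read pair carrying the sample_2 barcodes.
    print(demultiplex_pair("ATGGATT" + "ACGTACGT", "ATAGTGA" + "TTTTCCCC", samples))
```

Building the query from barcode plus adapter means a single trimming pass removes both from the 3' end, so no separate second pass over the reads is needed.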

There's a small example in the examples folder that you can run with

AdapterRemoval --file1 demux_1.fq --file2 demux_2.fq --basename output_demux --barcode-list barcodes.txt

It is also possible to just do the demultiplexing, if you want to do adapter trimming with a different trimmer. The combined barcode+adapter sequences are listed in the resulting settings files for each sample.

If I recall correctly, it is currently possible to demultiplex using P7 barcodes alone or P7 + P5 barcodes together, but not P5 barcodes by themselves.

See here for more information:
https://adapterremoval.readthedocs.io/en/latest/examples.html#demultiplexing-and-adapter-trimming

@jfy133

jfy133 commented Dec 17, 2020

Hi @MikkelSchubert ,

Ok, that does indeed sound like a possibility! I will investigate and see if we can get it to work as expected by the people who requested it, otherwise we will come back to you.

Cheers,


3 participants