best practises config #254

alexander-e-f-smith · 2024-10-03T10:01:14Z

Hi. I'm updating our Arriba fusion-calling pipeline and reassessing our Arriba/STAR alignment configuration. Currently I use the parameters suggested in the Arriba manual, changing or adding only the Arriba options that I know relate to my pipeline and data. For the STAR alignment options, I haven't customized much beyond what you recommend in the manual, but I wondered whether you have example scenarios or knowledge of parameters that often need customization (most likely input-dependent). Thanks for any pointers/help!

suhrig · 2024-10-11T13:59:42Z

Apologies for the delay. The default parameters for Arriba and the customized parameters for STAR are recommended for most scenarios. I try to make the tool as straightforward to use as possible by auto-detecting and auto-tuning its parameters. You only need to adjust something if you're using special data types, such as targeted sequencing or long-read sequencing. Another thing that occasionally improves results is trimming. This is not handled by Arriba/STAR, and adapters may pose a problem for alignment if your read length exceeds the fragment size. In those cases, the reads should be trimmed before alignment. Let me know if any of this applies to you and I'm happy to provide further pointers to detailed instructions.
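To illustrate the trimming point: when the read length exceeds the fragment size, the sequencer reads through the insert into the adapter, and the adapter bases end up at the 3' end of the read. The following is a minimal sketch of that situation and of 3' adapter removal. It is a toy illustration only, not how any particular trimming tool works; the fragment sequence, read length, function name, and `min_overlap` value are assumptions made up for the example, and the adapter is the commonly used Illumina TruSeq prefix, which may not match your kit.

```python
# Prefix of the common Illumina TruSeq adapter (assumption: replace with your kit's adapter).
ADAPTER = "AGATCGGAAGAGC"


def trim_3prime_adapter(read: str, adapter: str = ADAPTER, min_overlap: int = 5) -> str:
    """Remove a 3' adapter from a read (exact matching only, no mismatches)."""
    # Case 1: the complete adapter occurs inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends partway through the adapter; try the longest prefix first.
    for overlap in range(min(len(adapter), len(read)) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:overlap]):
            return read[:-overlap]
    # No adapter detected: return the read unchanged.
    return read


# Hypothetical 16 bp fragment sequenced with 24 cycles: the last 8 bases are adapter.
fragment = "ACGTTGCAAGGCTTAC"
read_len = 24
raw_read = (fragment + ADAPTER)[:read_len]
trimmed = trim_3prime_adapter(raw_read)
print(raw_read, "->", trimmed)
```

In a production pipeline a dedicated trimming tool would be used instead, since it also handles sequencing errors, quality values, and paired-end reads.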

alexander-e-f-smith · 2024-10-16T09:43:24Z

Thanks for the response. This is part of a targeted approach (single-primer targeting/extension) with UMIs. We deduplicate and trim, amongst other things, prior to fusion calling, as our data type requires. We had validated this against many fusions (successfully!), but of note, for certain datasets Arriba shows less fusion support than STAR-Fusion. I also noted that STAR-Fusion now recommends not doing two-pass STAR alignment, so I wondered whether we need to revisit our pipeline/Arriba configuration too, as we are currently doing further pipeline update work.

suhrig · 2024-10-16T10:23:34Z

> but of note, for certain datasets Arriba shows less fusion support than STAR-Fusion.

This could be because Arriba counts only high-quality alignments in the columns split_reads1, split_reads2, and discordant_mates. Additional supporting reads that were discarded by a filter may be listed in the filters column, with the number of discarded reads in parentheses.
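When comparing support between callers, it can help to add the reads listed in the filters column back to the counted reads. The sketch below does this for a single row of Arriba's fusion table. It assumes that the filters field is a comma-separated list of entries of the form `name(count)` and that an empty field contains no such entries; the helper names and the example row are made up for illustration.

```python
import re

# Matches entries such as "duplicates(3)" (assumed layout of the filters column).
FILTER_ENTRY = re.compile(r"([^,()\s]+)\((\d+)\)")


def filtered_reads(filters_field: str) -> dict[str, int]:
    """Map each filter name to the number of reads it discarded."""
    return {name: int(count) for name, count in FILTER_ENTRY.findall(filters_field)}


def support_summary(row: dict[str, str]) -> dict[str, int]:
    """Compare the counted high-quality reads with the reads removed by filters."""
    counted = (
        int(row["split_reads1"])
        + int(row["split_reads2"])
        + int(row["discordant_mates"])
    )
    discarded = sum(filtered_reads(row["filters"]).values())
    return {"counted": counted, "discarded": discarded, "total": counted + discarded}


# Made-up row for illustration; the values do not come from real data.
example_row = {
    "split_reads1": "4",
    "split_reads2": "2",
    "discordant_mates": "1",
    "filters": "duplicates(3),mismatches(1)",
}
summary = support_summary(example_row)
print(summary)
```

Rows in this shape can be obtained by reading the tab-separated output file with `csv.DictReader(handle, delimiter="\t")`.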

> STAR-Fusion now recommends not doing two-pass STAR alignment

In my experience, two-pass alignment doesn't improve fusion calling. It also doesn't hurt, however. It simply adds CPU time, which is why I don't recommend it for fusion calling, and STAR-Fusion may recommend against it for the same reason. If you want to use the alignments for other purposes as well, then two-pass mode is probably a good idea, especially if you are interested in novel splice sites.
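If the same pipeline serves both use cases, the choice can be made a single switch. The sketch below toggles STAR's `--twopassMode Basic` option on an existing argument list. The helper name, the file names, and the shortened base command are placeholders; the remaining STAR parameters should be taken from the Arriba manual.

```python
def with_two_pass(star_args: list[str], enable: bool) -> list[str]:
    """Return a copy of a STAR argument list with two-pass mode switched on or off."""
    args: list[str] = []
    skip_value = False
    for arg in star_args:
        if skip_value:
            # Drop the value that followed an existing --twopassMode option.
            skip_value = False
            continue
        if arg == "--twopassMode":
            skip_value = True
            continue
        args.append(arg)
    if enable:
        args += ["--twopassMode", "Basic"]
    return args


# Placeholder command; add the STAR parameters recommended in the Arriba manual.
base = ["STAR", "--genomeDir", "STAR_index_dir",
        "--readFilesIn", "read1.fastq.gz", "read2.fastq.gz"]

fusion_only = with_two_pass(base, enable=False)   # fusion calling only: saves CPU time
shared_bam = with_two_pass(base, enable=True)     # alignments reused, e.g. for novel splice sites
print(" ".join(shared_bam))
```

The resulting list can be passed to `subprocess.run` as is.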
