best practises config #254

Open
alexander-e-f-smith opened this issue Oct 3, 2024 · 3 comments

Comments

@alexander-e-f-smith

Hi. I'm updating our Arriba fusion pipeline and reassessing our Arriba/STAR alignment configuration. Currently I use the parameters suggested in the Arriba manual, changing or adding only the Arriba options that I know relate specifically to my pipeline and data. For the STAR alignment options, I haven't customized much beyond what you recommend in the manual, but I wondered whether you know of scenarios or parameters that often need customization (most likely input-dependent). Thanks for any pointers/help!

@suhrig
Owner

suhrig commented Oct 11, 2024

Apologies for the delay. The default parameters for Arriba and the customized parameters for STAR are what is recommended for most scenarios. I try to make the tool as straightforward to use as possible by having it auto-detect and auto-tune its parameters. You only need to adjust something if you're using special data types, such as targeted sequencing or long-read sequencing. Another thing that occasionally improves results is trimming. This is not handled by Arriba/STAR, and adapters may pose a problem for alignment if your read length exceeds the fragment size. In those cases, the reads should be trimmed before alignment. Let me know if any of this applies to you and I'm happy to provide further pointers to detailed instructions.
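To illustrate why read-through matters: when the fragment is shorter than the read, the 3' end of the read runs into the adapter, and those bases cannot align to the genome. The sketch below is a minimal, exact-match illustration of 3' adapter removal, not a replacement for a dedicated trimming tool. The function name `trim_adapter`, the `min_overlap` parameter, and the example adapter prefix are assumptions chosen for illustration; the adapter must be replaced with whatever your library prep actually uses.

```python
def trim_adapter(read: str, adapter: str, min_overlap: int = 3) -> str:
    """Remove a 3' adapter from a read using exact matching only.

    Hypothetical helper for illustration; real trimmers also tolerate
    mismatches and use base qualities.
    """
    # Case 1: the complete adapter occurs somewhere inside the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]

    # Case 2: the read ends with a prefix of the adapter (partial read-through).
    # Try the longest possible overlap first.
    longest = min(len(adapter), len(read)) - 1
    for k in range(longest, max(min_overlap, 1) - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: the fragment was at least as long as the read.
    return read


# Example adapter prefix (an assumption; use the sequence for your kit).
ADAPTER = "AGATCGGAAGAGC"

full = trim_adapter("ACGTACGT" + ADAPTER + "TTT", ADAPTER)
partial = trim_adapter("ACGTACGT" + "AGATC", ADAPTER)
clean = trim_adapter("ACGTACGT", ADAPTER)
```

Here `full` and `partial` both come back as the 8-base insert, and `clean` is returned unchanged.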

@alexander-e-f-smith
Author

Thanks for the response. This is part of a targeted assay (single-primer targeting/extension) with UMIs. We deduplicate and trim, among other things, prior to fusion calling, as our data type requires. We have validated this successfully against many fusions, but of note, for certain datasets Arriba shows less fusion support than STAR-Fusion. I also noticed that STAR-Fusion now recommends against two-pass STAR alignment, so I wondered whether we need to revisit our pipeline/Arriba configuration too, as we are currently doing further pipeline update work.

@suhrig
Owner

suhrig commented Oct 16, 2024

> but of note, for certain datasets Arriba shows less fusion support than STAR-Fusion.

This could be because Arriba only counts high-quality alignments in the columns split_reads1/2 and discordant_mates. Additional supporting reads that were discarded may be listed in the filters column, where the number of removed reads is given in parentheses after each filter name.
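As a rough way to compare the two numbers, the following sketch tallies the reads that passed filtering against those recorded in the filters column for one row of Arriba's fusion table. It assumes the filters field is a comma-separated list such as `duplicates(3),mismatches(1)` and is `.` when empty; the function names and the example row are hypothetical. Note that filtered reads include things like duplicates, so the sum is only an upper bound on extra support, not a like-for-like match for another caller's counts.

```python
import re


def filtered_read_count(filters_field: str) -> int:
    """Sum the numbers in parentheses in a filters field.

    Assumed format: "filter_a(3),filter_b(1)"; "." or "" yields 0.
    """
    return sum(int(n) for n in re.findall(r"\((\d+)\)", filters_field))


def support_summary(row: dict) -> dict:
    """Return passed vs. filtered supporting-read counts for one fusion row."""
    passed = (
        int(row["split_reads1"])
        + int(row["split_reads2"])
        + int(row["discordant_mates"])
    )
    return {"passed": passed, "filtered": filtered_read_count(row["filters"])}


# Hypothetical row, as it might come from csv.DictReader(..., delimiter="\t").
example_row = {
    "split_reads1": "5",
    "split_reads2": "2",
    "discordant_mates": "1",
    "filters": "duplicates(3),mismatches(1)",
}
summary = support_summary(example_row)
```

For the example row, `summary` holds 8 passed and 4 filtered reads.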

> STAR-Fusion now recommends not doing two-pass STAR alignment

In my experience, two-pass alignment doesn't improve fusion calling. It doesn't hurt either; it simply adds CPU time, which is why I don't recommend it for fusion calling, and STAR-Fusion may recommend against it for the same reason. If you want to use the alignments for other purposes as well, then two-pass mode is probably a good idea, especially if you are interested in novel splice sites.
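In a pipeline, this choice can be reduced to a single switch. The sketch below builds a STAR argument list and toggles only `--twopassMode` (`Basic` for two-pass, `None` for single-pass). The function name and file paths are hypothetical, and the other STAR parameters recommended in the Arriba manual are deliberately omitted here and would need to be appended.

```python
def star_command(genome_dir: str, reads: list[str], two_pass: bool = False) -> list[str]:
    """Build a minimal STAR argument list (illustrative, not a full config)."""
    cmd = ["STAR", "--genomeDir", genome_dir, "--readFilesIn", *reads]
    # Two-pass mode helps discover novel splice junctions but costs CPU time.
    cmd += ["--twopassMode", "Basic" if two_pass else "None"]
    return cmd


# Fusion calling only: single pass.
fusion_cmd = star_command("star_index", ["r1.fastq", "r2.fastq"])
# Alignments reused for splice-junction analysis: two-pass.
splice_cmd = star_command("star_index", ["r1.fastq", "r2.fastq"], two_pass=True)
```

The resulting list can be passed to `subprocess.run` once the remaining alignment and chimeric-read options are added.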
