Frameshift detection #199

Closed
hoelzer opened this issue Jan 21, 2022 · 4 comments

Comments

@hoelzer
Collaborator

hoelzer commented Jan 21, 2022

@replikation following up on the discussion on BC: I think it's reasonable to print a terminal warning when Omicron sequences are missing many Spike mutations, suggesting sup basecalling and/or switching to Nanopolish as options that might fix it.

Alternatively, we could implement a general check for frameshifts (FS). For example, this could be done via https://gitlab.com/s.fuchs/covsonar in two easy steps:

All reconstructed consensus sequences can be added to a covSonar database:

sonar.py add -f genomes.fasta --db mydb --cpus 8

Then, we can query this database via

sonar.py match --db mydb --only_frameshifts | awk 'BEGIN{FS=","};{print $1}' | grep -v accession > ids-frameshift.txt

which will give back all sequence IDs that have a frameshift.
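
For anyone who would rather do the filtering step inside a script, here is a minimal Python sketch of what the `awk | grep -v` part does. It assumes, as the command above implies, that the `sonar.py match` output is comma-separated with the sequence ID in the first column and a header line containing `accession`. The function name `frameshift_ids` is made up for illustration.

```python
import csv
import io


def frameshift_ids(match_output):
    """Return the first column of comma-separated text, minus the header.

    Assumption: the first column holds the sequence ID and the header
    line contains the word "accession" (taken from the awk/grep pipeline).
    """
    ids = []
    for row in csv.reader(io.StringIO(match_output)):
        if not row:
            # Skip blank lines (the shell pipeline would keep them as empty lines).
            continue
        first_field = row[0]
        if "accession" in first_field:
            # Mirrors `grep -v accession`, which drops the header line.
            continue
        ids.append(first_field)
    return ids
```

The returned list could then be written to `ids-frameshift.txt`, one ID per line, to match the shell version.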

We could then mark these sequences in the report and/or print a message telling the user to be aware of the frameshifts and to try basecalling with a higher accuracy model or switching to Nanopolish/Medaka, or at least to check whether the frameshift is real.
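
A rough sketch of what such a terminal message could look like, given a list of affected sequence IDs. The function name `frameshift_warning` and the exact wording are only suggestions, not something that exists in the pipeline.

```python
def frameshift_warning(ids):
    """Build a terminal warning for sequences with a frameshift.

    Returns an empty string when no sequence is affected, so the caller
    can simply skip printing. Name and wording are hypothetical.
    """
    if not ids:
        return ""
    lines = [f"WARNING: {len(ids)} consensus sequence(s) contain a frameshift:"]
    lines.extend(f"  - {seq_id}" for seq_id in ids)
    lines.append(
        "Consider basecalling with a higher accuracy model or switching to "
        "Nanopolish/Medaka, or check whether the frameshift is real."
    )
    return "\n".join(lines)
```

The caller would print the result only if it is non-empty, for example `msg = frameshift_warning(ids)` followed by `if msg: print(msg)`.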

This would also help people with subsequent analyses and, e.g., GISAID upload, ...

What do you think?

@hoelzer hoelzer added the enhancement New feature or request label Jan 21, 2022
@MarieLataretu
Collaborator

MarieLataretu commented Jan 21, 2022

(nextclade also reports the frameshifts in the frameShifts column)

@hoelzer
Collaborator Author

hoelzer commented Jan 21, 2022

> (nextclade also reports the frameshifts in the frameShifts column)

Ah, true, we already have that. ;) That would be even easier than introducing another tool.

@MarieLataretu
Collaborator

For the report I see two options:

@MarieLataretu
Collaborator

I think an additional column is the way to go.
@RaverJay can you add the frameShifts column to the report, if you all agree @replikation @hoelzer ?

It's at the amino acid level, but good enough for now.
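
As a starting point, pulling that column out of the Nextclade result could look roughly like this. The sketch assumes the tab-separated Nextclade output with `seqName` and `frameShifts` columns; the function name `frameshifts_by_sample` is hypothetical, and how the values get merged into the report is left open.

```python
import csv
import io


def frameshifts_by_sample(tsv_text):
    """Map each sequence name to its frameShifts entry.

    Assumption: tab-separated Nextclade output with `seqName` and
    `frameShifts` columns. Samples without a frameshift map to "".
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["seqName"]: (row.get("frameShifts") or "") for row in reader}
```

The resulting dictionary can be joined onto the report table by sample name, and any non-empty value flags a sample for a closer look.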
