Improve autocomplete #44

Closed
simone-pignotti opened this issue Oct 11, 2023 · 4 comments

Comments

@simone-pignotti

Hi @gbouras13, I have encountered a case where multiple BLAST hits with a very low e-value (1e-180) missed the start of the target by just a few residues, so they were discarded and an arbitrary gene (the "nearest" option) was used instead.
I think that in those cases it would be best to use the ORF that overlaps the best BLAST hit, and even to make this the default when --autocomplete is not specified. What do you think?

@simone-pignotti
Author

At least that's my interpretation based on:

    for i in range(0, len(blast_df.qseq)):
        if blast_df["qseq"][i][0] in ["M", "V", "L"] and (
            blast_df["sstart"][i] == 1
        ):
            reorient_sequence(blast_df, input, out_file, gene, i)

But correct me if I am wrong; I can take a closer look at the logs.
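
To illustrate that reading, here is a minimal sketch with made-up hits. It uses plain dicts rather than the real blast_df, and `passes_strict_filter` is a hypothetical helper name, not a dnaapler function. It only shows how a strong hit that begins a few residues into the target would fail the condition quoted above:

```python
# Hypothetical hits; the keys mirror the blast_df columns in the snippet above.
hits = [
    {"qseq": "MKRLTAAG", "sstart": 4, "evalue": 1e-180},  # strong hit, but starts at residue 4
    {"qseq": "VSEQAKLT", "sstart": 3, "evalue": 1e-175},  # strong hit, but starts at residue 3
]


def passes_strict_filter(hit):
    """Same condition as the quoted loop: start-codon residue and sstart == 1."""
    return hit["qseq"][0] in ("M", "V", "L") and hit["sstart"] == 1


accepted = [h for h in hits if passes_strict_filter(h)]
# Both hits are rejected despite their e-values, so a fallback would be used.
```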

@gbouras13
Owner

I chose to have a strict match of the start codon in dnaapler to ensure the rotation does not interrupt a CDS, but I agree this would be a good feature.

I will aim to add this in the next week or two. It will require some refactoring and so isn't trivial but shouldn't be too hard.

George

@gbouras13
Owner

Hi @simone-pignotti ,

I've made your suggestion the default within all BLAST-based commands in dnaapler (i.e. it will take the top row of the blastx table, and if the matching alignment doesn't start with a start codon, it will run pyrodigal to find the CDS with the most overlap).
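
A rough sketch of the "most overlap" selection described above. This is not the actual dnaapler code: the function names and coordinates are hypothetical, and the CDS intervals are hard-coded here instead of coming from pyrodigal. Intervals are assumed to be 1-based and inclusive.

```python
def overlap(a_start, a_end, b_start, b_end):
    """Number of positions shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a_end, b_end) - max(a_start, b_start) + 1)


def best_overlapping_cds(hit, cds_list):
    """Return the (start, end) CDS that overlaps the hit most, or None if none does."""
    best = max(
        cds_list,
        key=lambda cds: overlap(hit[0], hit[1], cds[0], cds[1]),
        default=None,
    )
    if best is None or overlap(hit[0], hit[1], best[0], best[1]) == 0:
        return None
    return best


# Hypothetical coordinates: the top hit starts a few bases inside the second CDS.
top_hit = (1210, 2500)
predicted_cds = [(100, 900), (1198, 2520), (2600, 3100)]
chosen = best_overlapping_cds(top_hit, predicted_cds)
```

In this example `chosen` is the CDS that fully contains the hit, so the rotation would begin at that CDS's own start rather than partway through it.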

Available in v0.4.0

George

@simone-pignotti
Author

Wonderful, thank you so much @gbouras13!
