check for newlines when concat fastas #52

Closed
hoelzer opened this issue Feb 17, 2022 · 6 comments · Fixed by #53
@hoelzer
Member

hoelzer commented Feb 17, 2022

Command error:
  [E::fai_build_core] Different line length in sequence 'NC_008066.1'
  [faidx] Could not build fai index db.fa.gz.fai

This happened to me while using phix and providing two mtDNA FASTAs from NCBI. For DCS it was working though.

So I assume the problem is that the integrated phiX FASTA has different line lengths than the two mtDNA FASTAs from NCBI.

@hoelzer
Member Author

hoelzer commented Feb 17, 2022

Hm, but actually the phix.fasta seems fine w/ the same line lengths... not sure

@hoelzer
Member Author

hoelzer commented Feb 17, 2022

figured it out:

>NC_053523.1 Gallus gallus isolate bGalGal1 mitochondrion, complete sequence, whole genome shotgun sequence
>NC_008066.1 Chlorocebus sabaeus mitochondrion, complete genome
CACGTTCCTCTTAAATAAGACATCTCGATG>gi|9626372|ref|NC_001422.1| Enterobacteria phage phiX174 sensu lato, complete genome

maybe it's good to add a newline to the FASTAs before cat!
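
A minimal sketch of that idea in Python, assuming uncompressed FASTA files that are small enough to read into memory. The function name `concat_fastas` and the file names in the usage note are hypothetical and not taken from the pipeline:

```python
from pathlib import Path


def concat_fastas(paths, out_path):
    """Concatenate FASTA files, making sure each one ends with a newline."""
    with open(out_path, "wb") as out:
        for path in paths:
            data = Path(path).read_bytes()
            out.write(data)
            # Without this, the next file's '>' header would be glued
            # onto the last sequence line of this file.
            if data and not data.endswith(b"\n"):
                out.write(b"\n")
```

Something like `concat_fastas(["custom.fa", "phix.fa"], "db.fa")` would then keep each header on its own line, regardless of how the input files end.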

@hoelzer
Member Author

hoelzer commented Feb 17, 2022

DCS likely worked because it's not concatenated as the last file ;)

MarieLataretu self-assigned this on Feb 18, 2022
@MarieLataretu
Collaborator

AAAha, now I get it. There was no newline at the end of the custom FASTA, right?

@hoelzer
Member Author

hoelzer commented Feb 18, 2022

> AAAha, now I get it. There was no newline at the end of the custom FASTA, right?

yes! And that caused samtools to fail obviously.

(sorry if this was unclear, did this yesterday while netflixing ;) )

So basically we could implement it so that an additional newline is appended to each custom FASTA a user provides, just in case.
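
One way this could look, as a rough sketch only: the helper below checks the last byte of a file and appends a newline only when it is missing, so files that already end correctly are left untouched. It assumes an uncompressed FASTA (it would not be appropriate for a bgzip-compressed file), and `ensure_trailing_newline` is a hypothetical name, not an existing function in the pipeline:

```python
import os


def ensure_trailing_newline(path):
    """Append a newline to `path` if its last byte is not one.

    Returns True if a newline was added, False otherwise.
    """
    if os.path.getsize(path) == 0:
        return False  # empty file: nothing to terminate
    with open(path, "rb+") as handle:
        handle.seek(-1, os.SEEK_END)  # jump to the last byte
        if handle.read(1) == b"\n":
            return False
        handle.seek(0, os.SEEK_END)  # explicitly position at end of file
        handle.write(b"\n")
        return True
```

Checking first instead of always appending avoids introducing an extra blank line between records when the file is already newline-terminated.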

@MarieLataretu
Collaborator

yeah, we check for bgzip compression anyway, so the newline check can happen there as well!

MarieLataretu changed the title from "Index fails when FASTAs have different line lengths" to "check for newlines when concat fastas" on Feb 24, 2022