output of mask optimization is not FASTA #65

Closed
PavelVesely opened this issue Feb 1, 2024 · 5 comments

Comments

@PavelVesely
Collaborator

Unlike the output of the masked superstring computation (global or local greedy), the output of kmercamel optimize is not a FASTA file, i.e., it is missing a header line. I think that, for consistency, the two should behave the same.

@karel-brinda
Collaborator

Agreed; the first line can always be discarded.

@OndrejSladky
Owner

> Unlike running the masked superstring computation (global or local greedy), the output of kmercamel optimize is not a FASTA file, i.e., missing a header line. I think that for consistency, it should be the same.

How do you run the program?
If the input file is a FASTA file (i.e., it contains a header), the result is also a FASTA file.
If the input is not a FASTA file (which I assume is your use case), then neither is the output, which seems justified to me.
I could change it, but then it would be difficult to preserve the input's FASTA header in the output.
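
To illustrate the behaviour described above (header in, header out; no header in, no header out), here is a minimal Python sketch. It is not KmerCamel's actual code: `split_header` and `passthrough` are hypothetical names, and `transform` stands in for whatever mask optimization is applied to the sequence.

```python
def split_header(text):
    """Split off a leading FASTA header line, if there is one.

    Returns (header, body); header is None when the input has no header.
    """
    if text.startswith(">"):
        header, _, body = text.partition("\n")
        return header, body
    return None, text


def passthrough(text, transform):
    """Apply `transform` to the sequence part only.

    The header is re-attached only if the input had one, so the output
    is FASTA exactly when the input is FASTA (hypothetical policy).
    """
    header, body = split_header(text)
    result = transform(body)
    return result if header is None else header + "\n" + result
```

With this policy the input header is carried over unchanged, which is the property that would be harder to keep if a fixed header were always emitted.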

@PavelVesely
Collaborator Author

> How do you run the program? If the input file is a fasta (i.e. it contains a header) the result is also a fasta. If the input is not a fasta (which I assume is your use-case) then it is not a fasta. Which seems justified to me. I could change it, but then it'll be difficult to maintain the same fasta header on input.

I see, then it makes sense, and it won't occur in practice. I run kmercamel optimize on a text file that has no header.
I'm closing this issue.

@PavelVesely
Collaborator Author

Reopening this issue, as optimizing runs (runs or runsapprox) behaves inconsistently: even though the input file contains a masked superstring with no header, the output does have a header. Optimizing ones or zeros doesn't add a header.

Here's a little experiment to verify:

$ head -c 50 <spneumoniae.S_global.k_9.d_na.M_default.maskedSuperstring.txt
GGCTCGACAAATTGATTAAGTACTCGTTGGTTACGTCGCTGTttatccCG

$ kmercamel/kmercamel optimize -k 9 -c -a ones -p spneumoniae.S_global.k_9.d_na.M_default.maskedSuperstring.txt -o spneu.k_9.ones.txt
$ head -c 50 <spneu.k_9.ones.txt
GGCTCGACAAATTGATTAAGTACTCGTTGGTTACGTCGCTGTTTaTccCG

$ kmercamel/kmercamel optimize -k 9 -c -a runs -p spneumoniae.S_global.k_9.d_na.M_default.maskedSuperstring.txt -o spneu.k_9.runs.txt
$ head -c 50 <spneu.k_9.runs.txt
> superstring
GGCTCGACAAATTGATTAAGTACTCGTTGGTTACGT
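
The same check can be scripted instead of eyeballing `head` output. A minimal Python sketch, assuming that "has a header" simply means the first byte of the file is `>`; `has_fasta_header` is a hypothetical helper, not part of kmercamel:

```python
def has_fasta_header(path):
    """Return True if the file's first byte is '>', i.e. it starts with a FASTA header."""
    with open(path, "rb") as handle:
        return handle.read(1) == b">"
```

Given the outputs shown above, calling it on spneu.k_9.ones.txt should return False and on spneu.k_9.runs.txt should return True.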

@PavelVesely
Collaborator Author

I have fixed this in PR #70.
