smudgeplot.py hetmers not producing kmerpairs_text.smu (.trim.ktab only) #179

Open
gemmacol opened this issue Dec 3, 2024 · 6 comments
Labels
bug Something isn't working genomescope included

Comments

@gemmacol

gemmacol commented Dec 3, 2024

The hetmers step completes, but does not produce a .smu file.

smudgeplot.py hetmers -L 12 -t 4 -o kmerpairs -tmp $TMP --verbose Fastk_Table

Running smudgeplot v0.4.0dev
Task: hetmers
Calling: hetmers (PloidyPlot kmer pair search) -okmerpairs -e12 -T4 -v -P/nesi/nobackup/ga02470/acanthoxyla/github/new-illumina-smudgeplots/AXG/fastk2/tmp Fastk_Table

The input table is untrimmed and not symmetric

Trimming k-mers in table with count < 12
Output table ./.trim.ktab already exists, continue? yes

Making trimmed table symmetric

Starting to count covariant pairs

Done!

This outputs a file called .symx.ktab, but not the .smu file that I was expecting.

I thought we might not have enough coverage for the -L 12 option, so I also tried -L 6, with the same result.

Any suggestions why I am not generating the .smu file?

Options used for prior step:
FastK -v -t4 -k21 -M16 -T4 -P$TMP -c AXG_novaseq_R*.fq -NFastk_Table

@KamilSJaron
Owner

Hi Gemma, the expected log looks like this:

Running smudgeplot v0.4.0dev
Task: hetmers
Calling: hetmers (PloidyPlot kmer pair search)  -otest -e10 -T4 -v Fastk_Table

  The input table is untrimmed and not symmetric

  Trimming k-mers in table with count < 10

  Making trimmed table symmetric

  Starting to count covariant pairs

  Count complete, plotting

  About to save stuff

  Saving stuff

Done!

Yours is missing

  Count complete, plotting

  About to save stuff

  Saving stuff

Done!
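
For anyone comparing their own run against the expected log, here is a minimal Python sketch of that comparison. The stage messages are copied from the expected log above; the function name `missing_stages` and the idea of checking by substring are illustrative assumptions, not part of smudgeplot. A table that is already trimmed or symmetric may legitimately print different opening messages, so treat the result as a hint only.

```python
# Stage messages copied from the expected hetmers log quoted in this thread.
EXPECTED_STAGES = [
    "The input table is untrimmed and not symmetric",
    "Trimming k-mers in table with count <",
    "Making trimmed table symmetric",
    "Starting to count covariant pairs",
    "Count complete, plotting",
    "About to save stuff",
    "Saving stuff",
    "Done!",
]


def missing_stages(log_text, expected=EXPECTED_STAGES):
    """Return the expected stage messages that never appear in log_text."""
    return [stage for stage in expected if stage not in log_text]


# The log reported at the top of this issue.
reported_log = """\
The input table is untrimmed and not symmetric
Trimming k-mers in table with count < 12
Output table ./.trim.ktab already exists, continue? yes
Making trimmed table symmetric
Starting to count covariant pairs
Done!
"""

print(missing_stages(reported_log))
# ['Count complete, plotting', 'About to save stuff', 'Saving stuff']
```

The three missing messages are the ones after the pair counting, which suggests the run stopped somewhere between counting and saving.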

I am not sure why, but I suspect it's something wrong with the k-mer database. If you make a k-mer histogram (that should take just seconds), does it look sane? Something like this should do the job...

Histex -G Fastk_Table > kmer.hist
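
A rough way to eyeball the histogram without plotting is sketched below in Python. It assumes kmer.hist holds two whitespace-separated integer columns (coverage, number of k-mers), which is the layout GenomeScope reads; the function names and the toy numbers are made up for illustration and are not taken from this dataset. It looks for the first trough (where the error k-mer tail ends) and the tallest bin after it, which can then be compared with the -L threshold.

```python
def parse_hist(text):
    """Parse 'coverage count' pairs, skipping blank or malformed lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue
        try:
            rows.append((int(fields[0]), int(fields[1])))
        except ValueError:
            continue
    return rows


def first_trough_and_peak(rows):
    """Return (trough_coverage, peak_coverage), or None if counts never rise.

    The trough is the bin just before the counts first increase; the peak is
    the tallest bin from that increase onwards.
    """
    rows = sorted(rows)
    for i in range(1, len(rows)):
        if rows[i][1] > rows[i - 1][1]:
            trough = rows[i - 1][0]
            peak = max(rows[i:], key=lambda row: row[1])[0]
            return trough, peak
    return None


# Toy histogram (invented values), not the data from this issue.
toy = "1 1000\n2 400\n3 100\n4 50\n5 80\n6 150\n7 300\n8 350\n9 280\n10 120\n"
print(first_trough_and_peak(parse_hist(toy)))  # (4, 8)

# With a real file:
# with open("kmer.hist") as handle:
#     print(first_trough_and_peak(parse_hist(handle.read())))
```

If the function returns None, or the peak sits below the chosen -L value, most genomic k-mers would probably be trimmed away before the pair search.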

@gemmacol
Author

gemmacol commented Dec 6, 2024

Hi Kamil,

I tried your suggestion: Histex -G Fastk_Table > kmer.hist
Then plotted kmer.hist with genomescope.
The model didn't fit well, but as far as I can tell the histogram looks "normal".

[image: linear_plot]

Do you have any further suggestions what I could try, to generate the .smu file?

So far I have these intermediate files in the dir:

Fastk_Table.hist
Fastk_Table.ktab

.Fastk_Table.ktab.2
.Fastk_Table.ktab.4
.Fastk_Table.ktab.1
.Fastk_Table.ktab.3

..symx.ktab.1
.symx.ktab

@KamilSJaron added the labels bug (Something isn't working) and genomescope included on Dec 6, 2024
@KamilSJaron
Owner

@gemmacol That's puzzling. Gene and I can't think of a good reason why this would happen. Did you by any chance have anything streamed to stderr? @thegenemyers thinks there should be something; it should not crash silently.
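
One way to make sure nothing on stderr gets lost (for example inside a SLURM job) is to write both streams to separate files and record the exit status. The Python sketch below does that with the standard `subprocess` module. The helper names and output file names are hypothetical; the command list is the one from the original report, with "tmp" standing in for $TMP. Since smudgeplot.py wraps the hetmers binary, the wrapper's exit status may not reflect how the binary itself ended, so this is only a first check.

```python
import signal
import subprocess


def run_and_capture(cmd, stdout_path, stderr_path):
    """Run cmd, writing stdout and stderr to separate files; return the exit code."""
    with open(stdout_path, "w") as out, open(stderr_path, "w") as err:
        completed = subprocess.run(cmd, stdout=out, stderr=err)
    return completed.returncode


def describe_exit(code):
    """Turn a subprocess return code into a short description."""
    if code == 0:
        return "exited normally"
    if code < 0:
        # On POSIX, a negative code -N means the child was killed by signal N.
        try:
            name = signal.Signals(-code).name
        except ValueError:
            name = f"signal {-code}"
        return f"killed by {name}"
    return f"exited with status {code}"


# Command from the report; "tmp" is a placeholder for the $TMP directory.
HETMERS_CMD = [
    "smudgeplot.py", "hetmers", "-L", "12", "-t", "4",
    "-o", "kmerpairs", "-tmp", "tmp", "--verbose", "Fastk_Table",
]

# Uncomment where smudgeplot is installed:
# code = run_and_capture(HETMERS_CMD, "hetmers.out", "hetmers.err")
# print(describe_exit(code))
```

A "killed by SIGKILL" result would point towards the scheduler or the out-of-memory killer rather than the program itself.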

Would you be able to upload the .ktab for us, as well as all the hidden .Fastk_Table.ktab.* files? With the information we have, we are unable to figure out exactly what is wrong.
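
Going by the file listing earlier in the thread, the table consists of a visible Fastk_Table.ktab plus hidden .Fastk_Table.ktab.1 to .4 files. A small Python sketch for gathering them into one archive follows; the function names and the archive name are hypothetical, and the resulting file may still be too large for a GitHub attachment.

```python
import glob
import os
import tarfile


def table_files(directory, name):
    """Return the stub NAME.ktab (if present) plus hidden parts .NAME.ktab.*"""
    stub = os.path.join(directory, f"{name}.ktab")
    # The pattern starts with a literal dot, so glob does match the hidden files.
    parts = sorted(glob.glob(os.path.join(directory, f".{name}.ktab.*")))
    return ([stub] if os.path.exists(stub) else []) + parts


def bundle(files, archive_path):
    """Write the given files into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        for path in files:
            tar.add(path, arcname=os.path.basename(path))
    return archive_path


# Example, run from the directory that holds the table:
# files = table_files(".", "Fastk_Table")
# print(files)
# bundle(files, "Fastk_Table_full.tar.gz")
```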

@gemmacol
Author

gemmacol commented Dec 8, 2024

I have re-created the issue this time with a single slurm script and here provide the two files you asked for as well as the script and the log files gathered after re-running it:

  1. This is the .symx.ktab : symx.zip

  2. This is FastK_Table.ktab (the hidden files were too large to upload; is this what you were asking for?) : FastK_Table.zip

  3. here is the exact script that was used:
    AXG.52229199.zip

And here a screenshot of the resulting files in time-reversed order (.smu still missing):
[image: screenshot of the file listing]

Many thanks,
Gemma

@KamilSJaron
Owner

I am afraid the hidden files are needed for the full k-mer table. @thegenemyers?

@gemmacol
Author

gemmacol commented Dec 9, 2024

Here are the hidden files as a google drive link: https://drive.google.com/drive/folders/12keGxYN5jkY8t--Rksq_sZ3FrcXXnDuj?usp=sharing
