Add distinct minimizer heatmap for Kraken2 #1380
I didn't see it before, but this should solve #1379 |
I could have sworn I added Black CI to MultiQC, but I can't see it now. Could you please run it? Also, please add some test data to https://github.com/ewels/MultiQC_TestData/tree/master/data/modules/kraken for this. Otherwise, LGTM from a quick read of the code 👍🏻 Will review again once we have test data. |
@maxibor Testing with my data, nice plots! (now I need to figure them out) BTW, the unclassified disappeared from the top 5 taxa plot. Was it intended? |
Another issue is that it crashes when the reports are mixed (I mean, std and with minimizers in the same folder). Sorry, I was being picky with my tests for #1379 😄 |
I've added test data with this PR MultiQC/test-data#187 |
Merge conflicts after merging #1347 - will take a look now. |
ok, conflicts fixed and code working. I can see that the TestData sample
Thanks all! Phil |
@aeu79 - this warning was unrelated to the changes in this PR. It's a warning from matplotlib triggered by the core MultiQC code. It has only just started appearing because of an update on their end. I suspect it only showed for you here because you were running with more samples, triggering MultiQC to generate flat image bargraphs instead of interactive ones. Anyway, it should now be fixed in |
Thanks @maxibor! I brought back my check to avoid repeated notices about mixed reports, which had gotten lost. I'm running tests on the data in MultiQC_TestData > I'll take a look now, but generally speaking it's good to also try running on just this data as then you'll come across these same issues as me and can probably save some time 😉 |
Ok added an extra loop in c2faaeb to strip out any heatmap samples with zero values. This fixes the report for me and I now get a heatmap with two cells. I also tweaked the heatmap config as I figure it doesn't need to be square (the x- and y-categories are not the same) and I also specified that the xcats are not sample names, so that they are not affected by hiding / renaming / highlighting in the report. I'm happy with this so will merge as soon as the tests are passing - thank you! 👍🏻 (especially for speedy response after the long delay). Please check that you're ok with my changes and let me know ASAP if not as we can fix in a follow-up PR. Phil |
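The zero-value filtering described in the comment above could look roughly like the following. This is a minimal sketch, not the code from c2faaeb: the function name `drop_empty_samples`, the nested-dict layout, and the sample and taxon names are assumptions made for illustration. The `square` and `xcats_samples` keys are meant to mirror the heatmap config tweaks mentioned in the comment.

```python
from typing import Dict

HeatmapData = Dict[str, Dict[str, float]]


def drop_empty_samples(data: HeatmapData) -> HeatmapData:
    """Return a new dict without samples whose values are all zero or missing."""
    return {
        sample: values
        for sample, values in data.items()
        if any(v for v in values.values())
    }


# Hypothetical input: sample name -> taxon -> minimizer duplication rate.
# A standard (non-minimizer) report contributes only zeros.
heatmap_data = {
    "sample_minimizer": {"Escherichia coli": 4.0, "Homo sapiens": 1.5},
    "sample_standard": {"Escherichia coli": 0, "Homo sapiens": 0},
}

filtered = drop_empty_samples(heatmap_data)

# Assumed heatmap plot config reflecting the tweaks described in the comment.
pconfig = {
    "square": False,         # x- and y-categories differ, so no need for a square plot
    "xcats_samples": False,  # x-categories are not sample names
}
```

Filtering before plotting means a folder with mixed report types still yields a heatmap, restricted to the samples that actually carry minimizer data.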
@aeu79 - note that your updated regex from #1383 is now in this PR also, so should hopefully address your comment: #1383 (comment) Shout if there's anything missing still. Phil |
Since version 2.1.0, Kraken2 has integrated the distinct minimizer counting ability of KrakenUniq.
This new feature comes with a slightly altered report format (see Kraken2 docs).
This PR adds a heatmap of the duplication rate of the minimizers (i.e. kmers), which is very useful for detecting read stacking and false positive hits.
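To make the altered report format concrete, here is a minimal sketch of reading both report variants and deriving a duplication rate per taxon. It assumes the minimizer report inserts two extra columns (total minimizers and estimated distinct minimizers) after the third column, giving 8 tab-separated fields instead of 6, and it assumes the duplication rate is total divided by distinct. The function names and the example lines are hypothetical and are not taken from the module code in this PR.

```python
from typing import Dict, Optional


def parse_report_line(line: str) -> Optional[Dict]:
    """Parse one Kraken2 report line, with or without minimizer columns."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) == 8:
        # Report with minimizer data (assumed column layout)
        pct, clade, direct, n_min, n_distinct, rank, taxid, name = fields
    elif len(fields) == 6:
        # Standard report: no minimizer columns
        pct, clade, direct, rank, taxid, name = fields
        n_min = n_distinct = None
    else:
        return None  # Unrecognised line, skip rather than crash
    return {
        "percent": float(pct),
        "clade_reads": int(clade),
        "direct_reads": int(direct),
        "minimizers": int(n_min) if n_min is not None else None,
        "distinct_minimizers": int(n_distinct) if n_distinct is not None else None,
        "rank": rank.strip(),
        "taxid": taxid.strip(),
        "name": name.strip(),
    }


def minimizer_duplication(row: Dict) -> Optional[float]:
    """Total minimizers divided by distinct minimizers (assumed definition)."""
    if row["minimizers"] is None or not row["distinct_minimizers"]:
        return None
    return row["minimizers"] / row["distinct_minimizers"]


# Made-up example lines, one per report variant
with_minimizers = parse_report_line(" 50.00\t100\t10\t400\t100\tS\t562\t    Escherichia coli\n")
standard = parse_report_line(" 50.00\t100\t10\tS\t562\t    Escherichia coli\n")

print(minimizer_duplication(with_minimizers))
print(minimizer_duplication(standard))
```

Under this definition, a high ratio means the same few minimizers are hit over and over, which is the pattern expected from read stacking on a small region rather than even coverage of a genome. Returning `None` for standard reports lets both variants live in the same folder.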
PR checklist:
CHANGELOG.md has been updated