Multiple instances of same modules all show same data #1539

Closed
epruesse opened this issue Aug 25, 2021 · 6 comments

@epruesse
Contributor

Description of bug

Hi @ewels - am I the only one who runs the same tool multiple times and wants to see the logs visualised?

I am getting odd results when module_order repeats the bowtie2 or fastqc modules: even though the logs differ, every instance of a module shows the exact same data. This looks like a list or dict shared between module instances, maybe something that broke while trying to improve parsing speed? I tried 1.10 and 1.11; both show the issue.

File that triggers the error

No response

MultiQC Error log

No response

@ewels
Member

ewels commented Aug 30, 2021

Hi @epruesse,

Could you please upload a config and some example files so that I can replicate the problem? It works fine for me :)

Phil

@epruesse
Contributor Author

epruesse commented Sep 10, 2021

Here is data for two fastqc runs:
multiqc_mve_issue_1539.zip

Run like this:

multiqc -f -c conf1.yaml before after

Whichever folder is parsed first, before or after, gets rendered for both instances of fastqc. I may be misreading the reports, but could this be some over-aggressive caching?

The conf1.yaml is just this:

run_modules:
- fastqc
sp: {}
module_order:
- fastqc:
    name: FastQC (before)
    path_filters: before/*_fastqc.zip
- fastqc:
    name: FastQC (after)
    path_filters: after/*_fastqc.zip

@epruesse
Contributor Author

Poking around the code, my current hypothesis is that MultiQC uses the filename as a key in a dict somewhere, instead of the full path...

@epruesse
Contributor Author

Argh. path_filters needs to be a list, not a string. With a string, both module instances were reading all files.

  • Allow path_filters to be a string as well, or bail out on a malformed config
  • Warn if multiple log files are loaded for the same module and sample, with the second one not getting loaded?
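
The second suggestion could look roughly like the sketch below. It is only an illustration: `add_sample`, the `data` dict and the logger name are hypothetical and not taken from the MultiQC code base. It assumes parsed logs are stored in a dict keyed by sample name and that the first file loaded for a sample wins.

```python
import logging

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("path_filters_example")


def add_sample(data, sample_name, parsed, source):
    """Store parsed log data under sample_name, warning on duplicates.

    Returns True if the data was stored, False if the sample name was
    already present (the earlier entry is kept and the new one is skipped).
    """
    if sample_name in data:
        logger.warning(
            "Duplicate sample name '%s' found in %s; keeping the first log loaded",
            sample_name,
            source,
        )
        return False
    data[sample_name] = parsed
    return True


data = {}
add_sample(data, "s1", {"total_sequences": 100}, "before/s1_fastqc.zip")
# Same sample name from a second folder: a warning is logged and the entry is skipped.
add_sample(data, "s1", {"total_sequences": 80}, "after/s1_fastqc.zip")
```

With a warning like this, the before/after mix-up would at least have been visible in the log output.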

@ewels
Member

ewels commented Jan 27, 2022

Oof, that's a nasty one! Yeah, it will be iterating over the string and checking whether the file names match any of the individual characters. As one of the characters is *, it will match all files and not filter anything.
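
The effect can be reproduced in a few lines of plain Python. This is a minimal sketch rather than the actual MultiQC code: `passes_filter` is a hypothetical helper, and it assumes the filter is applied with `fnmatch`-style glob matching, one pattern at a time.

```python
from fnmatch import fnmatch


def passes_filter(path, path_filters):
    """Return True if path matches any pattern yielded by path_filters.

    Hypothetical helper: iterating over a str yields single characters,
    so each character ends up being treated as its own glob pattern.
    """
    return any(fnmatch(path, pattern) for pattern in path_filters)


as_string = "before/*_fastqc.zip"    # what the config in this issue provides
as_list = ["before/*_fastqc.zip"]    # what the filter expects

# The lone "*" character matches every path, so nothing is filtered out.
print(passes_filter("after/s1_fastqc.zip", as_string))  # True
# As a list, the whole pattern is matched and the "after" file is rejected.
print(passes_filter("after/s1_fastqc.zip", as_list))    # False
```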

I'll add in a check for the config variable type and convert strings into lists.
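
Such a check might look roughly like this. It is a sketch only and not the code from the eventual commit: `normalize_path_filters` is a hypothetical name, and raising a `TypeError` for other types is an assumption about how a malformed config could be handled.

```python
def normalize_path_filters(value):
    """Return path_filters as a list of pattern strings.

    Hypothetical helper: a bare string is wrapped in a one-element list,
    a list or tuple of strings is passed through, anything else is rejected.
    """
    if value is None:
        return []
    if isinstance(value, str):
        return [value]
    if isinstance(value, (list, tuple)) and all(isinstance(v, str) for v in value):
        return list(value)
    raise TypeError(
        f"path_filters must be a string or a list of strings, got {value!r}"
    )


print(normalize_path_filters("before/*_fastqc.zip"))    # ['before/*_fastqc.zip']
print(normalize_path_filters(["after/*_fastqc.zip"]))   # ['after/*_fastqc.zip']
```

Wrapping the string instead of rejecting it keeps existing configs working, at the cost of silently accepting a form the documentation may not describe.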

ewels closed this as completed in 26565e1 on Jan 27, 2022
@ewels
Member

ewels commented Jan 27, 2022

OK, updated in 26565e1. That should mean that your original config now works as expected.

Thanks for reporting and digging into the underlying cause!
