Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cellect-genes.snakefile error (issue with the cellect_magma subworkflow) #81

Open
pascaltimshel opened this issue Jul 10, 2022 · 1 comment
Assignees

Comments

@pascaltimshel
Copy link
Collaborator

pascaltimshel commented Jul 10, 2022

Problem description

When running cellect-genes.snakefile the get_corrected_pvals does not get executed in cellect_magma sub workflow. This leads to the error No rule to produce...resid_correct_all.gsa.genes.pvals.out.

I speculate the error has to do with a config file being loaded again in the cellect_magma subworkflow.

Error message

(py3_PT) pnt@hydra:/opt/software/pnt/CELLECT> snakemake --use-conda -j -s cellect-genes.snakefile --configfile config-DUMMY.yml
Loaded config file from --configfile argument
Building DAG of jobs...
Executing subworkflow cellect_magma.
Loading default config file: config.yml
Building DAG of jobs...
MissingRuleException:
No rule to produce /XXX/CELLECT_out/CELLECT-MAGMA/precomputation/COPD_Sakornsakolpat2019/COPD_Sakornsakolpat2019.resid_correct_all.gsa.genes.pvals.out (if you use input functions make sure that they don't raise unexpected exceptions).

Temporary solution

The command snakemake --use-conda -j -s cellect-genes.snakefile --configfile config-DUMMY.yml will run without error when the subworkflow cellect_magma have been out-commented from cellect-genes.snakefile (https://github.com/perslab/CELLECT/blob/master/cellect-genes.snakefile#L92-L94) and get_corrected_pvals have been copied into the workflow:

# Outcomment 
# subworkflow cellect_magma:
#     snakefile:
#         "cellect-magma.snakefile"

# Copy rule from cellect_magma
rule get_corrected_pvals:
        '''
    Calculate gene-based p-values (corrected) from the corrected Z-stats. These are used only for cellect-genes, while prioritizing is done with Z stats. (results would be identical of course)
        '''
        input:
            "{BASE_OUTPUT_DIR}/precomputation/{gwas}/{gwas}.resid_correct_all.gsa.genes.out"
        output:
            "{BASE_OUTPUT_DIR}/precomputation/{gwas}/{gwas}.resid_correct_all.gsa.genes.pvals.out"
        conda:
            "envs/cellectpy3.yml"
        params:
            gwas_name = lambda wildcards: GWAS_SUMSTATS[wildcards.gwas]['id'],
            base_output_dir = "{BASE_OUTPUT_DIR}"
        script:
            "scripts/convert_zstat_to_pval.py"

# Single line modification to this rule to remove dependency of cellect_magma 
rule get_effector_genes:
        '''
        Extract top percentile of ESmu > 0.0 genes 
        Extract top n MAGMA genes
        Take intersection
        '''        
        input: 
                expand("{BASE_OUTPUT_DIR}/precomputation/{gwas}/{gwas}.resid_correct_all.gsa.genes.pvals.out", BASE_OUTPUT_DIR=BASE_OUTPUT_DIR,gwas=list(GWAS_SUMSTATS.keys()))    
            # cellect_magma(expand("{BASE_OUTPUT_DIR}/precomputation/{gwas}/{gwas}.resid_correct_all.gsa.genes.pvals.out", BASE_OUTPUT_DIR=BASE_OUTPUT_DIR,gwas=list(GWAS_SUMSTATS.keys()))) 
        output:
                "{CELLECT_GENES_OUTPUT_DIR}/out/{run_prefix}__{gwas}.effector_genes.txt"
        conda:
                "envs/cellectpy3.yml"
        log:
                "{CELLECT_GENES_OUTPUT_DIR}/logs/log.get_effector_genes_snake.{run_prefix}.{gwas}.txt"
        params: 
                magma_output_dir = BASE_OUTPUT_DIR,
                cellect_genes_output_dir = CELLECT_GENES_OUTPUT_DIR,
                specificity_matrix_file = lambda wildcards: SPECIFICITY_INPUT[wildcards.run_prefix]['path'],
                specificity_matrix_name = "{run_prefix}",
                gwas_name = lambda wildcards: GWAS_SUMSTATS[wildcards.gwas]['id'],
                n_genes_magma = config['N_GENES_MAGMA'],
                percentile_cutoff_esmu = config['PERCENTILE_CUTOFF_ESMU']
        script: 
                "scripts/get_effector_genes.py"

@pascaltimshel
Copy link
Collaborator Author

@JonThom or @Tobi1kenobi : do you have any ideas for how to solve this?

@pascaltimshel pascaltimshel changed the title cellect-genes.snakefile error: cellect_magma subworkflow cellect-genes.snakefile error (issue with the cellect_magma subworkflow) Jul 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants