Skip to content

Commit

Permalink
added option to set the same seed for all replicates in readcount tables
Browse files Browse the repository at this point in the history
  • Loading branch information
idfarbanecha committed Jul 8, 2021
1 parent c3b5094 commit 43ca2f6
Show file tree
Hide file tree
Showing 2 changed files with 18 additions and 2 deletions.
3 changes: 2 additions & 1 deletion config.yml
Original file line number Diff line number Diff line change
Expand Up @@ -30,9 +30,10 @@ difference_ratio: "23:31:46" #Rate of substitution:insertion:deletion.
#For PacBio the default values is 6:50:54
#For Nanopore it is 23:31:46
##Replicate params
sd_rep: 0 #Standard deviation for varying the number of reads per genome between replicates,
sd_rep_reads: 0 #Standard deviation for varying the number of reads per genome between replicates,
#if 0 keep the same number of reads per genome for each replicate
#else vary the read number between replicates by sampling from a normal distribution
same_dist_rep: False #If True all replicates have the same read distribution (for log-normal distribution)
seed: 1 #Set this parameter to a fixed number so data from random number generators is reproducible

#Assembly finder parameters
Expand Down
17 changes: 16 additions & 1 deletion mess/scripts/simulate_reads.rules
Original file line number Diff line number Diff line change
Expand Up @@ -37,7 +37,22 @@ def get_input_value(path):
raise Exception(f'{lastcol} is not recognized as an input to simulate reads')


def get_read_counts_seed(rep, rep2seed, general_seed):
"""
Function that returns a seed for calculating read counts
By default each replicate has a unique seed
If the param same_dist_rep is True, return a single seed for all replicates
"""
try:
same_dist_rep = config['same_dist_rep']
if same_dist_rep:
return general_seed
except KeyError:
return rep2seed[int(rep)]


input_val = get_input_value(config['input_table_path'])

rule create_read_counts_table:
'''
Rule for generating a table per sample with read counts for each genome
Expand All @@ -48,7 +63,7 @@ rule create_read_counts_table:
output: rc_table='readcounts-{community}-{rep}.tsv'

params: value=input_val,
seed=lambda wildcards: seed_dic[int(wildcards.rep)]
seed=lambda wildcards: get_read_counts_seed(wildcards.rep, seed_dic, config['seed'])

log: "logs/read_counts_table/{community}-{rep}.txt"

Expand Down

0 comments on commit 43ca2f6

Please sign in to comment.