setting default to fixed min_loci_corr=0.9 and max_pool_dist=0.1 as well as min_l_loci=20 and min_k_neighbours=5
jeffersonfparil committed Feb 13, 2024
1 parent 833d4f5 commit e2efc0e
Showing 10 changed files with 18,011 additions and 18,147 deletions.
16 changes: 8 additions & 8 deletions R/imputef.R
@@ -133,13 +133,13 @@ mvi = function(fname,
#' @param frac_top_missing_loci
#' Fraction of loci with the highest number of pools with missing data to be omitted. Set to zero if the input vcf has already been filtered and the loci beyond the depth thresholds have been set to missing; otherwise set to a decimal number between zero and one. [Default=0.0]
#' @param min_loci_corr
-#' Minimum correlation (Pearson's correlation) between the locus requiring imputation and other loci deemed to be in linkage with it. Ranges from 0.0 to 1.0. If using the default value, which is NA, then this threshold will be optimised to find the best value minimising imputation error. [Default=NA]
+#' Minimum correlation (Pearson's correlation) between the locus requiring imputation and other loci deemed to be in linkage with it. Ranges from 0.0 to 1.0. If set to NA, this threshold will be optimised to find the value that minimises imputation error. [Default=0.9]
#' @param max_pool_dist
-#' Maximum genetic distance (mean absolute difference in allele frequencies) between the pool or sample requiring imputation and pools or samples deemed to be the closest neighbours. Ranges from 0.0 to 1.0. If using the default value, which is NA, then this threshold will be optimised to find the best value minimising imputation error. [Default=NA]
+#' Maximum genetic distance (mean absolute difference in allele frequencies) between the pool or sample requiring imputation and pools or samples deemed to be the closest neighbours. Ranges from 0.0 to 1.0. If set to NA, this threshold will be optimised to find the value that minimises imputation error. [Default=0.1]
#' @param min_l_loci
-#' Minimum number of linked loci to be used in estimating genetic distances between the pool or sample requiring imputation and other pools or samples. Minimum value of 1. [Default=1]
+#' Minimum number of linked loci to be used in estimating genetic distances between the pool or sample requiring imputation and other pools or samples. Minimum value of 1. [Default=20]
#' @param min_k_neighbours
-#' Minimum number of k-nearest neighbours of the pool or sample requiring imputation. Minimum value of 1. [Default=1]
+#' Minimum number of k-nearest neighbours of the pool or sample requiring imputation. Minimum value of 1. [Default=5]
#' @param restrict_linked_loci_per_chromosome
#' Restrict the choice of linked loci to within the chromosome that the locus requiring imputation belongs to? [Default=TRUE]
#' @param n_reps
@@ -187,10 +187,10 @@ aldknni = function(fname,
max_depth_above_which_are_missing=1000000,
frac_top_missing_pools=0.0,
frac_top_missing_loci=0.0,
-min_loci_corr=NA,
-max_pool_dist=NA,
-min_l_loci=1,
-min_k_neighbours=1,
+min_loci_corr=0.9,
+max_pool_dist=0.1,
+min_l_loci=20,
+min_k_neighbours=5,
restrict_linked_loci_per_chromosome=TRUE,
n_reps=20,
n_threads=2,
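To make the four parameters above concrete, here is a minimal base-R sketch of how such thresholds could interact for a single missing allele frequency. It is not the `imputef` implementation: the function name `fn_sketch_ldknn`, the fallback to the top-ranked loci and pools when too few pass a threshold, and the `1 - distance` weighting are all assumptions made for illustration, and the data are simulated.

```r
# Illustrative sketch only; NOT the imputef implementation.
# fn_sketch_ldknn and its fallback/weighting rules are hypothetical.
fn_sketch_ldknn = function(mat_freqs, idx_locus, idx_pool,
                           min_loci_corr=0.9, max_pool_dist=0.1,
                           min_l_loci=20, min_k_neighbours=5) {
    idx_other_pools = setdiff(seq_len(ncol(mat_freqs)), idx_pool)
    # (1) Linked loci: Pearson correlation with the target locus across the other pools
    vec_corr = as.vector(cor(t(mat_freqs[, idx_other_pools, drop=FALSE]),
                             mat_freqs[idx_locus, idx_other_pools]))
    vec_corr[is.na(vec_corr)] = -Inf
    vec_corr[idx_locus] = -Inf # never use the target locus itself
    idx_linked = which(vec_corr >= min_loci_corr)
    if (length(idx_linked) < min_l_loci) {
        # Assumed fallback: take the most correlated loci to reach min_l_loci
        n = min(min_l_loci, nrow(mat_freqs) - 1)
        idx_linked = head(order(vec_corr, decreasing=TRUE), n)
    }
    # (2) Genetic distance: mean absolute difference in allele frequencies at the linked loci
    vec_dist = colMeans(abs(mat_freqs[idx_linked, idx_other_pools, drop=FALSE] -
                            mat_freqs[idx_linked, idx_pool]))
    idx_near = which(vec_dist <= max_pool_dist)
    if (length(idx_near) < min_k_neighbours) {
        # Assumed fallback: take the closest pools to reach min_k_neighbours
        idx_near = head(order(vec_dist), min_k_neighbours)
    }
    # (3) Impute as a weighted mean of the neighbours (weights = 1 - distance, an assumption)
    vec_weights = 1 - vec_dist[idx_near]
    if (sum(vec_weights) == 0) vec_weights = rep(1, length(idx_near))
    vec_q = mat_freqs[idx_locus, idx_other_pools[idx_near]]
    list(q=sum(vec_weights * vec_q) / sum(vec_weights),
         idx_linked=idx_linked,
         idx_neighbours=idx_other_pools[idx_near])
}

# Simulated allele frequencies (loci x pools) sharing a per-pool signal
set.seed(42)
n_loci = 50; n_pools = 15
vec_signal = runif(n_pools, min=0.1, max=0.9)
mat_freqs = matrix(rep(vec_signal, each=n_loci), n_loci, n_pools) +
            matrix(rnorm(n_loci * n_pools, sd=0.03), n_loci, n_pools)
mat_freqs = pmin(pmax(mat_freqs, 0), 1)
mat_freqs[3, 7] = NA # the entry to impute
out = fn_sketch_ldknn(mat_freqs, idx_locus=3, idx_pool=7)
```

With the previous defaults of `min_l_loci=1` and `min_k_neighbours=1`, a sketch like this could rest an estimate on a single locus and a single pool; the larger minimums trade some locality for stability.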
6,749 changes: 3,366 additions & 3,383 deletions res/grape-Concordance.svg
6,745 changes: 3,364 additions & 3,381 deletions res/grape-Mean_absolute_error.svg
3,783 changes: 1,883 additions & 1,900 deletions res/lucerne-Concordance.svg
3,993 changes: 1,988 additions & 2,005 deletions res/lucerne-Concordance_high_confidence_data.svg
3,719 changes: 1,851 additions & 1,868 deletions res/lucerne-Mean_absolute_error.svg
4,025 changes: 2,004 additions & 2,021 deletions res/lucerne-Mean_absolute_error_high_confidence_data.svg
6 changes: 3 additions & 3 deletions res/perf_functions.R
@@ -326,12 +326,10 @@ fn_test_imputation = function(vcf, mat_genotypes, mat_idx_high_conf_data, ploidy
fname_out_prefix=paste0("MVI-maf", maf, "-missing_rate", missing_rate, "-", rand_number_id),
n_threads=n_threads)
duration_mvi = difftime(Sys.time(), time_ini, units="mins")
-### (2) Adaptive LD-kNN imputation using fixed min_loci_corr and max_pool_dist
+### (2) Adaptive LD-kNN imputation using default fixed min_loci_corr and max_pool_dist at 0.9 and 0.1, respectively (where min_l_loci=20 and min_k_neighbours=5)
time_ini = Sys.time()
fname_out_aldknni_fixed = aldknni(fname=list_sim_missing$fname_vcf,
fname_out_prefix=paste0("AFIXED-maf", maf, "-missing_rate", missing_rate, "-", rand_number_id),
-min_loci_corr=0.9,
-max_pool_dist=0.1,
restrict_linked_loci_per_chromosome=restrict_linked_loci_per_chromosome,
n_threads=n_threads)
duration_aldknni_fixed = difftime(Sys.time(), time_ini, units="mins")
@@ -341,6 +339,8 @@ fn_test_imputation = function(vcf, mat_genotypes, mat_idx_high_conf_data, ploidy
fname_out_prefix=paste0("AOPTIM-maf", maf, "-missing_rate", missing_rate, "-", rand_number_id),
min_loci_corr=NA,
max_pool_dist=NA,
+min_l_loci=1,
+min_k_neighbours=1,
restrict_linked_loci_per_chromosome=restrict_linked_loci_per_chromosome,
n_threads=n_threads)
duration_aldknni_optim = difftime(Sys.time(), time_ini, units="mins")
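The optimised run above passes `NA` so that the thresholds are tuned to minimise imputation error. One way such tuning could work is sketched below: score a grid of candidate `max_pool_dist` values against known entries and keep the one with the lowest mean absolute error. This is an assumption about the general idea, not a description of what `aldknni` does internally; `fn_toy_impute` is a deliberately simplified, hypothetical imputer and the data are simulated.

```r
# Illustrative sketch only; the search strategy and fn_toy_impute are hypothetical.
set.seed(123)
n_loci = 40; n_pools = 10
vec_signal = runif(n_pools, min=0.1, max=0.9)
mat_freqs = matrix(rep(vec_signal, each=n_loci), n_loci, n_pools) +
            matrix(rnorm(n_loci * n_pools, sd=0.05), n_loci, n_pools)
mat_freqs = pmin(pmax(mat_freqs, 0), 1)

# Toy imputer: mean of the target locus over pools within max_pool_dist of the target pool.
# The target entry itself is never read, so it behaves as if it were missing.
fn_toy_impute = function(mat, i, j, max_pool_dist) {
    idx_other = setdiff(seq_len(ncol(mat)), j)
    vec_dist = colMeans(abs(mat[-i, idx_other, drop=FALSE] - mat[-i, j]))
    idx_near = idx_other[vec_dist <= max_pool_dist]
    if (length(idx_near) == 0) idx_near = idx_other[which.min(vec_dist)]
    mean(mat[i, idx_near])
}

# Treat 30 known entries as missing, impute them under each candidate threshold,
# and keep the threshold with the lowest mean absolute error
mat_idx_masked = cbind(sample(n_loci, 30, replace=TRUE), sample(n_pools, 30, replace=TRUE))
vec_candidates = c(0.05, 0.1, 0.2, 0.5, 1.0)
vec_mae = sapply(vec_candidates, function(d) {
    vec_err = apply(mat_idx_masked, 1, function(ij) {
        abs(fn_toy_impute(mat_freqs, ij[1], ij[2], d) - mat_freqs[ij[1], ij[2]])
    })
    mean(vec_err)
})
max_pool_dist_best = vec_candidates[which.min(vec_mae)]
```

A search of this kind repeats the imputation once per candidate, which is presumably why the benchmark times the fixed and optimised runs separately.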
3,607 changes: 1,795 additions & 1,812 deletions res/soybean-Concordance.svg
3,515 changes: 1,749 additions & 1,766 deletions res/soybean-Mean_absolute_error.svg
