1 parent 5a83cf1, commit 5ad78b3
Showing 44 changed files with 102,546 additions and 31 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,22 @@ | ||
name: 🚀 | ||
on: | ||
  push: | ||
    branches: [ "main" ] | ||
  pull_request: | ||
    branches: [ "main" ] | ||
jobs: | ||
  build: | ||
    runs-on: ubuntu-latest | ||
    steps: | ||
      - uses: actions/checkout@v3 | ||
      - uses: r-lib/actions/setup-r@v2 | ||
      - name: Install dependencies on Ubuntu | ||
        run: | | ||
          sudo apt install -y libcurl4-openssl-dev libharfbuzz-dev libfribidi-dev | ||
      - name: Install dependencies | ||
        run: | | ||
          install.packages(c("devtools", "rextendr", "testthat")) | ||
        shell: Rscript {0} | ||
      - name: Tests | ||
        run: | | ||
          Rscript tests/tests.R |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,14 @@ | ||
Package: imputef | ||
Title: Imputing allele frequencies for individual polyploid genotype data and pools of individuals or population genotype data | ||
Version: 0.0.0.1 | ||
Authors@R: | ||
person("Jeff", "Paril", , "[email protected]", role = c("aut", "cre", "mai"), | ||
comment = c(ORCID = "0000-0002-5693-4123")) | ||
Description: Imputation of genotype data from sequencing of more than 2 sets of genomes, i.e. polyploid individuals, population samples, or pools of individuals. This library can also perform simple genotype data filtering prior to imputation. Two imputation methods are available: (1) mean value imputation which uses the arithmetic mean of the locus across non-missing pools (`?imputef::mvi`); (2) adaptive linkage-informed k-nearest neighbour imputation (`?imputef::aldknni`). This is an attempt to extend the [LD-kNNi method of Money et al., 2015, i.e. LinkImpute](https://doi.org/10.1534/g3.115.021667), which was an extension of the [kNN imputation of Troyanskaya et al., 2001](https://doi.org/10.1093/bioinformatics/17.6.520). Similar to LD-kNNi, LD is estimated using Pearson's product moment correlation across loci per pair of samples, but instead of computing this across all the loci, we divide the genome into windows which respect chromosomal/scaffold boundaries. We use Euclidean distance, which accommodates continuous allele frequencies, instead of the taxicab or Manhattan distance over genotype classes used in LD-kNNi. The adaptive behavior of our algorithm applies when the sparsity in the data is too high, resulting in: | ||
- a completely undefined correlation (LD) matrix, at which point we use all the loci to compute distances between samples, or | ||
- all k-nearest neighbours missing at the locus which needs to be imputed, in which case we increase `k` until one of the neighbours has data to be used for weighted imputation of the missing allele. | ||
License: `use_gpl3_license()` | ||
Encoding: UTF-8 | ||
Roxygen: list(markdown = TRUE) | ||
RoxygenNote: 7.2.3 | ||
Config/rextendr/version: 0.3.1 |
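Review note: the two imputation strategies named in the Description field can be sketched roughly as follows. This is an illustrative Rust sketch only, not the package's implementation. The function names (`mean_value_impute`, `euclidean_distance`, `adaptive_knn_impute`), the use of `NaN` to mark missing values, the samples-by-loci layout, and the inverse-distance weights are all assumptions made for the example. The LD-based selection of loci within windows is left out.

```rust
/// Mean value imputation for one locus (hypothetical helper): replace each
/// missing value (NaN) with the arithmetic mean of the non-missing pools.
/// If every pool is missing, the input is returned unchanged.
fn mean_value_impute(freqs: &[f64]) -> Vec<f64> {
    let observed: Vec<f64> = freqs.iter().copied().filter(|x| !x.is_nan()).collect();
    if observed.is_empty() {
        return freqs.to_vec();
    }
    let mean = observed.iter().sum::<f64>() / observed.len() as f64;
    freqs.iter().map(|&x| if x.is_nan() { mean } else { x }).collect()
}

/// Euclidean distance between two samples over the loci observed in both.
/// Returns None when the two samples share no observed locus.
fn euclidean_distance(a: &[f64], b: &[f64]) -> Option<f64> {
    let mut sum_sq = 0.0;
    let mut n_shared = 0usize;
    for (x, y) in a.iter().zip(b.iter()) {
        if !x.is_nan() && !y.is_nan() {
            sum_sq += (x - y).powi(2);
            n_shared += 1;
        }
    }
    if n_shared == 0 { None } else { Some(sum_sq.sqrt()) }
}

/// Adaptive k-nearest-neighbour imputation of one missing value.
/// `data` is samples x loci. Starting from `k` neighbours, k is increased
/// until at least one neighbour has data at `locus`. The imputed value is
/// an inverse-distance weighted mean (the weighting scheme is an assumption).
fn adaptive_knn_impute(data: &[Vec<f64>], target: usize, locus: usize, k: usize) -> Option<f64> {
    // Rank every other sample by its distance to the target sample.
    let mut neighbours: Vec<(usize, f64)> = data
        .iter()
        .enumerate()
        .filter(|(i, _)| *i != target)
        .filter_map(|(i, row)| euclidean_distance(&data[target], row).map(|d| (i, d)))
        .collect();
    if neighbours.is_empty() {
        return None;
    }
    neighbours.sort_by(|a, b| a.1.total_cmp(&b.1));

    let mut k = k.clamp(1, neighbours.len());
    while k <= neighbours.len() {
        // Keep only the neighbours that have data at the locus to impute.
        let usable: Vec<(f64, f64)> = neighbours[..k]
            .iter()
            .filter(|(i, _)| !data[*i][locus].is_nan())
            .map(|(i, d)| (data[*i][locus], 1.0 / (d + 1e-12)))
            .collect();
        if !usable.is_empty() {
            let weight_sum: f64 = usable.iter().map(|(_, w)| w).sum();
            let weighted: f64 = usable.iter().map(|(v, w)| v * w).sum();
            return Some(weighted / weight_sum);
        }
        k += 1; // all k neighbours are missing at this locus: widen the search
    }
    None
}
```

With `k = 1` and a nearest neighbour that is itself missing at the locus, the loop widens to `k = 2` and imputes from the next sample, which is the adaptive step described in the second bullet of the Description.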
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,5 @@ | ||
# Generated by roxygen2: do not edit by hand | ||
|
||
export(aldknni) | ||
export(mvi) | ||
useDynLib(imputef, .registration = TRUE) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Generated by extendr: Do not edit by hand | ||
|
||
# nolint start | ||
|
||
# | ||
# This file was created with the following call: | ||
# .Call("wrap__make_imputef_wrappers", use_symbols = TRUE, package_name = "imputef") | ||
|
||
#' @docType package | ||
#' @usage NULL | ||
#' @useDynLib imputef, .registration = TRUE | ||
NULL | ||
|
||
impute <- function(fname, imputation_method, min_coverage, min_allele_frequency, max_missingness_rate_per_locus, pool_sizes, min_depth_below_which_are_missing, max_depth_above_which_are_missing, frac_top_missing_pools, frac_top_missing_loci, window_size_bp, min_loci_per_window, min_loci_corr, max_pool_dist, optimise_for_thresholds, optimise_n_steps_corr, optimise_n_steps_dist, optimise_n_reps, n_threads, fname_out_prefix) .Call(wrap__impute, fname, imputation_method, min_coverage, min_allele_frequency, max_missingness_rate_per_locus, pool_sizes, min_depth_below_which_are_missing, max_depth_above_which_are_missing, frac_top_missing_pools, frac_top_missing_loci, window_size_bp, min_loci_per_window, min_loci_corr, max_pool_dist, optimise_for_thresholds, optimise_n_steps_corr, optimise_n_steps_dist, optimise_n_reps, n_threads, fname_out_prefix) | ||
|
||
|
||
# nolint end |
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,10 +1,10 @@ | ||
-# imputepolypools | ||
+# imputef | ||
|
||
Reduce genotype data sparsity through imputation of genotype classes or allele frequencies of individual polyploids or pools of individuals or populations. | ||
|
||
|**Build Status**|**License**| | ||
|:--------------:|:---------:| | ||
-| <a href="https://github.com/jeffersonfparil/imputepolypools/actions"><img src="https://github.com/jeffersonfparil/imputepolypools/actions/workflows/r.yml/badge.svg"></a> | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) | | ||
+| <a href="https://github.com/jeffersonfparil/imputef/actions"><img src="https://github.com/jeffersonfparil/imputef/actions/workflows/r.yml/badge.svg"></a> | [![License: GPL v3](https://img.shields.io/badge/License-GPLv3-blue.svg)](https://www.gnu.org/licenses/gpl-3.0) | | ||
|
||
## Manual installation and development tools | ||
|
||
|
@@ -13,9 +13,9 @@ Reduce genotype data sparsity through imputation of genotype classes or allele f | |
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh | ||
sh ./Miniconda3-latest-Linux-x86_64.sh | ||
# Download the repo | ||
-git clone https://jeffersonfparil:<API_KEY>@github.com/jeffersonfparil/imputepolypools.git some_branch | ||
+git clone https://jeffersonfparil:<API_KEY>@github.com/jeffersonfparil/imputef.git some_branch | ||
# Create the development environment | ||
-conda env create -n rustenv --file imputepolypools/tests/rustenv.yml | ||
+conda env create -n rustenv --file imputef/tests/rustenv.yml | ||
conda activate compare_genomes | ||
``` | ||
|
||
|
@@ -24,16 +24,16 @@ conda activate compare_genomes | |
```R | ||
usethis::use_git_config(user.name="USERNAME", user.email="[email protected]") | ||
credentials::set_github_pat() ### Enter access token | ||
-remotes::install_github("jeffersonfparil/imputepolypools") | ||
+remotes::install_github("jeffersonfparil/imputef") | ||
``` | ||
|
||
## Usage | ||
|
||
```R | ||
-?imputepolypools::mvi | ||
-?imputepolypools::aldknni | ||
-imputepolypools::mvi(fname="tests/test.vcf") | ||
-imputepolypools::aldknni(fname="tests/test.vcf") | ||
+?imputef::mvi | ||
+?imputef::aldknni | ||
+imputef::mvi(fname="tests/test.vcf") | ||
+imputef::aldknni(fname="tests/test.vcf") | ||
``` | ||
|
||
### Functions | ||
|
@@ -168,71 +168,71 @@ This is used for genotype classes, i.e., binned allele frequencies: $g = {{1 \ov | |
|
||
### Autotetraploid (Lucerne) mean absolute error | ||
|
||
-![mae_barplots](./res/eval/lucerne-Mean_absolute_error.svg) | ||
+![mae_barplots](./res/lucerne-Mean_absolute_error.svg) | ||
|
||
### Pool (Soybean pools) mean absolute error | ||
|
||
-![mae_barplots](./res/eval/soybean-Mean_absolute_error.svg) | ||
+![mae_barplots](./res/soybean-Mean_absolute_error.svg) | ||
|
||
### Diploid (Zucchini) mean absolute error | ||
|
||
-![mae_barplots](./res/eval/zucchini-Mean_absolute_error.svg) | ||
+![mae_barplots](./res/zucchini-Mean_absolute_error.svg) | ||
|
||
### Diploid (Apple) mean absolute error | ||
|
||
-![mae_barplots](./res/eval/apple-Mean_absolute_error.svg) | ||
+![mae_barplots](./res/apple-Mean_absolute_error.svg) | ||
|
||
### Diploid (Grape) mean absolute error | ||
|
||
-![mae_barplots](./res/eval/grape-Mean_absolute_error.svg) | ||
+![mae_barplots](./res/grape-Mean_absolute_error.svg) | ||
|
||
------------------------------------------------------------------------------------ | ||
------------------------------------------------------------------------------------ | ||
------------------------------------------------------------------------------------ | ||
|
||
### Autotetraploid (Lucerne) concordance of observed and imputed genotype classes | ||
|
||
-![concordance_genotype_classes_barplots](./res/eval/lucerne-Concordance.svg) | ||
+![concordance_genotype_classes_barplots](./res/lucerne-Concordance.svg) | ||
|
||
### Pool (Soybean pools) concordance of observed and imputed genotype classes | ||
|
||
-![concordance_genotype_classes_barplots](./res/eval/soybean-Concordance.svg) | ||
+![concordance_genotype_classes_barplots](./res/soybean-Concordance.svg) | ||
|
||
### Diploid (Zucchini) concordance of observed and imputed genotype classes | ||
|
||
-![concordance_genotype_classes_barplots](./res/eval/zucchini-Concordance.svg) | ||
+![concordance_genotype_classes_barplots](./res/zucchini-Concordance.svg) | ||
|
||
### Diploid (Apple) concordance of observed and imputed genotype classes | ||
|
||
-![concordance_genotype_classes_barplots](./res/eval/apple-Concordance.svg) | ||
+![concordance_genotype_classes_barplots](./res/apple-Concordance.svg) | ||
|
||
### Diploid (Grape) concordance of observed and imputed genotype classes | ||
|
||
-![concordance_genotype_classes_barplots](./res/eval/grape-Concordance.svg) | ||
+![concordance_genotype_classes_barplots](./res/grape-Concordance.svg) | ||
|
||
------------------------------------------------------------------------------------ | ||
------------------------------------------------------------------------------------ | ||
------------------------------------------------------------------------------------ | ||
|
||
### Autotetraploid (Lucerne) coefficient of determination | ||
|
||
-![r2_barplots](./res/eval/lucerne-Coefficient_of_determination.svg) | ||
+![r2_barplots](./res/lucerne-Coefficient_of_determination.svg) | ||
|
||
### Pool (Soybean pools) coefficient of determination | ||
|
||
-![r2_barplots](./res/eval/soybean-Coefficient_of_determination.svg) | ||
+![r2_barplots](./res/soybean-Coefficient_of_determination.svg) | ||
|
||
### Diploid (Zucchini) coefficient of determination | ||
|
||
-![r2_barplots](./res/eval/zucchini-Coefficient_of_determination.svg) | ||
+![r2_barplots](./res/zucchini-Coefficient_of_determination.svg) | ||
|
||
### Diploid (Apple) coefficient of determination | ||
|
||
-![r2_barplots](./res/eval/apple-Coefficient_of_determination.svg) | ||
+![r2_barplots](./res/apple-Coefficient_of_determination.svg) | ||
|
||
### Diploid (Grape) coefficient of determination | ||
|
||
-![r2_barplots](./res/eval/grape-Coefficient_of_determination.svg) | ||
+![r2_barplots](./res/grape-Coefficient_of_determination.svg) | ||
|
||
|
||
|
||
|