PAPI

PAPI is a tool for inferring the admixture proportions and admixture times for the parents of an admixed sample given unphased local ancestry tracts.

Installation

We recommend using Anaconda to create a virtual environment from the included file papi_spec-file.txt:

conda create --name papi --file papi_spec-file.txt

Usage

Clone the repository and navigate to the papi directory:

cd src
source activate papi
usage: inference.py [-h] --inputfile INPUTFILE --ind IND [--tracefile [TRACEFILE]] --outfile OUTFILE [--mode [MODE]] [--typ [TYP]] [--err]

optional arguments:
  -h, --help            show this help message and exit
  --inputfile INPUTFILE, -i INPUTFILE
                        input tracts file
  --ind IND, -ind IND   individual on which to run papi
  --tracefile [TRACEFILE], -t [TRACEFILE]
                        optional argument, used to store trace output when run in mcmc mode. If not provided, MCMC solver will find MAP estimate
  --outfile OUTFILE, -o OUTFILE
                        required default output file
  --mode [MODE], -m [MODE]
                        inference mode-'pymc' or 'scipy-optimize'
  --typ [TYP], -typ [TYP]
                        model to use-'bino','hmm', or 'full'
  --err, -err

Input

An example input tracts file is provided in examples/tracts.txt. It has the following structure:

[[('10', 0.004568105), ('00', 0.46384804), ('10', 42.318381695), ('00', 27.1541), ... ], ...]
[[('10', 33.363797840000004), ('11', 18.6777), ('10', 8.2969), ('11', 14.64119999999999), ... ], ...]
...
...
...

Each line represents the tracts of one individual as a nested list of lists; each inner list corresponds to a chromosome. Each tuple, e.g. ('10', 0.004568105), represents a single tract. The first element gives the tract's ancestry states (in this case, heterozygous for the two ancestry states 1 and 0), while the second element gives the length of the tract in centiMorgans.
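As a rough illustration of this structure, the sketch below parses one line of a tracts file and sums the tract lengths per ancestry state. It assumes each line in the actual file is a valid Python literal (the "..." in the excerpt above is only an elision in this README). The function names and the tract values in example_line are made up for illustration and are not part of PAPI.

```python
import ast
from collections import defaultdict


def parse_individual(line):
    """Parse one line of a tracts file into a list of chromosomes.

    Assumes the line is a valid Python literal: a list of chromosomes,
    each a list of (state, length_in_cM) tuples.
    """
    return ast.literal_eval(line.strip())


def length_by_state(chromosomes):
    """Sum tract lengths (in cM) for each ancestry-state label."""
    totals = defaultdict(float)
    for chromosome in chromosomes:
        for state, length_cm in chromosome:
            totals[state] += length_cm
    return dict(totals)


# Hypothetical individual with two chromosomes (values invented for illustration).
example_line = "[[('10', 1.5), ('00', 2.0), ('10', 0.5)], [('11', 3.0), ('10', 1.0)]]"

chromosomes = parse_individual(example_line)
totals = length_by_state(chromosomes)
print(len(chromosomes), totals)
```

To apply this to the example file, one could read examples/tracts.txt line by line and pass each line to parse_individual.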

Example

Using the included tracts file, the simplest way of running PAPI is as follows:

python src/inference.py -i examples/tracts.txt -ind 1 -o test -m scipy-optimize -typ full

This writes GD estimates for the first individual (the first line of the tracts file) to test.scipy.map. The parameters under which these tracts were simulated can be found in the corresponding headers.txt file. The hyperparameter tau can optionally be specified with -tau; it defaults to 7 if unspecified.
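To run the same command for several individuals, a small wrapper along the following lines may be convenient. This is only a sketch: it uses the flags shown in the command above, assumes -ind is the 1-based line number in the tracts file (as the example suggests), and assumes the file contains the individuals requested. The helper names and the per-individual output prefix (test_1, test_2, ...) are choices made here, not PAPI conventions.

```python
import subprocess


def papi_command(ind, inputfile="examples/tracts.txt", outfile="test",
                 mode="scipy-optimize", typ="full"):
    """Build the inference.py command line for one individual."""
    return [
        "python", "src/inference.py",
        "-i", inputfile,
        "-ind", str(ind),
        "-o", f"{outfile}_{ind}",  # hypothetical per-individual output prefix
        "-m", mode,
        "-typ", typ,
    ]


def run_all(individuals, dry_run=True):
    """Print each command, and execute it unless dry_run is True."""
    commands = [papi_command(ind) for ind in individuals]
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)
    return commands


# Dry run: only prints the commands for the first three individuals.
commands = run_all([1, 2, 3], dry_run=True)
```

Calling run_all with dry_run=False from the repository root, with the papi environment active, would execute the commands one after another.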

Output

The output is a text file containing a single line that looks like this:

5.899999999999999689e-01 4.199999999999999845e-01 1.000000000000000000e+00 7.000000000000000000e+00

The first two floats are the admixture proportion estimates, while the latter two correspond to the admixture time estimates.
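A minimal sketch of reading this output back into Python is shown below. It assumes the file holds exactly four whitespace-separated floats as in the example line above. The function name and dictionary keys are hypothetical, and which value belongs to which parent is not specified here.

```python
def parse_papi_output(text):
    """Split a PAPI output line into proportion and time estimates.

    Assumes four whitespace-separated floats: two admixture proportions
    followed by two admixture times.
    """
    values = [float(token) for token in text.split()]
    if len(values) != 4:
        raise ValueError(f"expected 4 values, found {len(values)}")
    return {
        "proportions": tuple(values[:2]),
        "times": tuple(values[2:]),
    }


# The example output line from this README.
example = ("5.899999999999999689e-01 4.199999999999999845e-01 "
           "1.000000000000000000e+00 7.000000000000000000e+00")

result = parse_papi_output(example)
print(result["proportions"], result["times"])
```

For a real run, the text could be read with open("test.scipy.map").read() before being passed to parse_papi_output.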
