This repository has been archived by the owner on Jan 16, 2019. It is now read-only.
Skylar Wyant edited this page May 20, 2016 · 12 revisions

This method estimates thetas (estimators of genetic diversity) and related neutrality statistics. Please see the ANGSD documentation for full details on this method.


Note:

This method requires output from the Site Frequency Spectrum method to run.

Please calculate a site frequency spectrum before estimating Thetas.


Basic Usage

To run this method, use the following command:

angsd-wrapper Thetas Thetas_Config

where Thetas_Config is the full path to the configuration file for Thetas.
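
Because the command expects a full path, it can help to check the path before launching the run. The following is a minimal sketch, not part of ANGSD-wrapper: the function name `check_config_path` and the example path are assumptions made for illustration.

```shell
# Hypothetical helper (not part of angsd-wrapper): check the configuration
# path before handing it to the wrapper.
check_config_path() {
    config="$1"
    case "$config" in
        /*) ;;  # starts with "/", so it is a full (absolute) path
        *)  echo "error: '$config' is not a full path" >&2
            return 1 ;;
    esac
    if [ ! -f "$config" ]; then
        echo "error: '$config' does not exist" >&2
        return 1
    fi
    return 0
}

# Usage (the path is a placeholder):
# check_config_path /path/to/Thetas_Config && angsd-wrapper Thetas /path/to/Thetas_Config
```

The check only guards against the two most likely mistakes, a relative path and a missing file; it does not validate the contents of the configuration file.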

Input files

All inputs should be specified in Thetas_Config.

Common Variables

This method uses Common_Config. The variables it uses are listed below:

| Variable | Function |
| --- | --- |
| SAMPLE_LIST (GROUP_SAMPLES on dev) | A list of samples to be used in calculations |
| SAMPLE_INBREEDING (GROUP_INBREEDING on dev) | A list of inbreeding coefficients, where each line corresponds to a line in SAMPLE_LIST (GROUP_SAMPLES on dev) |
| ANC_SEQ | Path to ancestral sequence |
| PROJECT | Name given to all outputs in ANGSD-wrapper |
| SCRATCH | Place to store files; the full path is SCRATCH/PROJECT/Thetas |
| REGIONS | Limit the scope of ANGSD-wrapper to certain regions |
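
Two of the relationships described above lend themselves to a quick check: the output directory is composed as SCRATCH/PROJECT/Thetas, and the inbreeding list must have one line per line of the sample list. The sketch below illustrates both. The function names `thetas_outdir` and `lists_match` are hypothetical and are not provided by ANGSD-wrapper.

```shell
# Illustrative helpers; the function names are hypothetical.

# Compose the output directory described above: SCRATCH/PROJECT/Thetas.
# $1 = value of SCRATCH, $2 = value of PROJECT
thetas_outdir() {
    printf '%s/%s/Thetas\n' "$1" "$2"
}

# Succeed only if the two lists have the same number of lines, since each
# inbreeding coefficient corresponds to one line of the sample list.
# Note: wc -l counts newline characters, so both files should end with a newline.
# $1 = sample list file, $2 = inbreeding coefficients file
lists_match() {
    n_samples=$(( $(wc -l < "$1") ))
    n_inbreeding=$(( $(wc -l < "$2") ))
    [ "$n_samples" -eq "$n_inbreeding" ]
}

# Usage (paths are placeholders):
# thetas_outdir /path/to/scratch MyProject
# lists_match samples.txt inbreeding.txt || echo "lists differ in length" >&2
```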

Method-Specific Variables

This variable is specific to this method:

| Variable | Function |
| --- | --- |
| PEST | The site frequency spectrum file; this should be the DerivedSFS file |
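
Since this method cannot run without a site frequency spectrum, a pre-flight check on the file that PEST points to can catch a missing prerequisite early. This is a sketch only; `pest_ready` is a hypothetical name, and the check confirms only that the file exists and is non-empty, not that it is a valid SFS.

```shell
# Hypothetical pre-flight check: the file named by PEST must exist and be non-empty.
# $1 = path to the site frequency spectrum file
pest_ready() {
    if [ -s "$1" ]; then
        return 0
    fi
    echo "error: SFS file '$1' is missing or empty; calculate the site frequency spectrum first" >&2
    return 1
}

# Usage (the path is a placeholder):
# pest_ready /path/to/DerivedSFS_file || exit 1
```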

Method Parameters

The parameters for this method can be adjusted as necessary. The defaults have been chosen to work well in the general case:

| Parameter | Function |
| --- | --- |
| DO_SAF | Creates a site frequency spectrum |
| UNIQUE_ONLY | Use uniquely mapped reads only |
| MIN_BASEQUAL | Minimum base quality score |
| BAQ | Adjust Q scores around indels |
| MIN_IND1 | Minimum number of individuals needed to use this site |
| GT_LIKELIHOOD | Estimates genotype likelihoods |
| MIN_MAPQ | Minimum base mapping quality |
| N_CORES | Number of cores to use; please do not set above the limits of your system |
| DO_MAJORMINOR | Estimate major/minor alleles |
| DO_MAF | Calculate per-site frequencies |
| OVERRIDE | If true, will recalculate files that already exist |
| SLIDING_WINDOW | Enable sliding window analysis |
| WIN | Window size for sliding window analysis |
| STEP | Step size for sliding window analysis |
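
To see how WIN and STEP interact, the sketch below lists the windows a generic sliding-window scheme would produce over a region. It illustrates the general idea only: the function name `list_windows` is hypothetical, the values are made up, and the exact window boundaries ANGSD uses may differ.

```shell
# Generic sliding-window illustration; ANGSD's own window boundaries may differ.
# Prints "start end" (1-based, inclusive) for every full window of size WIN
# that fits in a region of the given length, advancing by STEP each time.
# $1 = region length, $2 = WIN, $3 = STEP
list_windows() {
    length="$1"; win="$2"; step="$3"
    if [ "$step" -le 0 ]; then
        echo "error: STEP must be positive" >&2
        return 1
    fi
    start=1
    while [ $((start + win - 1)) -le "$length" ]; do
        echo "$start $((start + win - 1))"
        start=$((start + step))
    done
}

# Example with made-up values: a 100 bp region, WIN=50, STEP=25
# list_windows 100 50 25
# prints:
# 1 50
# 26 75
# 51 100
```

When STEP is smaller than WIN, consecutive windows overlap; when STEP equals WIN, the windows tile the region without overlap.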

Output files

| Naming Scheme | Contents |
| --- | --- |
| PROJECT_Diversity.arg | Details of arguments |
| PROJECT_Diversity.mafs.gz | Minor allele frequencies |
| PROJECT_Diversity.thetas.gz | Diversity statistics |
| PROJECT_Diversity.thetas.gz.bin | Binary index of diversity statistics |
| PROJECT_Diversity.thetas.gz.idx | Index of diversity statistics |
| PROJECT_Diversity.thetas.gz.pestPG | Final Thetas estimations |
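
After a run, the naming scheme above can be used to confirm that every expected file was written. The sketch below is one way to do this, assuming the files sit directly in the output directory; the function names `expected_outputs` and `missing_outputs` are hypothetical.

```shell
# Hypothetical helper: print the output file names from the naming scheme above.
# $1 = value of PROJECT
expected_outputs() {
    project="$1"
    for suffix in arg mafs.gz thetas.gz thetas.gz.bin thetas.gz.idx thetas.gz.pestPG; do
        echo "${project}_Diversity.${suffix}"
    done
}

# Print the expected files that are missing from a directory.
# $1 = output directory, $2 = value of PROJECT
missing_outputs() {
    outdir="$1"; project="$2"
    expected_outputs "$project" | while read -r name; do
        [ -e "$outdir/$name" ] || echo "$name"
    done
}

# Usage (paths are placeholders):
# missing_outputs /path/to/scratch/MyProject/Thetas MyProject
```

An empty result from `missing_outputs` means all six files are present.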

Visualization

PROJECT.thetas.graph.me can be visualized with the Shiny graphing interface. A web browser with a graphical user interface is required.