\documentclass[10pt, letterpaper]{article}
\usepackage{comment}
\pagestyle{plain}
\usepackage[text={6in,8.5in},centering]{geometry}
%PUT YOUR MACROS HERE
\usepackage{changepage} % for indentation of sections
\usepackage{lipsum}
\usepackage{hyperref} %for urls
\usepackage{graphicx} %for includegraphics
\usepackage[table]{xcolor} %for table line color alteration
\usepackage{enumitem} %fancy enumerated lists
\usepackage{wrapfig} %wrap text around figures
\usepackage[leftcaption]{sidecap} % side captions
\sidecaptionvpos{figure}{c} %position side caption
\usepackage[colorinlistoftodos]{todonotes} % comments in margins
\reversemarginpar %comments on left
\setlength{\marginparwidth}{2.5cm} %width of comments
\usepackage[round,authoryear]{natbib} %biblio format!
%%%%%%%%%% EXACT 1in MARGINS %%%%%%% %%
%\setlength{\textwidth}{6.5in} %% %%
%\setlength{\oddsidemargin}{0in} %% (It is recommended that you %%
%\setlength{\evensidemargin}{0in} %% not change these parameters, %%
%\setlength{\textheight}{8.5in} %% at the risk of having your %%
%\setlength{\topmargin}{0in} %% proposal dismissed on the basis%%
%\setlength{\headheight}{0in} %% of incorrect formatting!!!) %%
%\setlength{\headsep}{0in} %% %%
%\setlength{\footskip}{.5in} %% %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% %%
\newcommand{\required}[1]{\section*{\hfil #1\hfil}} %%
\renewcommand{\refname}{\hfil References Cited\hfil} %%
\bibliographystyle{abbrvnat} %%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\newcommand{\jri}[1]{\todo[size=\scriptsize, color=red]{#1}}
\newcommand{\kjg}[1]{\todo[size=\scriptsize, color=blue]{#1}}
\title{NSF PRFB - Plant Genome}
\author{Kimberly J. Gilbert}
\date{October 2015}
\begin{document}
\section{Summary}
How does the genetic architecture of quantitative traits change as a result of demography and selection during the maize domestication bottleneck, and during the further bottleneck that some maize populations underwent as they spread from Central America to South America?
If we compare maize and teosinte (which are locally adapted in their respective pops):\\
\indent - do we see evidence of many genes of small effect or few genes of large effect underlying important traits?\\
\indent - do these differences match our predictions based on their differing demographic histories?
Broadly relevant because these results could inform how diversity is maintained in crops, with potentially large long-term consequences for fitness and adaptation under climate change and other changing environmental conditions.
\section*{Intro}
genetic architecture underlying traits affects how easily and how quickly local adaptation can occur. important for breeding, conservation, predicting response to climate change, etc.
useful info for crops/domesticated species (and also things like disease in humans?)
selection interacts with demography. $N_es>1$ for selection to win.
as demography changes, $N_es$ changes.
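As a rough worked illustration (the values $N_e = 10^5$ and $s = 10^{-4}$ are hypothetical, chosen only to make the arithmetic easy): before a bottleneck,
\[
N_e s = 10^{5} \times 10^{-4} = 10 > 1 ,
\]
so selection dominates drift for this mutation. If the population is then reduced to 5\% of its previous size (the bottleneck magnitude discussed under Proposed Research),
\[
N_e s = (0.05 \times 10^{5}) \times 10^{-4} = 0.5 < 1 ,
\]
and the same mutation now behaves as effectively neutral.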
so demography is known to directly affect the strength of selection, but what is less studied is how this may restructure the genome in terms of how evolution proceeds for adaptation to new or changing conditions.
Different demographic histories such as bottlenecks, repeated founder effects during expansion, or rapid population growth can have effects that interact with selection.
Small population sizes can lead to purging (recessive deleterious alleles become homozygous and are removed in smaller pops), but also to an increase in deleterious alleles (surfing phenomenon due to random genetic drift).
mutations that affect a trait related to fitness will then be impacted by demography x selection interaction.
in annual plants, nearly ALL traits are related to fitness
we thus predict that genetic architecture (number and size of mutations) should be impacted by demographic change.
this is not understood well in any system. controversial in humans (lohmueller vs. pritchard etc.). in plants, summaries are on gross overall $N_e$ (small vs. big) and ignore recent demographic change.
\section*{Proposed Research}
The maize/teosinte system has a well described demographic history, supported by archaeological and other human records of its use and domestication in the Americas.
Maize was domesticated $\approx$9000 years ago in SW Mexico, which bottlenecked populations to a small size; this was followed by a large and more recent population expansion. This demographic event can be detected genetically, as shown by Beissinger in prep (github repo): populations were bottlenecked to 5\% of their previous size and then spread across Central and South America and into highland and lowland environments. Populations of maize have since recovered to effective population sizes much larger than before the domestication bottleneck (growth to at least 300K but possibly as much as $10^9$, citation).
Because maize is a crop species that is incredibly useful to humans (lots of citations and examples of usage), there is also a wealth of knowledge on its quantitative traits.
will need to explain rare alleles pops and experiment some (PGRP15 grant)
%using genomic data from both maize and teosinte across their species' ranges, we can compare
% classes of functional diversity, using e.g. GERP scores, across pops of different demographic history
% look at the distribution of these as well as in beneficial traits if possible
%can compare these results to simulations parameterized to match known demog. of teosinte/maize
This system thus serves as a great resource to compare the effects of this history on the genome in maize versus its extant, wild ancestor teosinte, particularly to investigate how the genetic architecture of these important traits may have evolved differently as a result of this combination of demography and selection.
\section*{Research Objectives}
\begin{enumerate}
\item estimate DFE, test in\//against teosinte:
\begin{enumerate}
\item DFE estimation via DoFE
\item translate DFE into the distribution of effect sizes
\item simulate equilibrium scenarios in fwdpy for variety of parameters: Vg, Vd, etc.
\item test against observed traits in teosinte: compare effect size distribution, do GWAS on the simulated data and see if matches the real data
\item if the sims don't match the teosinte data, tweak params until they do
\item data = 5000 individuals from crosses of 70 teosinte inds - genotyped and phenotyped for 16 traits
\end{enumerate}
\item simulate maize domestication then compare results to real maize data:
\begin{enumerate}
\item use fwdpy
\item fit the distribution of effect sizes from (1) using an ABC approach within fwdpy
\item gives posterior dist. of most likely params - Vg, Vd, Va ...
\item sample from these posteriors to parameterize and run the maize simulations
\item maize sims will have the domestication bottleneck on all traits, then some traits known to be under seln will additionally have positive seln imposed on them
% add domestication bottleneck, variety of selection regimes for different Vg, etc.
\item ask how\//do traits with high heritability change
\item compare to actual traits in maize as a validation
\item if there is a difference, then we can attribute this to seln params being different and can estimate seln w/ ABC on maize
\item some traits we expect to be under seln and others not so can see if our results in terms of which traits under seln match expectations too, and if not, then something wonky with the sims
\item if our sims and earlier estimations were correct, we should be able to recover the respective distributions of fitness effects in maize and teosinte when looking in the pops of mixed background inds:
\item {\small (Zea synthetic is the teosinte synthetic further mixed with 26 maize lines, and is $\sim$12\% teosinte (from 11 different founders), 40\% B73 (the reference genome line), and $\sim$2\% each of 25 other maize inbred lines. All founders have been completely sequenced.)}
\end{enumerate}
\item simulate additional populations of domestic maize that have various demographies to see those impacts:
\begin{enumerate}
\item impose the different demographies from what is known to have occurred as maize expanded from Mexico
\item not all the demogs are known for all pops, Takuno just highland S Amer? so may need to estimate others, or could propose just imposing a variety of demogs
\item further expansions, bottlenecks..., migration into SW US
\item these will come from the same starting point data as the start of objective 2, i.e. start the maize domestication and send it under a different trajectory (but all would still include the first C Mexico bottleneck)
\item but with the diff. demogs and seln regimes, will then compare how the distribution of effect sizes has changed
\item makes predictions about how we expect varying demog and seln to have an impact on genetic architecture.
\end{enumerate}
\end{enumerate}
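
The ABC step in objective 2 could, in its simplest rejection form, be sketched as follows. This is a generic outline and not necessarily the procedure that would be run with fwdpy; the symbols $\theta$, $\pi$, $x$, $S$, $d$ and $\epsilon$ are notation introduced here for illustration. Let $\theta = (V_g, V_d, V_a, \ldots)$ be the parameters, $\pi(\theta)$ their prior, and $S(\cdot)$ a set of summary statistics (e.g. the effect size distribution recovered by GWAS). For $i = 1, \ldots, n$:
\[
\theta_i \sim \pi(\theta), \qquad
x_i \sim p(x \mid \theta_i), \qquad
\mathrm{accept}\ \theta_i\ \mathrm{if}\ d\bigl(S(x_i), S(x_{\mathrm{obs}})\bigr) \le \epsilon .
\]
Here $x_i$ is a simulated data set and $x_{\mathrm{obs}}$ the observed teosinte data. The accepted $\theta_i$ approximate draws from the posterior, and the approximation improves as the tolerance $\epsilon$ shrinks, at the cost of fewer accepted simulations.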
\section*{Old Objectives below}
\subsection{Obj 1 - Estimate the Distribution of Fitness Effects (DFE)}
using teosinte genomes \kjg{note to self, update the data description}\\
- from polymorphism/divergence data. use HapMap2 or current teosinte genomes (I think I'd vote for latter)
(\url{http://goo.gl/CLmsmX} and \url{http://www.genetics.org/content/177/4/2251.short})
can either use estimated demographic model or use noncoding sites to normalize SFS
Methods: estimate the DFE using Eyre-Walker's DoFE \url{http://www.lifesci.susx.ac.uk/home/Adam_Eyre-Walker/Website/Software.html}
\kjg{this part prob not in proposal:}
can validate by comparing to
GERP distribution
partitioning variance components, e.g. \url{http://www.ncbi.nlm.nih.gov/pubmed/25439723}
GREML software
% \kjg{this gets at magnitude of effect, but also want to get at how many loci may be contributing to any given trait?}
% \jri{nope. we just need DFE. see obj 2}
useful because the distribution of mutation effect sizes is not generally known, and is especially difficult to estimate for small-effect mutations.
objective 1 will inform perhaps what the DFE may look like in any organism with a history similar to teosinte?
and just in general add to the body of literature on genetic architecture, mutation effects
\subsection{Obj 2 - Simulate scenarios of different traits.}
from objective 1 we now know the DFE of teosinte
we already know the demographic history of maize since its split from teosinte (citations)
use the DFE results to parameterize a model that will simulate the evolution of maize and its genetic architecture through time during and since its domestication
we can simulate maize that expanded into S America separately since it has a different demography and then compare any differences the regions may show in the end
simulate traits w/ varying correlation with fitness
new mutation effects on fitness determined by DFE, effect on trait by correlation between trait and fitness
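
One simple way to write this down (a sketch under assumptions, not necessarily the parameterization used by the simulation software; $\rho$, $\tilde{s}$, $e$ and $a$ are symbols introduced here): let $\tilde{s}$ be the fitness effect of a new mutation drawn from the DFE and standardized to mean 0 and variance 1, and let $e$ be an independent deviate with mean 0 and variance 1. Then
\[
a = \rho\,\tilde{s} + \sqrt{1-\rho^{2}}\;e
\]
has mean 0 and variance $\rho^{2} + (1-\rho^{2}) = 1$, and $\mathrm{Cov}(a,\tilde{s}) = \rho$, so the trait effect $a$ has correlation $\rho$ with the fitness effect. Setting $\rho = 0$ gives a trait whose mutational effects are independent of fitness, and $|\rho| = 1$ a trait whose effects are fully determined by fitness effects.
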
Methods: fwdpy (python version of fwdpp, cite \url{http://www.genetics.org/content/198/1/157.abstract}), which allows simultaneous generation of complex demography, natural selection, and quantitative phenotypes, including deleterious mutations and their effect on phenotype
evaluate:
how many loci contribute to important traits?
how strong are these effects?
how do details of demography impact outcome?
test against theory, e.g. \url{http://arxiv.org/abs/1312.3028}
standing questions:
does the DFE significantly change in a meaningful way or a certain direction?
mean value the same but narrower or wider distribution?
skewed more one way or the other?
might expect this to be a different answer for maize in S America vs Central since S America has had a second bottleneck, so more founder effects and more chance for drift
(could also do some broader simulated examples just to stand alone and see if other various outcomes may occur - just don't plan on comparing these to any real data)\jri{yes, once we can show we can recapitulate real data, i think this is useful}
\subsection{Obj 3 - compare simulation results to modern maize genomes, known to have undergone the same demographies simulated in objective 2}%
compare to GWAS for maize/teo. do we recapitulate observations?
several traits are genotyped/phenotyped in maize/teosinte
If the estimated demographic model and DFE are reasonable, the genetic architecture of simulated phenotypes should closely mimic that of real data.
are there differences between central and southern American pops? \jri{we have no GWAS data for S.Amer. pops, but do have genomes and GERP. we could get freq. etc. of del. mutations from sims and compare to GERP}
Methods: same DoFE approach in C American maize pops for direct comparison
in S American pops, can do comparison on subset, e.g. GERP, which we would have from Obj 1 if we compare to other approaches for the sort of validation of the DFE
(definitely worth doing if, b/c of the 2nd bottleneck, more deleterious alleles rose in frequency and were then eliminated)
% imp. point from Jeff's service.tex file
If the real data differ from simulation, we can explore the sensitivity of genetic architecture to changes to the demography or DFE; understanding this sensitivity will then lead to improved estimation of these important parameters.
\subsection*{Big picture thoughts}
it is thought, and shown in some human pops, that demog. history such as expansion leads to an increase in delet alleles, and of larger effects - b/c of continued inferred expansions and bottlenecks.
is there any evidence of this in maize?
Quantitative phenotypes such as yield, plant height, and flowering time are of critical importance to agriculture.
Deleterious alleles likely play a large role in many of these phenotypes: crop plants have undergone dramatic demographic shifts, usually involving a domestication bottleneck followed by expansion as cultivation spread, and some authors even argue that selection on domestication traits has inadvertently increased the frequency of alleles deleterious for other phenotypes (cite gunther2010).
Consistent with this hypothesis, my lab has recently shown that genes associated with a number of quantitative traits in maize are enriched for deleterious alleles compared to randomly chosen genes (cite mezmouk2014).
However, while we know that demography impacts the frequency of individual deleterious variants, we have a poor understanding of the interaction of demography and selection on phenotypic variation.
In particular, we know little about how these two forces interact to determine the genetic architecture -- the number of genes and their effect -- of a trait.
Such information is crucial for understanding variation in phenotype, designing breeding strategies, utilizing diversity from wild relatives, or even engineering new traits using biotechnology.
\jri{see ideas and text in ``service\_award.tex'' that I uploaded too.}
\end{document}