

\section{Radial Velocity Models in the Search for Exoplanets}
\label{sec:exo}

In a two-body system such as a star and a planet, both bodies orbit
their common center of mass, which lies on the line connecting their
centers. If one of the bodies (the star) radiates light, the
frequency of this light as measured by a distant observer will
vary cyclically with a period equal to the orbital period. This
Doppler effect is understood well enough that astronomers can
translate the frequency shift into the star's velocity toward or
away from the earth, providing an indirect measurement of that velocity.
The radial velocity (RV) of a star is the component of its velocity
along the line of sight from the earth to the star.
In addition to the calculated radial velocity $v_i$ of a star at
time $t_i$, an associated estimate of uncertainty $\s_i$ is
reported, which accounts for the propagation of errors introduced
by the telescope and measurement process \uhoh{give REF}.

If a star does not host any orbiting planets, then the radial velocity
measurements $v_i$ will be roughly constant over time,
varying only because of measurement error and ``stellar jitter'', the random
fluctuations in a star's luminosity due to eddies, upwellings, and downwellings
on the surface of the star, whose variance we denote $\sj_0^2$. Under the zero-planet
model, $\M_0$, the RV measurements are assumed to have Gaussian distributions
\begin{equation}\label{0p_Model}
v_i \mid \M_0\ind \mathcal{N}\left(\C_0,\s^2_i + \sj_0^2 \right),
\end{equation}
with mean $\C_0$, the constant center-of-mass velocity of the star
relative to the earth, and variances $\s^2_i + \sj^2_0$.  The
parameters $\C_0$ and $\sj_0$ are both in the same units as the RV
measurements \uhoh{more  justification for additive form of the variances, ref?}.

If the star does host planets, the gravitational pull of the orbiting
planets induces a small wobble in the observed radial velocities,
which may be modelled using Kepler's laws.
For a single planet model, $\M_1$, the RV measurements are also
assumed to be Gaussian distributed  with
\begin{equation}\label{Velocity_Model}
v_i \mid \M_1 \ind \N \left(\C_1 +\bigtriangleup
V(t_i|\phi_1),\s_i^2+\sj_1^2\right),
\end{equation}
where the velocity shift $\bigtriangleup
V(t_i|\phi_1)$ due to the presence of a single planet is
a  family of curves parametrized by 
$\phi_1 \equiv (\K_1,\P_1,\e_1,\omega_1,\mu_1)$
\begin{equation}\label{Velocity_1p_Model}
\bigtriangleup V(t|\phi_1)=\K_1[\cos(\omega_1+T(t))+\e_1 \cos(\omega_1)],
\end{equation}
where $T(t)$ is the ``true anomaly at time $t$'' given by
\begin{equation}\label{true_anomaly}
T(t)=2\arctan\left[\tan\left(\frac{E(t)}{2}\right)\sqrt{\frac{1+\e_1}{1-\e_1}}\right],
\end{equation}
and $E(t)$ is called the ``eccentric anomaly at time $t$'', which is the
solution to the transcendental equation
\begin{equation}\label{transcendental_equation}
E(t)-\e_1\sin(E(t))=\mbox{mod}\left(\frac{2\pi}{\P_1}t+\mu_1,2\pi\right).
\end{equation}
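Equation~(\ref{transcendental_equation}) has no closed-form solution and
must be solved numerically. As an illustrative sketch (the solver
actually used is not specified here), write $M(t)$ for its right-hand
side, a symbol introduced only for this remark, and apply Newton's
method starting from $E^{(0)} = M(t)$:
\begin{equation*}
E^{(k+1)} = E^{(k)} - \frac{E^{(k)} - \e_1\sin E^{(k)} - M(t)}{1 -
  \e_1\cos E^{(k)}}, \qquad k = 0, 1, 2, \ldots,
\end{equation*}
stopping when successive iterates agree to the desired tolerance. Since
the derivative $1 - \e_1\cos E \geq 1-\e_1$ is positive whenever
$\e_1 < 1$, the left-hand side of (\ref{transcendental_equation}) is
strictly increasing in $E$ and the root is unique; for eccentricities
close to one a more careful starting value may be needed.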
The five orbital parameters that comprise $\phi_1$ are the velocity
semi-amplitude $\K_1$, the orbital period $\P_1$, the eccentricity
$\e_1$ $(0\leq \e_1 < 1)$, the argument of periastron $\omega_1$
$(0\le \omega_1 \le 2\pi)$, and the mean anomaly at time $t=0$,
$\mu_1$ $(0\le \mu_1 \le 2\pi)$.  The parameters $\C_1$, $\K_1$, and
$\sj_1$ have units of m/s; the velocity semi-amplitude $\K_1$ is
usually restricted to be non-negative to avoid identification
problems, while the velocity offset $\C_1$ may be positive or
negative.  The eccentricity parameter $\e_1$ is unitless, with $\e_1
= 0$ corresponding to a circular orbit and larger $\e_1$ leading to
more elongated orbits. Periastron is the point of the orbit at which the planet is
closest to the star, and the argument of periastron $\omega_1$
specifies the orientation of the elliptical orbit relative to our line
of sight.  The mean anomaly $\mu_1$ is the angular distance of the
planet from periastron at time $t=0$, measured as if the planet moved
at constant angular speed.
\uhoh{Add figure illustrating?}
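
As a simple special case, included only as an illustration, consider a
circular orbit, $\e_1 = 0$. Then (\ref{transcendental_equation}) gives
$E(t) = \mbox{mod}(2\pi t/\P_1 + \mu_1, 2\pi)$,
(\ref{true_anomaly}) gives $T(t) = E(t)$ up to a multiple of $2\pi$,
and (\ref{Velocity_1p_Model}) reduces to a pure sinusoid,
\begin{equation*}
\bigtriangleup V(t|\phi_1) = \K_1\cos\left(\frac{2\pi}{\P_1}t +
  \omega_1 + \mu_1\right),
\end{equation*}
with amplitude $\K_1$ and period $\P_1$. In this case $\omega_1$ and
$\mu_1$ enter the model only through their sum.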

If there are $p \ge 1$ planets, the expected velocity is $\C_p
+ \bigtriangleup V(t_i|\phi_1,\ldots,\phi_p)$, with the overall velocity shift
$\bigtriangleup V$ approximated as the sum of the velocity
shifts of the individual planets:
\begin{equation}\label{Velocity_2p_Model}
\bigtriangleup V(t_i|\phi_1,\ldots,\phi_p)=\sum_{j=1}^p
\K_j[\cos(\omega_j+T_j(t_i))+\e_j\cos(\omega_j)],
\end{equation}
where $T_j(t)$ is the true anomaly of planet $j$, obtained from
(\ref{true_anomaly}) and (\ref{transcendental_equation}) with
$(\P_j, \e_j, \mu_j)$ in place of $(\P_1, \e_1, \mu_1)$, and the
planets' mutual gravitational interactions are assumed to be
negligible.  With $p$ planets, there are a total of $2+5p$ parameters,
denoted as
$\theta_p = \{ \C_p,\sj_p^2,\phi_1,\ldots,\phi_p\}$, for each of
the models $\M_p$, $p \in \{0, 1, \ldots, P\}$. Of course, we do not
know how many planets there are \emph{a priori}; indeed, finding the
number of planets $p$ and characterizing their orbital parameters is
a major aim.
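
For concreteness, the likelihood implied by these models can be written
out explicitly. The following is a sketch that assumes, as the notation
above suggests, that the measurements are independent given the
parameters; $\v = (v_1,\ldots,v_n)^T$ denotes the vector of $n$
measurements:
\begin{equation*}
\log p(\v \mid \theta_p, \M_p) = -\frac{1}{2}\sum_{i=1}^{n}\left[
\log\left\{2\pi\left(\s_i^2+\sj_p^2\right)\right\} +
\frac{\left\{v_i - \C_p - \bigtriangleup
    V(t_i|\phi_1,\ldots,\phi_p)\right\}^2}{\s_i^2+\sj_p^2}\right],
\end{equation*}
with $\bigtriangleup V \equiv 0$ when $p = 0$. The jitter $\sj_p$
appears in both terms, so a larger jitter reduces the penalty for
residuals at the cost of a larger normalizing term.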

\subsection{Bayesian Methods for Identifying the Number of Planets}
Determining the number of planets in a
system is, from a statistical point of view, a model choice
problem. Bayesian model selection requires calculation of marginal
likelihoods of models or ``evidence'' provided by the data for each model:
\begin{equation}\label{marginal_lik}
m(\v \mid \M_p) \equiv p(\v \mid \M_p) = \int_{\Theta_p}
p(\v \mid \theta_p,\M_p)\, p(\theta_p \mid \M_p)\, d\theta_p,
\end{equation}
which entails integrating the sampling model of the
data $\v = (v_1, \ldots, v_n)^T$ with respect to the prior distribution
of the model-specific parameters $\theta_p$ to obtain the marginal
density $p(\v \mid \M_p)$ of the data under model $\M_p$.
The Bayes factor for comparing the $p$-planet model to the zero-planet model
may be expressed as
\begin{equation}
\BF(\M_p:\M_0)=\frac{m( \v \mid \M_p)}{m(\v \mid \M_0)},
\end{equation}
so that $\BF(\M_0:\M_0) = 1$,
while the posterior probability of the $p$-planet model is of the
form
\begin{equation}
  \label{eq:post-prob}
  p(\M_p \mid \v) = \frac{\BF(\M_p:\M_0) O(\M_p: \M_0)}
{ \sum_{ j= 0}^{P} \BF(\M_j:\M_0) O(\M_j: \M_0)},
\end{equation}
where $O(\M_p: \M_0)$ is the prior odds of $p$ planets to $0$
planets and $P$ is the maximum number of planets considered for the
system.  Computing these quantities requires specifying a prior
distribution on $\theta_p$ for each of the models in order to obtain
marginal likelihoods and Bayes factors.
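
To illustrate (\ref{eq:post-prob}) with purely hypothetical numbers,
not taken from any data set, suppose $P = 2$, the prior odds are all
equal to one, $\BF(\M_1:\M_0) = 20$, and $\BF(\M_2:\M_0) = 5$.
Normalizing over all three candidate models, $j = 0, 1, 2$, with
$\BF(\M_0:\M_0) = 1$, gives
\begin{equation*}
p(\M_0 \mid \v) = \frac{1}{26} \approx 0.04, \qquad
p(\M_1 \mid \v) = \frac{20}{26} \approx 0.77, \qquad
p(\M_2 \mid \v) = \frac{5}{26} \approx 0.19,
\end{equation*}
so the one-planet model would be favored, though with appreciable
posterior mass remaining on two planets.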

\subsection{Prior Distributions}
Many of the parameters in $\theta_p$ allow informative marginal prior
distributions to be specified, although specifying joint distributions
is more difficult.  While distributions may be tuned for specific
applications, \citet{ford2006bms} and \citet{SAMSI:2006}
recommend the following independent ``reference'' priors as a starting
point for Bayesian analyses of radial velocity models.
In all models, the velocity offset (intercept) parameter
$\C_p$ and the stellar jitter parameter $\sj_p$ are taken to be
\emph{a priori} independent, where $\C_p$ is uniform over a finite
interval $[\C_{\min}, \C_{\max}]$,
\begin{subequations}
\begin{align}
\label{prior_C}
p_C(C) & =\left\{\begin{aligned}
\frac{1}{\C_{\max}-\C_{\min}} &~ \qquad\qquad\qquad\mbox{for } ~\C_{\min}\leq C\leq \C_{\max}\\
0\quad\quad &~ \qquad\qquad\qquad\mbox{otherwise}  \\
\end{aligned}
\right. 
\intertext{and  $\log(\sj_{\min} + \sj_p)$ has a uniform distribution on
the interval $(\log(\sj_{\min}), \log(\sj_{\min}+\sj_{\max})]$,}
\label{prior_sigma}
p_{\sj}(\sj) &  = \left\{\begin{aligned}
\frac{1}{ \log \left(1+\frac{\sj_{\max}}{\sj_{\min}}\right)} \cdot
\frac{1}{\sj_{\min}+ \sj}
&~ 
\quad\quad\mbox{for } ~0 < \sj \leq \sj_{\max} \\
0 \quad\quad &~ \quad\quad\mbox{otherwise.}  \\
\end{aligned} \right.
\end{align}
\end{subequations}
The joint prior on $(\C_p, \sj_p)$ may be viewed as a
modified independent Jeffreys prior, which leads to well-defined
posterior distributions and Bayes factors even in the limit as
$\C_{\min} \to -\infty$, $\C_{\max} \to \infty$, $\sj_{\min} \to 0$,
and $\sj_{\max} \to \infty$.
However, informative upper limits on the stellar jitter are
often available, which may aid in the rejection of models that are too
simple to capture the observed periodicities in the data.   This is a
particular advantage of the Bayesian approach over frequentist methods.
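
As a check on (\ref{prior_sigma}), note that
\begin{equation*}
\int_0^{\sj_{\max}} \frac{d\sj}{\sj_{\min}+\sj} =
\log\left(\frac{\sj_{\min}+\sj_{\max}}{\sj_{\min}}\right) =
\log\left(1+\frac{\sj_{\max}}{\sj_{\min}}\right),
\end{equation*}
so the density integrates to one. The corresponding distribution
function is
$\log(1+\sj/\sj_{\min})/\log(1+\sj_{\max}/\sj_{\min})$; with the values
$\sj_{\min} = 1$ m/s and $\sj_{\max} = 100$ m/s of
Table~\ref{tab:hyper}, the prior median solves $1 + \sj = \sqrt{101}$,
giving roughly $9$ m/s. The density is approximately flat for jitter
well below $\sj_{\min}$ and approximately proportional to $1/\sj$ well
above it.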


For the components of $\phi_p$, the recommended priors are $\e_p \sim \U(0,1)$, $\omega_p \sim \U(0, 2
\pi)$, and $\mu_p \sim \U(0, 2 \pi)$, which take into account the known
marginal constraints.  For the period, they recommend that the log period have
a uniform distribution over the range $1$ day to $1{,}000$ years.  The
lower limit corresponds to the smallest orbital period of known
exoplanets (which is slightly larger than the theoretical limit for
certain stars), while the upper limit exceeds the periods of known
exoplanets yet remains within the range where the planet's orbit would not be
disrupted by outside perturbations to the system.  Our prior for $\K_p$
is based on the SAMSI Exoplanet Working Group recommendation, where
$\log(\K_p + \K_{\min}) \sim  \U(\log \K_{\min}, \log(\K_{\min} + \K_{\max}))$ and
$\K_{\max} = 2128$ m/s corresponds to a maximum
planet-star mass ratio of 0.01;  the lower bound $\K_{\min}$ corresponds to the
current minimum detection bound ($1$ m/s). The prior in
\citet{ford2006bms} is of the same form, although they allow the upper
bound to depend on period.  Table~\ref{tab:hyper} summarizes the
constants used in the prior distributions.
\begin{table}[h]
  \begin{center}
  \begin{tabular} {|ll|lr|lr|} \hline \hline
\multicolumn{2}{|c|}{Model Parameter} &
\multicolumn{4}{c|}{Hyperparameters}  \\ \hline
Period & $\P_p$ & $\P_{\min}$ &  $1$ day & $\P_{\max}$ &  $1{,}000$ years \\
Velocity Semi-Amplitude &$\K_p$ & $\K_{\min}$ &  $ 1$ m/s &  $\K_{\max}$ &  $2128$ m/s \\
Velocity Offset & $\C_p$ & $\C_{\min}$ &  $-2128$ m/s & $\C_{\max}$ &   $2128$ m/s \\
Stellar Jitter & $\sj_p$  &$\sj_{\min}$ &  $1$ m/s & $\sj_{\max}$ &  $100$ m/s \\ \hline
  \end{tabular}
  \end{center}
\caption{Hyperparameters for the reference prior distributions of
  $\theta_p$ from \citet{ford2006bms}.}
\label{tab:hyper}
\end{table} 


The nonlinear relationships in the velocity model induce strong
correlations among many of the parameters, which leads to difficulties
for posterior inference.  For circular orbits ($\e_p = 0$), $\omega_p$
and $\mu_p$ are in fact not separately identifiable, since only their
sum enters the model.   To simplify posterior inference using
stochastic sampling, we work in a transformed parameter space:
\begin{align}
  x_p & = \e_p\cos\omega_p, \label{eq:poincare-x} \\
  y_p & = \e_p\sin\omega_p, \label{eq:poincare-y} \\
  z_p & = (\omega_p+\mu_p) \bmod 2\pi, \nonumber \\
  \dot{\P}_p & = \log \P_p, \qquad \dot{\K}_p = \log \K_p, \nonumber
\end{align}
leading to $\dot{\phi}_p \equiv (\dot{\K}_p, \dot{\P}_p, x_p, y_p,
z_p)^T$.  The Poincar\'e variables $x_p$ and $y_p$ greatly reduce the
very strong correlations between $\mu_p$ and $\omega_p$, which is
particularly important for low-eccentricity orbits, where these
parameters are nearly unidentifiable.  The use of $z_p$ further
reduces correlations between the parameters $\omega_p$ and $\mu_p$
when $\e_p \ll 1$, but has little effect for large $\e_p$.
Posterior distributions were closer to Gaussian in these coordinates, which led to
improved posterior simulation.  In the transformed parameter space,
the prior distribution for $\dot{\phi}_p$ is
\begin{subequations}
\begin{equation}\label{prior_1p}
p_{\dot{\phi}_p}(\dot{\phi}_p)=c_\phi \cdot\exp\dot{\K}_p\cdot\frac{1}{1+\frac{\exp\dot{\K}_p}{\K_{\min}}}\cdot\frac{1}{\sqrt{x_p^2+y_p^2}}
\end{equation}
for $-\infty < \dot{\K}_p \leq\log(\K_{\max})$,
$\log(\P_{\min})\leq\dot{\P}_p\leq\log(\P_{\max})$, $x_p^2+y_p^2<1$, and $0\leq
z_p\leq2\pi$,
where the normalizing constant is
\begin{equation}\label{kappa_in_prior_1p}
c_\phi
=\frac{1}{\log\left(1+\frac{\K_{\max}}{\K_{\min}}\right)}\cdot\frac{1}{\K_{\min}}\cdot\frac{1}{\log\left(\frac{\P_{\max}}{\P_{\min}}\right)}\cdot\left(\frac{1}{2\pi}\right)^2.
\end{equation}
\end{subequations}
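
The individual factors in (\ref{prior_1p}) can be recovered from the
change of variables. The following sketch assumes, as
(\ref{prior_1p}) and (\ref{kappa_in_prior_1p}) imply, that $\K_p$ has
density proportional to $1/(\K_{\min}+\K_p)$ on
$0 < \K_p \le \K_{\max}$. Since $\K_p = \exp\dot{\K}_p$ has Jacobian
$d\K_p/d\dot{\K}_p = \exp\dot{\K}_p$,
\begin{equation*}
p(\dot{\K}_p) =
\frac{1}{\log\left(1+\frac{\K_{\max}}{\K_{\min}}\right)}\cdot
\frac{\exp\dot{\K}_p}{\K_{\min}+\exp\dot{\K}_p}
=
\frac{1}{\log\left(1+\frac{\K_{\max}}{\K_{\min}}\right)}\cdot
\frac{1}{\K_{\min}}\cdot
\frac{\exp\dot{\K}_p}{1+\frac{\exp\dot{\K}_p}{\K_{\min}}}.
\end{equation*}
The pair $(x_p, y_p)$ is the Cartesian form of the polar coordinates
$(\e_p,\omega_p)$, so $dx_p\,dy_p = \e_p\,d\e_p\,d\omega_p$ and the
uniform priors on $\e_p$ and $\omega_p$ give
\begin{equation*}
p(x_p, y_p) = \frac{1}{2\pi}\cdot\frac{1}{\sqrt{x_p^2+y_p^2}},
\qquad x_p^2+y_p^2 < 1.
\end{equation*}
Finally, for fixed $\omega_p$ the variable $z_p$ is a rotation of the
uniform angle $\mu_p$ and so is again uniform with density $1/(2\pi)$,
while $\dot{\P}_p$ is uniform with density
$1/\log(\P_{\max}/\P_{\min})$. Multiplying these factors gives
(\ref{prior_1p}) with the constant (\ref{kappa_in_prior_1p}).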

For any model with $p > 1$ planets, there is an inherent difficulty due
to the arbitrary ``labeling'' of planets in the model.  To address this lack of
identification, we \emph{a priori} restrict the periods so that
$\P_{\min} \le \P_1 \le \P_2 \le \ldots \le  \P_p \le \P_{\max}$.
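
One consequence of this restriction is worth spelling out, as a remark
that assumes the unrestricted prior on $(\phi_1,\ldots,\phi_p)$ is a
product of identical marginals, written here as $p_\phi$ (notation
introduced only for this remark). By symmetry each of the $p!$
orderings of the periods then has prior probability $1/p!$, so
renormalizing the prior on the ordered region multiplies the density
by $p!$:
\begin{equation*}
p(\phi_1,\ldots,\phi_p \mid \M_p) = p!\,\prod_{j=1}^p
p_{\phi}(\phi_j)\, \mathbf{1}\{\P_1 \le \cdots \le \P_p\}.
\end{equation*}
Because the likelihood is invariant to relabeling of the planets, the
marginal likelihood computed with this renormalized prior equals that
under the unrestricted prior; omitting the factor would understate it
by $p!$.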