Metropolis Algorithm for MCMC (#1262)
Added MCMC and Metropolis algorithm

Augmented Restart capability to allow for Adaptive Restart
wangcj05 authored Jul 25, 2020
1 parent 80fa915 commit 2a94807
Showing 20 changed files with 4,937 additions and 34 deletions.
49 changes: 49 additions & 0 deletions developer_tools/XSDSchemas/Samplers.xsd
@@ -21,6 +21,7 @@
<xsd:element name="AdaptiveSobol" type="AdaptiveSobolSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="EnsembleForward" type="EnsembleForwardSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="AdaptiveMonteCarlo" type="AdaptiveMCSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="Metropolis" type="MetropolisSampler" minOccurs="0" maxOccurs="unbounded"/>

</xsd:sequence>
<xsd:attribute name="verbosity" type="verbosityAttr" default="all"/>
@@ -526,4 +527,52 @@
<xsd:attribute name="mode" type="modeAttr" default="post"/>
<xsd:attribute name="updateGrid" type="RavenBool" default="true"/>
</xsd:complexType>

<!-- *********************************************************************** -->
<!-- Markov Chain Monte Carlo -->
<!-- *********************************************************************** -->
<xsd:complexType name="metropolisInitType">
<xsd:all>
<xsd:element name="limit" type="xsd:string" minOccurs="0"/>
<xsd:element name="initialSeed" type="xsd:integer" minOccurs="0"/>
<xsd:element name="burnIn" type="xsd:string" minOccurs="0"/>
</xsd:all>
</xsd:complexType>

<xsd:complexType name="metropolisVariableType">
<xsd:complexContent>
<xsd:extension base="variableType">
<xsd:sequence>
<xsd:element name="initial" type="xsd:string" minOccurs="0"/>
<xsd:element name="proposal" type="AssemblerObjectType" minOccurs="1" maxOccurs="1"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>

<xsd:simpleType name="stringBaseType">
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>

<xsd:complexType name="likelihoodType">
<xsd:simpleContent>
<xsd:extension base="stringBaseType">
<xsd:attribute name="log" type="RavenBool"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>

<xsd:complexType name="MetropolisSampler">
<xsd:sequence>
<xsd:element name="samplerInit" type="metropolisInitType" minOccurs="1"/>
<xsd:element name="likelihood" type="likelihoodType" minOccurs="1" maxOccurs="1"/>
<xsd:element name="variable" type="metropolisVariableType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="TargetEvaluation" type="AssemblerObjectType" minOccurs="1" maxOccurs="1"/>
<xsd:element name="constant" type="constantVarType" minOccurs="0" maxOccurs='unbounded'/>
<xsd:element name="Restart" type="AssemblerObjectType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="restartTolerance" type="xsd:float" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="verbosity" type="verbosityAttr" default="all"/>
</xsd:complexType>
</xsd:schema>
127 changes: 127 additions & 0 deletions doc/user_manual/sampler.tex
@@ -3137,3 +3137,130 @@ \subsubsection{Adaptive Sobol Decomposition}
</Samplers>
\end{lstlisting}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Markov Chain Monte Carlo %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Markov Chain Monte Carlo}
\label{subsec:MCMC}
Markov chain Monte Carlo (MCMC) is a Sampler entity in the RAVEN framework.
It provides wide scope for realistic statistical modeling. MCMC is essentially
Monte Carlo integration using Markov chains. Bayesians, and sometimes also frequentists,
need to integrate over possibly high-dimensional probability distributions to make inferences
about model parameters or to make predictions. Bayesians need to integrate over the posterior
distributions of model parameters given the data, and frequentists may need to integrate
over the distribution of observables given parameter values. Monte Carlo integration draws
samples from the required distribution and then forms sample averages to approximate expectations.
MCMC draws these samples by running a suitably constructed Markov chain for a long time.
There are many MCMC algorithms, and popular families include Gibbs sampling,
Metropolis-Hastings, slice sampling, Hamiltonian Monte Carlo, and many others. Regardless
of the algorithm, the goal in Bayesian inference is to explore the unnormalized joint
posterior distribution and collect samples of the target distributions, which are the marginal
posterior distributions, for later use in inference.
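
As a minimal illustration of Monte Carlo integration by sample averages, the following Python sketch approximates an expectation from independent draws. It is not part of RAVEN; the function name \texttt{monte\_carlo\_mean} and its parameters are hypothetical, and plain independent sampling is assumed here rather than a Markov chain.

```python
import random


def monte_carlo_mean(draw, func, n_samples, seed=0):
    """Approximate E[func(X)] by averaging func over n_samples draws of X.

    draw is a hypothetical callable taking a random.Random instance and
    returning one sample of X.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        total += func(draw(rng))
    return total / n_samples


# E[X^2] for X ~ N(0, 1) is exactly 1, so the sample average should be close to 1.
estimate = monte_carlo_mean(
    draw=lambda rng: rng.gauss(0.0, 1.0),
    func=lambda x: x * x,
    n_samples=20000,
)
```

MCMC replaces the independent draws in this sketch with the states of a Markov chain whose stationary distribution is the required one.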

\subsubsection{Metropolis (Metropolis-Hastings Sampler)}
\label{subsubsubsec:metropolis}
The Metropolis-Hastings (MH) algorithm is an MCMC method for obtaining a sequence of random samples from a probability
distribution from which direct sampling is difficult. This sequence can be used to approximate the distribution or
to compute an integral. The algorithm simulates from the target distribution by making
use of the full joint density function and an independent proposal distribution for each of
the variables of interest.
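
The idea can be sketched in a few lines of Python. This is a simplified illustration for a single variable, not RAVEN's implementation: the function \texttt{metropolis}, its parameters, and the standard normal target are assumptions made for the example. A symmetric normal proposal is assumed, so the Hastings correction cancels in the acceptance ratio.

```python
import math
import random


def metropolis(log_target, initial, proposal_std, limit, seed=0):
    """Random-walk Metropolis for one variable; returns `limit` chain states.

    log_target returns the log of the (unnormalized) target density.
    All names here are illustrative assumptions, not RAVEN's API.
    """
    rng = random.Random(seed)
    current = initial
    current_logp = log_target(current)
    chain = []
    for _ in range(limit):
        # Propose a move from a normal distribution centered on the current state.
        candidate = current + rng.gauss(0.0, proposal_std)
        candidate_logp = log_target(candidate)
        # Symmetric proposal: accept with probability min(1, p(candidate) / p(current)).
        log_ratio = candidate_logp - current_logp
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, current_logp = candidate, candidate_logp
        # A rejected proposal repeats the current state in the chain.
        chain.append(current)
    return chain


def log_std_normal(x):
    """Log density of N(0, 1) up to an additive constant."""
    return -0.5 * x * x


chain = metropolis(log_std_normal, initial=0.0, proposal_std=1.0, limit=20000)
mean = sum(chain) / len(chain)
variance = sum((x - mean) ** 2 for x in chain) / len(chain)
```

Only ratios of the target density enter the acceptance test, which is why an unnormalized density is sufficient.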

\specBlock{an}{Metropolis}
%
\attrIntro

\begin{itemize}
\itemsep0em
\item \nameDescription
\end{itemize}

\variableIntro{Metropolis}

\begin{itemize}
\item \variableDescription
\variableChildrenIntro
\begin{itemize}
\item \distributionDescription
\item \functionDescription
\item \xmlNode{initial}, \xmlDesc{float, optional field}, specifies the initial value for the given variable.
\item \xmlNode{proposal}, \xmlDesc{Assembler Object}, specifies the proposal distribution for this variable.
This node must contain the following two attributes:
\begin{itemize}
\item \xmlAttr{class}, \xmlDesc{required string attribute}, the main
``class'' of the listed object. Only ``Distributions'' is allowed.
\item \xmlAttr{type}, \xmlDesc{required string attribute}, the object
identifier or sub-type.
\end{itemize}
\end{itemize}
\nb For the MCMC sampler, only ``continuous'' distributions are allowed as input to \xmlNode{variable}.
\item \constantVariablesDescription
\end{itemize}


In the \textbf{Metropolis} input block, the user needs to specify the variables to be sampled.
As already mentioned, these variables are specified in consecutive XML blocks called \xmlNode{variable}.
In addition, the settings for this sampler need to be specified in the \xmlNode{samplerInit} XML block:
\begin{itemize}
\item \xmlNode{samplerInit}, \xmlDesc{required field}. This XML node can contain the following sub-nodes:
\begin{itemize}
\item \xmlNode{limit}, \xmlDesc{integer, required field}, number of Metropolis samples to be generated;
\item \xmlNode{initialSeed}, \xmlDesc{integer, optional field}, initial seed of the random number generator;
\item \xmlNode{burnIn}, \xmlDesc{integer, optional field}, specifies the number of initial samples to be discarded.
\end{itemize}
\end{itemize}
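
The roles of these three settings can be illustrated with a short Python sketch. It is hypothetical and not taken from RAVEN: the helper \texttt{run\_chain} and its arguments are invented for the example, and it assumes that the limit counts the samples kept after the burn-in phase, which the text above does not state.

```python
import random


def run_chain(step, initial, limit, burn_in=0, initial_seed=None):
    """Advance a chain and keep `limit` states after discarding `burn_in` states.

    step is a hypothetical transition function (state, rng) -> next state.
    Assumption: `limit` counts kept samples only, not the discarded ones.
    """
    rng = random.Random(initial_seed)  # a fixed seed makes the run reproducible
    state = initial
    kept = []
    for i in range(burn_in + limit):
        state = step(state, rng)
        if i >= burn_in:
            kept.append(state)
    return kept


# A deterministic step makes the effect of the burn-in easy to see.
counted = run_chain(lambda s, rng: s + 1, initial=0, limit=5, burn_in=3)

# A random-walk step: the same seed reproduces the same kept samples.
def walk(s, rng):
    return s + rng.gauss(0.0, 1.0)

first = run_chain(walk, 0.0, limit=100, burn_in=10, initial_seed=70419)
second = run_chain(walk, 0.0, limit=100, burn_in=10, initial_seed=70419)
```

Discarding the first states reduces the influence of the starting point on the collected samples.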

In addition to the \xmlNode{variable} nodes, the main XML node
\xmlNode{Metropolis} needs to contain the following supplementary sub-nodes:

\begin{itemize}
\item \xmlNode{likelihood}, \xmlDesc{string, required node}, the output from the user-provided likelihood function.
This node accepts one attribute:
\begin{itemize}
\item \xmlAttr{log}, \xmlDesc{bool, optional field}, indicates whether the log-likelihood value is
provided or not. When True, the code expects to receive the log-likelihood value.
\default{`False'}
\end{itemize}
\item \assemblerDescription{Metropolis}
\constantSourceDescription{Metropolis}
\begin{itemize}
\item \xmlNode{TargetEvaluation}, \xmlDesc{string, required field},
represents the container where the system evaluations are stored.
%
From a practical point of view, this XML node must contain the name of
a data object defined in the \xmlNode{DataObjects} block (see
Section~\ref{sec:DataObjects}). The object specified here must be
listed as \xmlNode{Output} in the Steps that employ this sampling strategy.
%
The Metropolis sampler accepts ``DataObjects'' of type
``PointSet'' only.
\end{itemize}
\restartDescription{Metropolis}
\end{itemize}
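
To show why the sampler needs to know whether the likelihood is given on a log scale, the following Python sketch computes an acceptance probability in both cases. It is a simplified assumption, not RAVEN's code: the function \texttt{acceptance\_probability} is hypothetical, and it ignores the prior and proposal densities so that only the likelihood values enter the ratio.

```python
import math


def acceptance_probability(current, candidate, log=False):
    """Probability of accepting a candidate given two likelihood values.

    With log=True the inputs are log-likelihoods and the ratio becomes a
    difference. Hypothetical helper; prior and proposal terms are omitted.
    """
    if log:
        diff = candidate - current
        return 1.0 if diff >= 0.0 else math.exp(diff)
    if current <= 0.0:
        # Any candidate is accepted when the current likelihood is zero.
        return 1.0
    return min(1.0, candidate / current)


# The same pair of likelihood values on the plain scale and on the log scale.
plain = acceptance_probability(0.2, 0.1)
logged = acceptance_probability(math.log(0.2), math.log(0.1), log=True)
uphill = acceptance_probability(0.1, 0.2)
```

Both calls describe the same move and give the same probability, but passing log-likelihood values without setting the flag would produce a different, incorrect ratio.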

Example:
\begin{lstlisting}[style=XML]
<Samplers>
...
<Metropolis name="Metropolis">
<samplerInit>
<limit>1000</limit>
<initialSeed>070419</initialSeed>
<burnIn>10</burnIn>
</samplerInit>
<likelihood log="False">zout</likelihood>
<variable name="xin">
<distribution>normal</distribution>
<initial>0</initial>
<proposal class="Distributions" type="Normal">normal</proposal>
</variable>
<variable name="yin">
<distribution>normal</distribution>
<initial>0</initial>
<proposal class="Distributions" type="Normal">normal</proposal>
<!-- <proposal>normal</proposal> -->
</variable>
<TargetEvaluation class="DataObjects" type="PointSet">outSet</TargetEvaluation>
</Metropolis>
...
</Samplers>
\end{lstlisting}