Metropolis Algorithm for MCMC #1262

Merged
merged 36 commits into from
Jul 25, 2020
36 commits
4c6900e
add logpdf for Distributions
wangcj05 Jun 25, 2020
00f2cf7
MCMC initial implementation
wangcj05 Jun 26, 2020
e6c1253
update MCMC
wangcj05 Jun 29, 2020
1cb20bf
update
wangcj05 Jun 29, 2020
364053c
update test for MCMC
wangcj05 Jun 29, 2020
10791ab
register MCMC to Simulation and Steps
wangcj05 Jun 29, 2020
a8999a5
updates
wangcj05 Jun 30, 2020
eb79308
first draft implementation
wangcj05 Jun 30, 2020
874eef8
restructure metropolis sampling
wangcj05 Jun 30, 2020
e83077d
enable test for mcmc metropolis algorithm
wangcj05 Jul 1, 2020
73ac3da
clean up and update docstrings
wangcj05 Jul 1, 2020
7c4763c
update tests
wangcj05 Jul 1, 2020
4baa521
rename test
wangcj05 Jul 1, 2020
75a1378
fix unit test for inputData
wangcj05 Jul 1, 2020
a17ae22
Merge remote-tracking branch 'origin/devel' into wangc/mcmc_mh
wangcj05 Jul 1, 2020
fb505af
remove MCMC actor in Steps
wangcj05 Jul 6, 2020
9875a29
move MCMC to Samplers
wangcj05 Jul 6, 2020
d1b9790
update code structure
wangcj05 Jul 6, 2020
a0d2813
complete conversion from MCMC to Metropolis Sampler
wangcj05 Jul 6, 2020
5d61ba4
add xsd for Metropolis
wangcj05 Jul 6, 2020
8f116ea
update test for xsd schema
wangcj05 Jul 6, 2020
be5f93c
address reviewers's comments
wangcj05 Jul 14, 2020
fc73e66
add user manual for MCMC
wangcj05 Jul 14, 2020
8b95e74
finalize the user manual for MCMC
wangcj05 Jul 14, 2020
b179c21
change tune to burnIn
wangcj05 Jul 14, 2020
cc3d861
enable burnIn for SolutionExport
wangcj05 Jul 14, 2020
cafceee
add checks for distributions
wangcj05 Jul 14, 2020
e359266
update user manual
wangcj05 Jul 14, 2020
37740ce
fix extra getDistType in Grid
wangcj05 Jul 14, 2020
87b7177
update test description
wangcj05 Jul 15, 2020
b5b2b68
Merge branch 'devel' into wangc/mcmc
wangcj05 Jul 24, 2020
81dd7c8
add restart capability for Metropolis Sampler
wangcj05 Jul 24, 2020
3338b82
add gold csv for restart
wangcj05 Jul 24, 2020
71dd052
update user manual, accept log likelihood for MCMC
wangcj05 Jul 24, 2020
e8d7af2
update LOGOS submodule ID
wangcj05 Jul 24, 2020
8b43afe
update xsd
wangcj05 Jul 24, 2020
49 changes: 49 additions & 0 deletions developer_tools/XSDSchemas/Samplers.xsd
@@ -21,6 +21,7 @@
<xsd:element name="AdaptiveSobol" type="AdaptiveSobolSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="EnsembleForward" type="EnsembleForwardSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="AdaptiveMonteCarlo" type="AdaptiveMCSampler" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="Metropolis" type="MetropolisSampler" minOccurs="0" maxOccurs="unbounded"/>

</xsd:sequence>
<xsd:attribute name="verbosity" type="verbosityAttr" default="all"/>
@@ -526,4 +527,52 @@
<xsd:attribute name="mode" type="modeAttr" default="post"/>
<xsd:attribute name="updateGrid" type="RavenBool" default="true"/>
</xsd:complexType>

<!-- *********************************************************************** -->
<!-- Markov Chain Monte Carlo -->
<!-- *********************************************************************** -->
<xsd:complexType name="metropolisInitType">
<xsd:all>
<xsd:element name="limit" type="xsd:string" minOccurs="0"/>
<xsd:element name="initialSeed" type="xsd:integer" minOccurs="0"/>
<xsd:element name="burnIn" type="xsd:string" minOccurs="0"/>
</xsd:all>
</xsd:complexType>

<xsd:complexType name="metropolisVariableType">
<xsd:complexContent>
<xsd:extension base="variableType">
<xsd:sequence>
<xsd:element name="initial" type="xsd:string" minOccurs="0"/>
<xsd:element name="proposal" type="AssemblerObjectType" minOccurs="1" maxOccurs="1"/>
</xsd:sequence>
</xsd:extension>
</xsd:complexContent>
</xsd:complexType>

<xsd:simpleType name="stringBaseType">
<xsd:restriction base="xsd:string"/>
</xsd:simpleType>

<xsd:complexType name="likelihoodType">
<xsd:simpleContent>
<xsd:extension base="stringBaseType">
<xsd:attribute name="log" type="RavenBool"/>
</xsd:extension>
</xsd:simpleContent>
</xsd:complexType>

<xsd:complexType name="MetropolisSampler">
<xsd:sequence>
<xsd:element name="samplerInit" type="metropolisInitType" minOccurs="1"/>
<xsd:element name="likelihood" type="likelihoodType" minOccurs="1" maxOccurs="1"/>
<xsd:element name="variable" type="metropolisVariableType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="TargetEvaluation" type="AssemblerObjectType" minOccurs="1" maxOccurs="1"/>
<xsd:element name="constant" type="constantVarType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="Restart" type="AssemblerObjectType" minOccurs="0" maxOccurs="unbounded"/>
<xsd:element name="restartTolerance" type="xsd:float" minOccurs="0" maxOccurs="1"/>
</xsd:sequence>
<xsd:attribute name="name" type="xsd:string" use="required"/>
<xsd:attribute name="verbosity" type="verbosityAttr" default="all"/>
</xsd:complexType>
</xsd:schema>
127 changes: 127 additions & 0 deletions doc/user_manual/sampler.tex
@@ -3137,3 +3137,130 @@ \subsubsection{Adaptive Sobol Decomposition}
</Samplers>
\end{lstlisting}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%% Markov Chain Monte Carlo %%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
\subsection{Markov Chain Monte Carlo}
\label{subsec:MCMC}
Markov chain Monte Carlo (MCMC) is available as a Sampler entity in the RAVEN framework.
It provides enormous scope for realistic statistical modeling. MCMC is essentially
Monte Carlo integration using Markov chains. Bayesians, and sometimes also frequentists,
need to integrate over possibly high-dimensional probability distributions to make inferences
about model parameters or to make predictions. Bayesians need to integrate over the posterior
distributions of model parameters given the data, and frequentists may need to integrate
over the distribution of observables given parameter values. Monte Carlo integration draws
samples from the required distribution, and then forms sample averages to approximate expectations.
MCMC draws these samples by running a cleverly constructed Markov chain for a long time.
There are a large number of MCMC algorithms; popular families include Gibbs sampling,
Metropolis-Hastings, slice sampling, Hamiltonian Monte Carlo, and many others. Regardless
of the algorithm, the goal in Bayesian inference is to sample from the unnormalized joint
posterior distribution and to collect samples of the target distributions, which are the marginal
posterior distributions, later to be used for inference.
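
For reference, the sample-average approximation described above can be written as a single formula. The symbols $f$ (the quantity of interest), $\pi$ (the required distribution), $x_i$ (the samples) and $N$ (the number of samples) are notation introduced here for illustration and do not appear elsewhere in the manual. In MCMC the $x_i$ are successive states of the Markov chain, so they are correlated rather than independent.

```latex
\begin{equation}
  \mathrm{E}_{\pi}\left[f(X)\right]
  = \int f(x)\,\pi(x)\,dx
  \approx \frac{1}{N}\sum_{i=1}^{N} f(x_i),
  \qquad x_i \sim \pi .
\end{equation}
```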

\subsubsection{Metropolis (Metropolis-Hastings Sampler)}
\label{subsubsubsec:metropolis}
The Metropolis-Hastings (MH) algorithm is an MCMC method for obtaining a sequence of random samples from a probability
distribution from which direct sampling is difficult. This sequence can be used to approximate the distribution or
to compute an integral. The algorithm simulates from a probability distribution by making
use of the full joint density function and (independent) proposal distributions for each of
the variables of interest.
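
As a rough illustration of the idea, the following is a minimal random-walk Metropolis sketch in plain Python. It is not the RAVEN implementation: the names `metropolis`, `log_standard_normal`, `proposal_sigma` and `burn_in` are hypothetical, the proposal is assumed to be a zero-mean normal step applied independently to each variable, all variables are updated together in one step, and `limit` is assumed to count the samples kept after the burn-in period. Because the assumed proposal is symmetric, the acceptance ratio reduces to the ratio of target densities.

```python
import math
import random


def log_standard_normal(point):
    """Log of an unnormalized standard normal density over all variables."""
    return -0.5 * sum(value * value for value in point.values())


def metropolis(log_target, initial, proposal_sigma, limit, burn_in=0, seed=None):
    """Random-walk Metropolis sketch (hypothetical helper, not RAVEN code).

    log_target     -- maps a dict of variable values to the log of the
                      unnormalized target density
    initial        -- dict with the starting value of each variable
    proposal_sigma -- dict with the standard deviation of the zero-mean
                      normal proposal step of each variable (assumption)
    limit          -- number of samples returned (assumed to exclude burn-in)
    burn_in        -- number of initial samples that are discarded
    """
    rng = random.Random(seed)
    current = dict(initial)
    current_logp = log_target(current)
    kept = []
    for step in range(burn_in + limit):
        # Propose a move for every variable from its own independent proposal.
        candidate = {
            name: value + rng.gauss(0.0, proposal_sigma[name])
            for name, value in current.items()
        }
        candidate_logp = log_target(candidate)
        # Symmetric proposal: accept with probability min(1, target ratio).
        log_ratio = candidate_logp - current_logp
        if log_ratio >= 0.0 or rng.random() < math.exp(log_ratio):
            current, current_logp = candidate, candidate_logp
        # A rejected move repeats the current state in the chain.
        if step >= burn_in:
            kept.append(dict(current))
    return kept


if __name__ == "__main__":
    chain = metropolis(
        log_standard_normal,
        initial={"xin": 0.0, "yin": 0.0},
        proposal_sigma={"xin": 1.0, "yin": 1.0},
        limit=1000,
        burn_in=10,
        seed=70419,
    )
    mean_xin = sum(sample["xin"] for sample in chain) / len(chain)
    print(len(chain), mean_xin)
```

The sketch works with log densities throughout, which avoids numerical underflow when the target density is very small.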

\specBlock{a}{Metropolis}
%
\attrIntro

\begin{itemize}
\itemsep0em
\item \nameDescription
\end{itemize}

\variableIntro{Metropolis}

\begin{itemize}
\item \variableDescription
\variableChildrenIntro
\begin{itemize}
\item \distributionDescription
\item \functionDescription
\item \xmlNode{initial}, \xmlDesc{float, optional field}, specifies the initial value for the given variable.
\item \xmlNode{proposal}, \xmlDesc{Assembler Object}, specifies the proposal distribution for this variable.
This node must contain the following two attributes:
\begin{itemize}
\item \xmlAttr{class}, \xmlDesc{required string attribute}, the main
``class'' of the listed object. Only ``Distributions'' is allowed.
\item \xmlAttr{type}, \xmlDesc{required string attribute}, the object
identifier or sub-type.
\end{itemize}
\end{itemize}
\nb For the MCMC sampler, only ``continuous'' distributions are allowed as input to \xmlNode{variable}.
\item \constantVariablesDescription
\end{itemize}


In the \textbf{Metropolis} input block, the user needs to specify the variables to be sampled.
As already mentioned, these variables are specified within consecutive XML blocks called \xmlNode{variable}.
In addition, the settings for this sampler need to be specified in the \xmlNode{samplerInit} XML block:
\begin{itemize}
\item \xmlNode{samplerInit}, \xmlDesc{required field}. In this XML node, the following XML sub-nodes need to be specified:
\begin{itemize}
\item \xmlNode{limit}, \xmlDesc{integer, required field}, the number of Metropolis samples to be generated;
\item \xmlNode{initialSeed}, \xmlDesc{integer, optional field}, the initial seed of the random number generator;
\item \xmlNode{burnIn}, \xmlDesc{integer, optional field}, specifies the number of initial samples that are discarded.
\end{itemize}
\end{itemize}

In addition to the \xmlNode{variable} nodes, the main XML node
\xmlNode{Metropolis} needs to contain the following supplementary sub-nodes:

\begin{itemize}
\item \xmlNode{likelihood}, \xmlDesc{string, required node}, the output from the user-provided likelihood function.
This node accepts one attribute:
\begin{itemize}
\item \xmlAttr{log}, \xmlDesc{bool, optional field}, indicates whether the log likelihood value is
provided or not. When True, the code expects to receive the log likelihood value.
\default{`False'}
\end{itemize}
\item \assemblerDescription{Metropolis}
\constantSourceDescription{Metropolis}
\begin{itemize}
\item \xmlNode{TargetEvaluation}, \xmlDesc{string, required field},
represents the container where the system evaluations are stored.
%
From a practical point of view, this XML node must contain the name of
a data object defined in the \xmlNode{DataObjects} block (see
Section~\ref{sec:DataObjects}). The object here specified must be
input as \xmlNode{Output} in the Steps that employ this sampling strategy.
%
The Metropolis sampling accepts ``DataObjects'' of type
``PointSet'' only.
\end{itemize}
\restartDescription{Metropolis}
\end{itemize}
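
To illustrate why a log-likelihood option is useful, the sketch below computes a likelihood-ratio acceptance probability from either plain likelihood values or log-likelihood values. It is a hedged illustration only: the function name `acceptance_probability` is hypothetical, prior and proposal terms are left out for brevity, and nothing here is taken from the RAVEN source. The point is that differences of log likelihoods stay well defined when the likelihood itself would underflow to zero.

```python
import math


def acceptance_probability(new, old, log=False):
    """Return min(1, L_new / L_old) for a likelihood-only Metropolis step.

    new, old -- likelihood values, or log-likelihood values when log=True
    log      -- mirrors the meaning of the 'log' attribute (assumption)
    """
    if log:
        # Work with the difference of logs; cap at 0 so exp() never overflows.
        return math.exp(min(0.0, new - old))
    if old <= 0.0:
        # The current state has zero likelihood: always move away from it.
        return 1.0
    return min(1.0, new / old)


if __name__ == "__main__":
    # The same step expressed both ways gives the same probability.
    print(acceptance_probability(0.2, 0.4))
    print(acceptance_probability(math.log(0.2), math.log(0.4), log=True))
    # A likelihood of exp(-800) underflows to 0.0 in double precision,
    # but the corresponding log-likelihood values remain usable.
    print(acceptance_probability(-801.0, -800.0, log=True))
```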

Example:
\begin{lstlisting}[style=XML]
<Samplers>
...
<Metropolis name="Metropolis">
<samplerInit>
<limit>1000</limit>
<initialSeed>070419</initialSeed>
<burnIn>10</burnIn>
</samplerInit>
<likelihood log="False">zout</likelihood>
<variable name="xin">
<distribution>normal</distribution>
<initial>0</initial>
<proposal class="Distributions" type="Normal">normal</proposal>
</variable>
<variable name="yin">
<distribution>normal</distribution>
<initial>0</initial>
<proposal class="Distributions" type="Normal">normal</proposal>
<!-- <proposal>normal</proposal> -->
</variable>
<TargetEvaluation class="DataObjects" type="PointSet">outSet</TargetEvaluation>
</Metropolis>
...
</Samplers>
\end{lstlisting}