
Single-Site Random Walk Metropolis-Hastings

Random Walk Metropolis-Hastings is a simple, minimal MCMC inference method. Following the philosophy of most inference methods in Bean Machine, it is single-site by default, though multi-site inference patterns are also well supported. Random Walk Metropolis-Hastings follows the standard Metropolis-Hastings algorithm: it samples a value from a proposal distribution, then accepts or rejects that value according to the computed acceptance ratio. This is further detailed in the docs for Ancestral Metropolis-Hastings. This tutorial describes the proposal mechanism, describes adaptive Random Walk Metropolis-Hastings, and documents the API for the Random Walk Metropolis-Hastings algorithm.

Algorithm

Random Walk Metropolis-Hastings works on a single-site basis by proposing new values for a random variable that are close to its current value, for some notion of distance. As such, it is only defined for continuous random variables. How far a proposed value typically lies from the current value is determined by the proposal distribution, whose scale is a parameter that can be provided when configuring the inference method. For discrete random variables, a similar effect may be achieved, but custom proposers must be used instead.
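
To make the accept/reject mechanics concrete, here is a rough sketch (a generic illustration, not Bean Machine's internal implementation) of a random walk Metropolis-Hastings loop for a single continuous variable, using a Normal proposal:

import torch
import torch.distributions as dist

def random_walk_mh(log_prob, x0, step_size=1.0, num_samples=1000):
    # Generic single-variable random walk Metropolis-Hastings: propose a value
    # near the current one, then accept or reject it using the MH ratio.
    x = x0
    samples = []
    for _ in range(num_samples):
        forward = dist.Normal(x, step_size)      # q(x, .) centered at x
        x_new = forward.sample()
        reverse = dist.Normal(x_new, step_size)  # q(x', .) for the reverse move
        log_ratio = (log_prob(x_new) - log_prob(x)
                     + reverse.log_prob(x) - forward.log_prob(x_new))
        if torch.log(torch.rand(())) < log_ratio:
            x = x_new
        samples.append(x)
    return torch.stack(samples)

# Example: draw samples from a standard Normal target, starting at 0.
draws = random_walk_mh(dist.Normal(0.0, 1.0).log_prob, torch.tensor(0.0))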

The Random Walk Metropolis-Hastings algorithm has multiple proposers defined on different spaces, such as all real numbers, positive real numbers, or intervals of the real numbers. These proposers all have common properties used to propose a new value $x^\prime$ from a current value $x$. The proposal distribution $q(x, x^\prime)$ is constructed to satisfy the following properties:

$$\begin{aligned} \mathbb{E}[q(x, \cdot)] &= x \\ \mathbb{V}[q(x, \cdot)] &= \sigma^2 \end{aligned}$$

$\sigma$ is the parameter that may be provided when configuring the inference method, and it must be a fixed positive number. Larger values of $\sigma$ will cause the inference method to explore more non-local values for $X$. This may allow faster exploration of the posterior, but it may also cause lower-probability values to be proposed (and therefore rejected) more often.
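
For intuition, the following sketches (illustrative parameterizations using torch.distributions, not necessarily the exact proposers Bean Machine implements) show proposal distributions on two of these spaces that satisfy the mean and variance properties above:

import torch.distributions as dist

def real_space_proposal(x, sigma):
    # Real line: a Normal centered at the current value has mean x and
    # variance sigma^2.
    return dist.Normal(x, sigma)

def positive_real_proposal(x, sigma):
    # Positive reals: a Gamma with shape x^2 / sigma^2 and rate x / sigma^2
    # also has mean x and variance sigma^2, but never proposes negative values.
    return dist.Gamma(x**2 / sigma**2, x / sigma**2)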

Adaptive Random Walk Metropolis-Hastings

Selecting a good $\sigma$ value is important for efficient posterior exploration. However, it is often challenging for a user to select a good $\sigma$ value, as it requires a nuanced understanding of the posterior space. Consequently, Bean Machine provides an adaptive version of Random Walk Metropolis-Hastings, in which the inference engine automatically tunes the value of $\sigma$ during the first few samples of inference (known as the adaptation period).

The Random Walk Metropolis-Hastings algorithm is an example of the Bean Machine pattern for adaptive inference, which is enabled by passing the argument num_adaptive_samples in the call to infer(). This causes Bean Machine to run an adaptation phase at the beginning of inference for the provided number of samples. During this phase, Bean Machine will internally tweak the value of $\sigma$ in order to find the largest value that still results in a relatively low number of rejected proposals. Technically speaking, Random Walk adaptation will attempt to achieve an amortized acceptance rate of 0.234. How this value is chosen as the optimal acceptance rate is detailed in Optimal Scaling and Adaptive Markov Chain Monte Carlo.
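
As a rough illustration of the idea (not Bean Machine's internal code), a step size can be nudged toward the 0.234 target with a stochastic-approximation update on $\log \sigma$ after each proposal:

import math

def adapt_step_size(sigma, accept_prob, iteration, target_rate=0.234):
    # Robbins-Monro style update on log(sigma): increase the step size when
    # proposals are accepted more often than the target rate, and decrease it
    # otherwise. The decaying learning rate lets sigma settle over time.
    learning_rate = 1.0 / math.sqrt(iteration + 1)
    log_sigma = math.log(sigma) + learning_rate * (accept_prob - target_rate)
    return math.exp(log_sigma)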

Please note that samples taken during adaptation are not valid posterior samples, and so are not included by default in the MonteCarloSamples object returned from inference.

Usage

The following code snippet illustrates how to use the inference method. Here, step_size represents $\sigma$ from the algorithm above.

samples = bm.SingleSiteRandomWalk(
    step_size=2.0,
).infer(
    queries,
    observations,
    num_samples,
    num_adaptive_samples=500,
)

The parameters to infer are described below:

Name | Usage
--- | ---
queries | A List of @bm.random_variable targets to fit posterior distributions for.
observations | The Dict of observations. Each key is a random variable, and its value is the observed value for that random variable.
num_samples | Number of samples to build up distributions for the values listed in queries.
num_adaptive_samples | Number of adaptive samples taken at the start of inference to tune $\sigma$; these samples are not included in the returned results by default.
num_chains | Number of separate inference runs to use. Multiple chains can be used by diagnostics to verify inference ran correctly.
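
To see these pieces together, here is a hypothetical end-to-end sketch (the model, variable names, and data are illustrative, and assume beanmachine.ppl and torch are installed):

import beanmachine.ppl as bm
import torch
import torch.distributions as dist

# A toy model: infer the mean of a Normal from three observed values.
@bm.random_variable
def mu():
    return dist.Normal(0.0, 10.0)

@bm.random_variable
def y(i):
    return dist.Normal(mu(), 1.0)

observations = {
    y(0): torch.tensor(1.0),
    y(1): torch.tensor(1.2),
    y(2): torch.tensor(0.9),
}

samples = bm.SingleSiteRandomWalk(step_size=2.0).infer(
    queries=[mu()],
    observations=observations,
    num_samples=1000,
    num_chains=4,
    num_adaptive_samples=500,
)

mu_samples = samples[mu()]  # tensor of shape (num_chains, num_samples)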