Monte Carlo Integration & MCMC

Monte Carlo methods are among the most important computational tools in modern statistics, science, engineering, and machine learning. Many real-world systems are too complex to analyse exactly using algebra or calculus alone. Instead of seeking closed-form solutions, Monte Carlo methods use repeated random sampling to approximate quantities of interest such as probabilities, expectations, integrals, and predictive outcomes. From estimating the reliability of engineering systems to modelling financial risk and performing Bayesian inference, Monte Carlo simulation has become a fundamental approach to computational problem solving.

In earlier chapters, we developed the foundations necessary for simulation studies: generating random variables, modelling stochastic systems, simulating multivariate dependence, and analysing random processes. Throughout the unit, simulation has relied on a central principle:

if we can generate samples from a probability distribution, then repeated sampling allows us to approximate otherwise difficult mathematical quantities.

This chapter extends that idea further by introducing Monte Carlo integration and Markov chain Monte Carlo (MCMC) methods.

The chapter begins with Monte Carlo integration, where integration problems are reformulated as expectations and approximated through random sampling. We revisit the Law of Large Numbers and the Central Limit Theorem, examined in Chapter 1, to explain why Monte Carlo estimators work and how their accuracy improves with increasing sample size. Because naive Monte Carlo methods can be inefficient, particularly for rare-event estimation or high-dimensional problems, we also introduce the intuition behind variance reduction and importance sampling.
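
As a preview of this idea, the following minimal sketch estimates the integral of exp(-x^2) over [0, 1] by writing it as the expectation E[g(U)] with U uniform on (0, 1). The integrand, the sample size, the seed, and the function name mc_integrate are illustrative assumptions rather than examples taken from this unit. The standard error uses the usual Central Limit Theorem approximation, so the interval is approximate.

```python
import math
import random


def mc_integrate(g, n, rng):
    """Estimate the integral of g over [0, 1] as E[g(U)], U ~ Uniform(0, 1).

    Returns the sample mean and its estimated standard error.
    """
    values = [g(rng.random()) for _ in range(n)]
    mean = sum(values) / n
    # Unbiased sample variance of g(U).
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    # CLT: the estimator's standard deviation shrinks like 1 / sqrt(n).
    std_err = math.sqrt(var / n)
    return mean, std_err


rng = random.Random(0)  # fixed seed so the run is reproducible
estimate, std_err = mc_integrate(lambda x: math.exp(-x * x), 100_000, rng)

# Approximate 95% confidence interval from the normal approximation.
ci = (estimate - 1.96 * std_err, estimate + 1.96 * std_err)
print(f"estimate = {estimate:.4f}, 95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

Because the standard error scales like 1/sqrt(n), quadrupling the sample size in this sketch should roughly halve the width of the interval.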

A major challenge in practice is that direct sampling from complicated probability distributions is often impossible. This issue arises frequently in Bayesian statistics, where posterior distributions may be known only up to a normalising constant. To overcome this difficulty, the chapter introduces Markov chain Monte Carlo (MCMC) methods, which generate dependent samples using carefully constructed Markov chains. Rather than sampling directly from the target distribution, MCMC methods simulate a stochastic process whose long-run behaviour converges to the desired distribution.
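
To make this concrete, here is a minimal sketch of one such construction, a random-walk Metropolis sampler. The target exp(-x^2 / 2) is an assumed toy example (a standard normal with its constant dropped), chosen only because the answer is known; the step size, chain length, burn-in, and function names are likewise illustrative assumptions. Note that the acceptance rule uses only a ratio of target values, so the missing constant cancels.

```python
import math
import random


def log_unnormalised_target(x):
    """Log of exp(-x^2 / 2): a standard normal without its constant."""
    return -0.5 * x * x


def random_walk_metropolis(log_target, n_steps, step_size, x0, rng):
    """Simulate a Markov chain whose long-run distribution is the target."""
    x = x0
    log_p = log_target(x)
    chain = []
    for _ in range(n_steps):
        # Symmetric proposal: a Gaussian step around the current state.
        proposal = x + rng.gauss(0.0, step_size)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, target(proposal) / target(x)).
        # 1 - rng.random() lies in (0, 1], so the logarithm is defined.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        chain.append(x)  # a rejected move repeats the current state
    return chain


rng = random.Random(1)
chain = random_walk_metropolis(log_unnormalised_target, 50_000, 2.4, 5.0, rng)

kept = chain[5_000:]  # discard early draws influenced by the start at 5.0
mean = sum(kept) / len(kept)
var = sum((v - mean) ** 2 for v in kept) / (len(kept) - 1)
print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

The draws are dependent, so they carry less information than the same number of independent samples, but their long-run mean and variance should approach 0 and 1 for this target.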

Particular attention is given to the Gibbs sampler, one of the most widely used MCMC algorithms. Using conditional distributions, the Gibbs sampler allows complex multivariate distributions to be explored through simpler iterative updates. We also introduce the Metropolis and Metropolis–Hastings algorithms, which extend MCMC ideas to situations where conditional sampling is not straightforward. Practical examples involving regression and Bayesian modelling demonstrate how these methods are applied in modern statistical computation.
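
The iterative-update idea can be sketched on an assumed toy target: a standard bivariate normal with correlation rho, for which each full conditional is itself normal, X | Y = y ~ N(rho * y, 1 - rho^2), and symmetrically for Y. The target, the value rho = 0.8, the burn-in length, and the function name are illustrative assumptions, not the examples developed later in the chapter.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_steps, rng, x0=0.0, y0=0.0):
    """Gibbs sampler for a standard bivariate normal with correlation rho."""
    cond_sd = math.sqrt(1.0 - rho * rho)  # sd of each full conditional
    x, y = x0, y0
    samples = []
    for _ in range(n_steps):
        x = rng.gauss(rho * y, cond_sd)  # draw X given the current Y
        y = rng.gauss(rho * x, cond_sd)  # draw Y given the updated X
        samples.append((x, y))
    return samples


rng = random.Random(2)
rho = 0.8
draws = gibbs_bivariate_normal(rho, 21_000, rng)[1_000:]  # drop burn-in

# Sample correlation of the retained draws, computed by hand.
n = len(draws)
mx = sum(x for x, _ in draws) / n
my = sum(y for _, y in draws) / n
sxy = sum((x - mx) * (y - my) for x, y in draws)
sxx = sum((x - mx) ** 2 for x, _ in draws)
syy = sum((y - my) ** 2 for _, y in draws)
sample_corr = sxy / math.sqrt(sxx * syy)
print(f"sample correlation = {sample_corr:.3f} (target {rho})")
```

Each update only requires sampling from a one-dimensional normal, yet the pairs jointly recover the dependence between the two components.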

After completing this chapter, you should be able to:

explain the relationship between Monte Carlo simulation and numerical integration;
construct Monte Carlo estimators and interpret their accuracy;
understand the role of the Central Limit Theorem in Monte Carlo error analysis;
describe the motivation for variance reduction and importance sampling;
explain the key ideas underlying Markov chain Monte Carlo methods;
implement and interpret Gibbs sampling algorithms;
understand the intuition behind Metropolis and Metropolis–Hastings algorithms; and
apply MCMC methods to simple Bayesian modelling problems.

This chapter represents a transition from classical simulation toward modern computational statistics. It highlights how randomness, repetition, and stochastic modelling can be combined to solve problems that are analytically intractable, forming the foundation of many contemporary statistical and machine learning methods.