17  Inverse Transform Sampling

In the previous section, we studied how the distribution of a random variable changes when we apply a transformation, focusing on how probability density is redistributed under monotone and non‑monotone mappings. In simulation, however, we often face the inverse problem: rather than starting with a random variable and determining the distribution of its transformation, we want to construct a random variable with a specified distribution in the first place. Inverse transform sampling provides a direct solution to this problem by reversing the logic of transformation.

Starting from a Uniform(0,1) random variable—the fundamental output of most random number generators—we apply an appropriate inverse transformation to obtain samples from a desired distribution. This method makes explicit the connection between transformation theory and simulation practice, and it highlights why understanding how probability behaves under transformations is essential for building simulation algorithms.

We have already seen examples of this idea in practice, such as generating Bernoulli and Exponential random variables from Uniform(0,1) values; inverse transform sampling now formalises this approach and explains why it works in general.

17.1 Discrete Case (Precision Method)

The precision method is simple and widely used for simulating discrete distributions with a small number of possible outcomes. Its main advantages are:

  • Conceptually straightforward
  • Easy to implement
  • Works for any discrete distribution

However, when the number of outcomes becomes very large, more efficient algorithms may be required. In practice, many statistical software packages implement optimised methods internally, but the precision method remains important for understanding how discrete random variables are generated in simulation.

Suppose \(X\) is a discrete random variable that can take values

\[ x_1, x_2, \ldots, x_k \]

with corresponding probabilities

\[ P(X=x_i)=p_i, \qquad i=1,2,\ldots,k, \]

where

\[ p_i \ge 0, \qquad \sum_{i=1}^{k} p_i = 1. \]

Our goal is to generate simulated values of \(X\) so that the outcomes occur with exactly these probabilities.

The precision method works by partitioning the unit interval \((0,1)\) into segments whose lengths correspond to the probabilities \(p_1, p_2, \ldots, p_k\).

Define the cumulative probabilities

\[ F(x_i) = \sum_{j=1}^{i} p_j, \qquad i=1,\ldots,k. \]

These cumulative values divide the interval \((0,1)\) into \(k\) subintervals:

\[ (0, F(x_1)],\quad (F(x_1), F(x_2)],\quad \ldots,\quad (F(x_{k-1}), F(x_k)]. \]

Because \(F(x_k)=1\), these intervals cover the entire unit interval.

The simulation procedure is then straightforward:

  1. Generate a random number \(U \sim \text{Uniform}(0,1)\).
  2. Determine which interval contains \(U\).
  3. Return the corresponding value of \(X\).

Thus,

\[ X = \begin{cases} x_1 & \text{if } 0 < U \le F(x_1) \\ x_2 & \text{if } F(x_1) < U \le F(x_2) \\ \vdots \\ x_k & \text{if } F(x_{k-1}) < U \le 1 \end{cases} \]

This method guarantees that the probability of selecting \(x_i\) is exactly \(p_i\).

Algorithm: Precision Method

The precision method can be implemented using the following steps.

  1. Specify the possible values \(x_1,\ldots,x_k\) and their probabilities \(p_1,\ldots,p_k\).
  2. Compute the cumulative probabilities

\[ F(x_i) = \sum_{j=1}^{i} p_j. \]

  1. Generate a random number \(U \sim \text{Uniform}(0,1)\).
  2. Find the smallest \(i\) such that

\[ U \le F(x_i). \]

  1. Output \(X = x_i\).

Example: Rolling a Loaded Die

Suppose a die has the following probabilities:

Outcome \(x\) 1 2 3 4 5 6
\(P(X=x)\) 0.10 0.15 0.20 0.25 0.20 0.10

First compute the cumulative distribution:

\(x\) \(p(x)\) \(F(x)\)
1 0.10 0.10
2 0.15 0.25
3 0.20 0.45
4 0.25 0.70
5 0.20 0.90
6 0.10 1.00

This corresponds to the following intervals:

Interval Outcome
\(0 < U \le 0.10\) 1
\(0.10 < U \le 0.25\) 2
\(0.25 < U \le 0.45\) 3
\(0.45 < U \le 0.70\) 4
\(0.70 < U \le 0.90\) 5
\(0.90 < U \le 1.00\) 6

For example:

  • If \(U = 0.18\), then \(X=2\).
  • If \(U = 0.63\), then \(X=4\).
  • If \(U = 0.92\), then \(X=6\).
x <- c(1,2,3,4,5,6)
p <- c(0.10,0.15,0.20,0.25,0.20,0.10)

F <- cumsum(p)   # cumulative probabilities

simulate_die <- function() {
  u <- runif(1)
  return(x[which(u <= F)[1]])
}

replicate(10, simulate_die())
 [1] 5 2 4 4 3 5 1 3 5 5
samples <- replicate(10000, simulate_die())

table(samples) / 10000
samples
     1      2      3      4      5      6 
0.1008 0.1503 0.1899 0.2575 0.2005 0.1010 

17.2 Continuous Case

A key theoretical result underlying inverse transform sampling is the Probability Integral Transform. The probability integral transform explains why uniform random variables play a fundamental role in simulation and provides the theoretical justification for constructing new random variables by transforming Uniform(0,1) draws.

For inverse transform sampling, we work directly with the cumulative distribution function. If \(X\) is a continuous random variable with cumulative distribution function \(F_X\), then the transformed variable: \[ U = F_X(X) \sim \text{Uniform}(0,1). \]

Turning it around gives the sampling rule:

\[X = F_X^{-1}(U), \qquad U \sim \text{Uniform}(0,1).\]

This result allows us to turn the theoretical relationship between a random variable and its CDF into a practical simulation algorithm.

General Algorithm

The probability integral transform provides the theoretical justification for inverse transform sampling. We now turn this result into a practical simulation algorithm. The goal is to generate random variables with a specified distribution using only Uniform(0,1) random numbers.

Suppose we want to simulate a continuous random variable with cumulative distribution function \(F_X\), and assume that the inverse CDF \(F_X^{-1}\) is available in closed form. The inverse transform sampling algorithm proceeds as follows:

  1. Generate a random number \[ U \sim \text{Uniform}(0,1). \]

  2. Transform this value using the inverse CDF: \[ X = F_X^{-1}(U). \]

  3. The resulting value \(X\) is a random draw from the target distribution with CDF \(F_X\).

This algorithm works because the inverse transformation exactly reverses the probability integral transform. Since \(F_X(X)\) is uniformly distributed on \([0,1]\), applying the inverse CDF to a Uniform(0,1) random variable reconstructs the original distribution. In this sense, inverse transform sampling is a direct application of transformation theory: it constructs a desired distribution by transforming a simpler one.

Inverse transform sampling is conceptually simple and widely used in simulation, particularly when the inverse CDF has a closed‑form expression. However, not all distributions admit an explicit inverse CDF, which motivates alternative simulation methods introduced later in the course.

We now illustrate this algorithm with a simple example where the inverse CDF can be derived explicitly.

Examples and Limitations

Example (Exponential distribution).

We briefly previewed this idea earlier; here we are using it as a worked example of the general inverse transform algorithm.

Inverse transform sampling is easiest to see when the inverse CDF has a simple closed form. For the Exponential distribution with rate \(\lambda\), the CDF is

\[ F(x)=1-e^{-\lambda x},\qquad x\ge 0. \]

Setting \(U=F(X)\) and solving for \(X\) gives the inverse transform

\[ X = F^{-1}(U)= -\frac{1}{\lambda}\log(1-U),\qquad U\sim \text{Uniform}(0,1). \]

The following R code implements the algorithm in three steps—generate uniforms, apply the inverse CDF, then validate the result by comparing the simulated histogram to rexp(). This mirrors the core simulation workflow:

set.seed(123)

# Step 1: generate Uniform(0,1) values
u <- runif(1000)

# Step 2: transform them into Exponential(lambda)
lambda <- 2   # rate parameter
x <- -log(1 - u) / lambda

# Step 3: Plot and compare with R's built-in rexp()
x_builtin <- rexp(1000, rate = lambda)
hist(x, breaks = 30, col = rgb(0,0,1,0.4), freq = FALSE,
     main = "Comparison of PDF of exponential distribution: Transform vs rexp()")
hist(x_builtin, breaks = 30, col = rgb(1,0,0,0.4), freq = FALSE, add = TRUE)
legend("topright", legend = c("Transform", "rexp()"),
       fill = c(rgb(0,0,1,0.4), rgb(1,0,0,0.4)))

Limitations (why we need other methods).
Inverse transform sampling is conceptually simple, but it has practical limitations:

  1. You need an invertible CDF (or a tractable numerical inverse).
    Many common distributions do not have a neat closed-form inverse CDF, so applying \(F^{-1}\) directly may be difficult or computationally expensive.

  2. Not all transformations are “one-line” like the exponential.
    Even when the CDF exists and is monotone, solving \(u = F(x)\) for \(x\) may not be algebraically possible.

  3. This motivates alternative sampling techniques.
    When \(F^{-1}\) is unavailable or inefficient, we use other approaches (e.g., acceptance–rejection or specialised transformations such as Box–Muller for Normal simulation).