Sampling Methods
Summary
Rejection Sampling: Key Takeaways
Concept:
- Sample from easy proposal $q(x)$
- Accept with probability $\frac{p(x)}{M \cdot q(x)}$
- Accepted samples follow target $p(x)$
Advantages:
- ✓ Simple to understand and implement
- ✓ Produces independent samples
- ✓ Works for unnormalized targets
- ✓ No tuning required (beyond choosing proposal)
Disadvantages:
- ✗ Requires suitable proposal with $M \cdot q(x) \geq p(x)$
- ✗ Acceptance rate $\frac{1}{M}$ can be very low
- ✗ Curse of dimensionality (exponential degradation)
- ✗ Wastes computation on rejected samples
When to Use:
- Low-dimensional problems (1D, 2D)
- When a good proposal is available
- Educational purposes
- As a component in more advanced methods
Alternatives for Complex Problems:
- Importance Sampling: Weight samples instead of rejecting
- Markov Chain Monte Carlo (MCMC): Metropolis-Hastings, Gibbs sampling
- Variational Inference: Optimization-based approximation
- Normalizing Flows: Learn invertible transformations
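Of these alternatives, importance sampling is the nearest relative of rejection sampling: instead of discarding draws, it keeps every draw and reweights it by $p(x)/q(x)$. A minimal sketch for the Beta(2, 5) target with a uniform proposal (variable names here are illustrative, not shared with the other cells):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Target: Beta(2, 5); proposal: Uniform(0, 1), so q(x) = 1 on [0, 1]
target_pdf = lambda x: stats.beta.pdf(x, 2, 5)

# Draw from the proposal and reweight by p(x)/q(x) instead of rejecting
x = rng.uniform(0, 1, size=100_000)
w = target_pdf(x) / 1.0

# Self-normalized importance-sampling estimate of E[x]
estimate = np.sum(w * x) / np.sum(w)
print(f"IS estimate of the mean: {estimate:.4f}  (true mean: {2/7:.4f})")
```

Every proposal draw contributes to the estimate, so nothing is wasted; the price is that uneven weights inflate the variance when $q$ matches $p$ poorly.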
def rejection_sampling_nd(dim, n_samples=1000):
    """
    Rejection sampling for a d-dimensional standard normal using a uniform proposal.

    Target:   p(x) = (2π)^(-d/2) exp(-||x||²/2)   (standard normal)
    Proposal: q(x) = (2R)^(-d)                    (uniform on the hypercube [-R, R]^d)
    """
    R = 3  # Hypercube half-width
    # For the standard normal, the maximum density is at the origin,
    # so M = max[p(x)/q(x)] occurs at x = 0
    p_max = (2 * np.pi) ** (-dim / 2)  # p(0) for the standard normal
    q_uniform = (2 * R) ** (-dim)      # uniform density
    M = p_max / q_uniform
    samples = []
    n_proposals = 0
    while len(samples) < n_samples:
        # Sample from the uniform proposal
        x = np.random.uniform(-R, R, size=dim)
        n_proposals += 1
        # Compute the target density
        p_x = (2 * np.pi) ** (-dim / 2) * np.exp(-0.5 * np.sum(x**2))
        # Accept/reject
        accept_prob = p_x / (M * q_uniform)
        if np.random.uniform() <= accept_prob:
            samples.append(x)
    acceptance_rate = n_samples / n_proposals
    return np.array(samples), acceptance_rate, M
# Test different dimensions
dimensions = [1, 2, 3, 5, 7, 10]
dim_results = []
print("Dimension | Acceptance Rate | M (envelope constant)")
print("-" * 65)
for d in dimensions:
    _, acc_rate, M_val = rejection_sampling_nd(d, n_samples=500)
    dim_results.append({'dim': d, 'acc_rate': acc_rate, 'M': M_val})
    print(f"{d:9d} | {acc_rate:15.2%} | {M_val:21.2e}")
# Plot results
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
dims = [r['dim'] for r in dim_results]
acc_rates = [r['acc_rate'] for r in dim_results]
Ms = [r['M'] for r in dim_results]
# Acceptance rate vs dimension
axes[0].plot(dims, acc_rates, 'bo-', linewidth=2, markersize=8)
axes[0].set_xlabel('Dimension', fontsize=12)
axes[0].set_ylabel('Acceptance Rate', fontsize=12)
axes[0].set_title('Curse of Dimensionality: Acceptance Rate', fontsize=13, fontweight='bold')
axes[0].grid(True, alpha=0.3)
axes[0].set_yscale('log')
axes[0].set_ylim(bottom=1e-5)
# M value vs dimension
axes[1].semilogy(dims, Ms, 'ro-', linewidth=2, markersize=8)
axes[1].set_xlabel('Dimension', fontsize=12)
axes[1].set_ylabel('M (log scale)', fontsize=12)
axes[1].set_title('Envelope Constant Grows Exponentially', fontsize=13, fontweight='bold')
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
print("\n" + "="*65)
print("Conclusion: Rejection sampling becomes impractical in high dimensions!")
print("="*65)
Demonstration: Curse of Dimensionality
Let's demonstrate how the acceptance rate degrades as the dimension grows.
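Before running the experiment, the degradation can be predicted in closed form for this setup (standard normal target, uniform proposal on $[-R, R]^d$ with $R = 3$, as in rejection_sampling_nd). Since $M = p(0)/q(x) = (2\pi)^{-d/2} (2R)^d$, the acceptance rate is approximately

$$\frac{1}{M} = \left(\frac{\sqrt{2\pi}}{2R}\right)^{d} \approx 0.418^{d}$$

(ignoring the small normal mass outside the cube), so each added dimension multiplies the acceptance rate by roughly $0.42$; by $d = 10$ it is already about $1.6 \times 10^{-4}$.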
Limitations of Rejection Sampling
While simple and elegant, rejection sampling has significant limitations:
1. **Curse of Dimensionality**
In high dimensions, finding a suitable proposal becomes extremely difficult:
- The volume of the "rejection region" grows exponentially
- Acceptance rate $\frac{1}{M}$ decreases exponentially with dimension
- Essentially unusable beyond ~20 dimensions
2. **Requires Bounded Ratio**
We need $M \cdot q(x) \geq p(x)$ for all $x$, which requires:
- $p(x)$ has lighter tails than $q(x)$
- Difficult when $p(x)$ has heavy tails
3. **Wastes Computation**
- Many samples are rejected
- All evaluations of $p(x)$ for rejected samples are wasted
- Inefficient when $p(x)$ is expensive to evaluate
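Point 2 can be made concrete: take a Cauchy target and a standard normal proposal. Because the Cauchy has heavier tails, the ratio $p(x)/q(x)$ is unbounded, so no finite $M$ exists. A quick numeric check (an illustrative setup, separate from the examples in this notebook):

```python
import numpy as np
from scipy import stats

# Cauchy target (heavy tails) vs. standard normal proposal (light tails)
x = np.array([0.0, 2.0, 5.0, 10.0])
ratio = stats.cauchy.pdf(x) / stats.norm.pdf(x)
for xi, r in zip(x, ratio):
    print(f"x = {xi:5.1f}   p(x)/q(x) = {r:.3e}")
# The ratio grows without bound as |x| increases,
# so no finite M can satisfy M*q(x) >= p(x) everywhere
```

The fix is to choose a proposal with tails at least as heavy as the target's, e.g. a Cauchy or Student-t proposal for a heavy-tailed target.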
When to Use Rejection Sampling
Best suited for:
- Low dimensions (1D, 2D)
- Simple distributions with known bounds
- Teaching purposes to understand Monte Carlo principles
- When a good proposal is readily available
# Visualize comparison
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
x_plot = np.linspace(0, 1, 500)
target_vals = target_pdf(x_plot)
for idx, (name, result) in enumerate(results.items()):
    ax = axes[idx // 2, idx % 2]
    # Plot target and proposal envelope
    prop_vals = proposals[name]['pdf'](x_plot)
    ax.plot(x_plot, target_vals, 'b-', linewidth=2.5, label='Target: Beta(2,5)')
    ax.plot(x_plot, result['M'] * prop_vals, 'r--', linewidth=2,
            label=f"M·{name} (M={result['M']:.2f})")
    ax.fill_between(x_plot, 0, target_vals, alpha=0.3, color='blue')
    ax.fill_between(x_plot, target_vals, result['M'] * prop_vals,
                    alpha=0.2, color='red', label='Rejection region')
    ax.set_xlabel('x', fontsize=11)
    ax.set_ylabel('Density', fontsize=11)
    ax.set_title(f'Proposal: {name}', fontsize=12, fontweight='bold')
    ax.legend(fontsize=9)
    ax.grid(True, alpha=0.3)
    # Add acceptance-rate annotation
    ax.text(0.65, ax.get_ylim()[1] * 0.85,
            f'Accept: {result["acceptance_rate"]:.1%}',
            bbox=dict(boxstyle='round', facecolor='yellow', alpha=0.6),
            fontsize=10)
# Remove unused subplot
axes[1, 1].axis('off')
plt.tight_layout()
plt.show()
print("\n" + "="*60)
print("Key Insight: Better proposals (closer to target) have:")
print(" - Lower M (tighter envelope)")
print(" - Higher acceptance rate (less waste)")
print("="*60)
# Compare different proposals for Beta(2,5)
proposals = {
    'Uniform(0,1)': {
        'sample': lambda: np.random.uniform(0, 1),
        'pdf': lambda x: np.ones_like(x) if isinstance(x, np.ndarray) else 1.0
    },
    'Beta(2,2)': {
        'sample': lambda: np.random.beta(2, 2),
        'pdf': lambda x: stats.beta.pdf(x, 2, 2)
    },
    'Beta(2,4)': {
        'sample': lambda: np.random.beta(2, 4),
        'pdf': lambda x: stats.beta.pdf(x, 2, 4)
    }
}
results = {}
x_eval = np.linspace(0.001, 0.999, 1000)
for name, prop in proposals.items():
    # Compute M on a grid, with a small safety margin
    M_val = np.max(target_pdf(x_eval) / prop['pdf'](x_eval)) * 1.05
    # Sample
    samples_prop, acc_rate = rejection_sampling(
        target_pdf, prop['sample'], prop['pdf'], M_val, n_samples=2000
    )
    results[name] = {
        'M': M_val,
        'acceptance_rate': acc_rate,
        'samples': samples_prop
    }
    print(f"{name:15} | M = {M_val:6.3f} | Acceptance rate = {acc_rate:6.2%}")
Impact of Proposal Choice
The choice of proposal distribution $q(x)$ critically affects efficiency. Let's compare different proposals for the same target.
# Visualize mixture sampling
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
x_plot = np.linspace(-5, 5, 500)
target_vals = target_mixture(x_plot)
proposal_vals = proposal_pdf_gauss(x_plot)
# Left: Show envelope
axes[0].plot(x_plot, target_vals, 'b-', linewidth=2, label='Target (Mixture)')
axes[0].plot(x_plot, M_gauss * proposal_vals, 'r--', linewidth=2,
             label=f'M·Proposal (M={M_gauss:.2f})')
axes[0].fill_between(x_plot, 0, target_vals, alpha=0.3, color='blue')
axes[0].set_xlabel('x', fontsize=12)
axes[0].set_ylabel('Density', fontsize=12)
axes[0].set_title('Mixture of Gaussians: Proposal Envelope', fontsize=13, fontweight='bold')
axes[0].legend(fontsize=10)
axes[0].grid(True, alpha=0.3)
axes[0].set_xlim(-5, 5)
# Right: Results
axes[1].hist(samples_mixture, bins=60, density=True, alpha=0.6, color='skyblue',
             edgecolor='black', label='Samples')
axes[1].plot(x_plot, target_vals, 'b-', linewidth=2, label='True Distribution')
axes[1].set_xlabel('x', fontsize=12)
axes[1].set_ylabel('Density', fontsize=12)
axes[1].set_title('Rejection Sampling: Mixture Results', fontsize=13, fontweight='bold')
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.3)
axes[1].set_xlim(-5, 5)
plt.tight_layout()
plt.show()
# Analyze the bimodality
print(f"\nBimodal distribution analysis:")
print(f"Proportion of samples < 0: {np.mean(samples_mixture < 0):.2%}")
print(f"Proportion of samples > 0: {np.mean(samples_mixture > 0):.2%}")
print(f"(True proportions: 30% at -2, 70% at +2)")
# Define mixture of Gaussians target
def target_mixture(x):
    """Mixture: 0.3·N(-2, 0.5²) + 0.7·N(2, 0.8²)"""
    component1 = 0.3 * stats.norm.pdf(x, loc=-2, scale=0.5)
    component2 = 0.7 * stats.norm.pdf(x, loc=2, scale=0.8)
    return component1 + component2
# Use a wide Gaussian as proposal
proposal_mean = 0
proposal_std = 3
proposal_sample_gauss = lambda: np.random.normal(proposal_mean, proposal_std)
proposal_pdf_gauss = lambda x: stats.norm.pdf(x, loc=proposal_mean, scale=proposal_std)
# Find M by evaluating on a grid
x_range = np.linspace(-5, 5, 2000)
ratio = target_mixture(x_range) / proposal_pdf_gauss(x_range)
M_gauss = np.max(ratio) * 1.1 # Add 10% safety margin
print(f"M for Gaussian proposal = {M_gauss:.4f}")
# Run rejection sampling with Gaussian proposal
samples_mixture, accept_rate_gauss = rejection_sampling(
    target_mixture, proposal_sample_gauss, proposal_pdf_gauss, M_gauss, n_samples=5000
)
print(f"Acceptance rate with Gaussian proposal: {accept_rate_gauss:.2%}")
Example 2: Sampling from a Mixture of Gaussians
A more challenging example: sample from a bimodal distribution (mixture of two Gaussians).
Target distribution:
$$p(x) = 0.3 \cdot \mathcal{N}(x | -2, 0.5^2) + 0.7 \cdot \mathcal{N}(x | 2, 0.8^2)$$
# Visualize rejection sampling
fig, axes = plt.subplots(1, 2, figsize=(14, 5))
# Left: Show proposal envelope
x_plot = np.linspace(0, 1, 500)
target_vals = target_pdf(x_plot)
proposal_vals = proposal_pdf(x_plot)
axes[0].plot(x_plot, target_vals, 'b-', linewidth=2, label='Target: Beta(2,5)')
axes[0].plot(x_plot, M * proposal_vals, 'r--', linewidth=2, label=f'M·Proposal (M={M:.2f})')
axes[0].fill_between(x_plot, 0, target_vals, alpha=0.3, color='blue')
axes[0].fill_between(x_plot, target_vals, M * proposal_vals, alpha=0.2, color='red')
axes[0].set_xlabel('x', fontsize=12)
axes[0].set_ylabel('Density', fontsize=12)
axes[0].set_title('Rejection Sampling: Proposal Envelope', fontsize=13, fontweight='bold')
axes[0].legend(fontsize=10)
axes[0].grid(True, alpha=0.3)
axes[0].text(0.6, 1.5, f'Acceptance Rate: {accept_rate:.1%}',
             bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5), fontsize=10)
# Right: Compare samples to true distribution
axes[1].hist(samples, bins=40, density=True, alpha=0.6, color='skyblue',
             edgecolor='black', label='Samples (histogram)')
axes[1].plot(x_plot, target_vals, 'b-', linewidth=2, label='True Beta(2,5)')
axes[1].set_xlabel('x', fontsize=12)
axes[1].set_ylabel('Density', fontsize=12)
axes[1].set_title('Rejection Sampling Results', fontsize=13, fontweight='bold')
axes[1].legend(fontsize=10)
axes[1].grid(True, alpha=0.3)
plt.tight_layout()
plt.show()
# Compute statistics
print(f"\nSample statistics:")
print(f"Sample mean: {np.mean(samples):.4f}")
print(f"True mean: {alpha/(alpha+beta):.4f}")
print(f"Sample std: {np.std(samples):.4f}")
print(f"True std: {np.sqrt(alpha*beta/((alpha+beta)**2*(alpha+beta+1))):.4f}")
def rejection_sampling(target_pdf, proposal_sample, proposal_pdf, M, n_samples=1000):
    """
    Rejection sampling algorithm.

    Args:
        target_pdf: Target density p(x) (may be unnormalized)
        proposal_sample: Function drawing one sample from the proposal q(x)
        proposal_pdf: Proposal density function q(x)
        M: Envelope constant such that M*q(x) >= p(x) for all x
        n_samples: Number of samples to generate

    Returns:
        samples: Accepted samples from the target distribution
        acceptance_rate: Fraction of proposals that were accepted
    """
    samples = []
    n_proposals = 0
    while len(samples) < n_samples:
        # Sample from the proposal
        x = proposal_sample()
        n_proposals += 1
        # Compute the acceptance probability
        accept_prob = target_pdf(x) / (M * proposal_pdf(x))
        # Accept/reject
        u = np.random.uniform(0, 1)
        if u <= accept_prob:
            samples.append(x)
    acceptance_rate = n_samples / n_proposals
    return np.array(samples), acceptance_rate
# Define target: Beta(2, 5) distribution
alpha, beta = 2, 5
target_pdf = lambda x: stats.beta.pdf(x, alpha, beta)
# Define proposal: Uniform(0, 1)
proposal_sample = lambda: np.random.uniform(0, 1)
proposal_pdf = lambda x: 1.0 # Uniform on [0,1]
# Find M: maximum of p(x) / q(x)
x_grid = np.linspace(0, 1, 1000)
M = np.max(target_pdf(x_grid) / proposal_pdf(x_grid))
print(f"M = {M:.4f}")
# Run rejection sampling
samples, accept_rate = rejection_sampling(
    target_pdf, proposal_sample, proposal_pdf, M, n_samples=5000
)
print(f"\nAcceptance rate: {accept_rate:.2%}")
print(f"Theoretical acceptance rate: {1/M:.2%}")
print(f"Number of samples: {len(samples)}")
Example 1: Sampling from a Beta Distribution
Let's sample from a $\text{Beta}(2, 5)$ distribution using a uniform proposal.
Rejection Sampling
Rejection sampling (also called **accept-reject sampling**) is a fundamental Monte Carlo method for sampling from a target distribution $p(x)$ that may be difficult to sample from directly.
The Problem
We want to sample from a target distribution $p(x)$, but:
- We can only evaluate $p(x)$ up to a normalizing constant: $p(x) = \frac{\tilde{p}(x)}{Z_p}$ where $Z_p$ is unknown
- We cannot sample directly from $p(x)$
The Solution
Use a proposal distribution $q(x)$ that is easy to sample from, and accept or reject each draw based on the ratio $\frac{p(x)}{M \cdot q(x)}$.
Algorithm
1. Choose a proposal distribution $q(x)$ and constant $M$ such that:
$$M \cdot q(x) \geq p(x) \quad \forall x$$
2. For each sample:
- Draw $x \sim q(x)$ from the proposal
- Draw $u \sim \text{Uniform}(0, 1)$
- Accept $x$ if $u \leq \frac{p(x)}{M \cdot q(x)}$, otherwise **reject**
3. Return all accepted samples
Key Properties
- Overall acceptance probability: $\frac{1}{M}$ (for a normalized target)
- Efficiency: Depends on how well $q(x)$ approximates $p(x)$
- Correctness: Accepted samples are distributed according to $p(x)$
The acceptance probability is:
$$\alpha = \int \frac{p(x)}{M \cdot q(x)} q(x) dx = \frac{1}{M} \int p(x) dx = \frac{1}{M}$$
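This identity is easy to check empirically. A minimal vectorized sketch, using a Beta(2, 2) target with a uniform proposal on $[0, 1]$, so the tightest envelope is $M = p(1/2) = 1.5$ (an illustrative setup, separate from the examples that follow):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Target Beta(2, 2) on [0, 1]; proposal Uniform(0, 1); tightest envelope:
# M = max p(x) = p(1/2) = 1.5, so the theoretical acceptance rate is 2/3
M = stats.beta.pdf(0.5, 2, 2)
n = 200_000
x = rng.uniform(0, 1, size=n)
u = rng.uniform(0, 1, size=n)
accepted = u <= stats.beta.pdf(x, 2, 2) / M

print(f"Empirical acceptance rate: {accepted.mean():.4f}")
print(f"Theoretical 1/M:           {1 / M:.4f}")
```

The empirical rate converges to $1/M$ as the number of proposals grows, exactly as the integral predicts.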
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.special import gamma as gamma_func
np.random.seed(42)
sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)
Why Sampling?
In many applications, we need to compute expectations of the form:
$$\mathbb{E}_{x \sim p}[f(x)] = \int f(x) p(x) dx$$
However, this integral may be:
- Intractable: No closed-form solution exists
- High-dimensional: Curse of dimensionality makes numerical integration infeasible
- Complex: $p(x)$ may only be known up to a normalizing constant
Monte Carlo approximation solves this by drawing samples $x_1, x_2, \ldots, x_N \sim p(x)$ and approximating:
$$\mathbb{E}_{x \sim p}[f(x)] \approx \frac{1}{N} \sum_{i=1}^N f(x_i)$$
By the Law of Large Numbers, this approximation converges to the true expectation as $N \to \infty$.
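As a quick illustration of this convergence, a plain Monte Carlo estimate of $\mathbb{E}[x^2]$ under a standard normal, whose true value is $\mathrm{Var}(x) = 1$ (a minimal sketch; sample sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Monte Carlo estimate of E[x^2] for x ~ N(0, 1); the true value is 1
estimates = {}
for n in (100, 10_000, 1_000_000):
    x = rng.standard_normal(n)
    estimates[n] = np.mean(x**2)
    print(f"N = {n:>9,}   estimate = {estimates[n]:.4f}")
```

The estimate tightens as $N$ grows, with error shrinking at the usual $O(1/\sqrt{N})$ Monte Carlo rate.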