Monte Carlo integration - Wikipedia
Monte Carlo Integration is a technique for numerical integration using random numbers. It is a type of Monte Carlo methods that can be used to numerically compute a definite integral. The main differing point of such integration is that while standard methods use a regular interval to evaluate the integrand, Monte Carlo uses a random set of points to evaluate. This method is particularly useful for higher-dimensional integrals. There are various methods to perform such an integration: Uniform sampling, stratified sampling, importance sampling, sequential Monte Carlo, and mean field particle methods. We’ll be taking a look at the uniform sampling, and importance sampling methods in brief.
1. Overview
At the core of it, Monte Carlo is an approximation method used to approximate the value of the integral as compared to the deterministic approach of methods like the trapezoidal-rule. Each simulation of a monte carlo integral provides a different outcome, which can be averaged over multiple simulations.
Let I be a multidimensional definite integral defined as
\[ I=\int_{a}^{b}f(x)dx \]
and a random variable \(X_i ~ p(x)\) where \(p(x)\) must be nonzero for all \(x\) where \(f(x)\) is nonzero. Then, the Monte Carlo estimator is defined as
\[ F_{N} = \frac{1}{N}\sum_{i=1}^{N}\frac{f(X_i)}{p(X_i)} \]
The value of \(I\) can be estimated by taking an average of several such Monte Carlo estimator values.
2. Basic Monte Carlo Estimator
The basic monte carlo estimator is a special case of Importance Sampling Estimator case where we sample the points from a uniform random variable, to calculate the integral. Therefore, \(X_i ~ p(x) = c\). This follows that for interval \((a, b)\), the value of \(c = \frac{1}{b-a}\). Therefore, the Monte Carlo estimator then becomes
\begin{align*} F_{N} &= \frac{1}{N}\sum_{i=1}^{N}\frac{f(X_i)}{p(X_i)} \\ &= \frac{1}{N} \sum_{i=1}^{N}\frac{f(X_i)}{1/(b-a)}\\ &= \frac{b-a}{N}\sum_{i=1}^{N}f(X_i) \end{align*}We can also extend this to be N-Dimensional. For example, a 3D basic estimator for an integral
\begin{equation} \int_{x_0}^{x_1}\int_{y_0}^{y_1}\int_{z_0}^{z_1}f(x,y,z)dx dy dz \end{equation}would be defined as
\begin{equation} F_{N} = \frac{(x_1 - x_0)(y_1 - y_0)(z_1 - z_0)}{N}\sum_{i=1}^{N}f(X_i) \end{equation}Therefore, a general rule can be written as follows. For an n-dimensional integral
\begin{equation*} \int_{x_1}^{y_1}\int_{x_2}^{y_2}\dots \int_{x_n}^{y_n}f(x_1, x_2, \dots, x_n)dx_1 dx_2\dots dx_n \end{equation*}the MC Estimator is defined as
\begin{equation*} F_{N} = \frac{\prod_{i=1}^{n}(y_i - x_i)}{N} \sum_{i=1}^{N}f(x_1, x_2, \dots, x_n) \end{equation*}where \(N\) is the number of samples that are taken from the uniform distribution for evaluation.
2.1. Simulation
This method can be simulated fairly easily using python. Let us try and integrate the following function for \(0.8 < x < 3\)
\begin{equation*} f(x) = \frac{1}{1 + \sinh(2x)\log(x)^2} \end{equation*}To do this, we’ll first have to define a python function using numpy
import numpy as np def f(x): return 1/(1+np.sinh(2*x)*np.log(x)**2)
Now that we have a function that calculates the value of \(f(x)\) at a given set of points, let us start with the MC estimator. First we will define the limits of the integral \((a, b)\) and the number of estimators \(N\). We will also define the number of points that are sampled.
n_estimators = 100 N = 100 a, b = 0.8, 3
Now, we need to perform the calculation for each MC estimator and find the average. We can do this more efficiently by using numpy’s vector operations and random number generator.
rng = np.random.default_rng() r = rng.uniform(a, b, size=(n_estimators, N)) result = (1/n_estimators)*((b-a)/N)*(np.sum(f(r))) result
0.6786189790691812
We can check this result by comparing it with the function scipy.integrate.quad
from scipy import integrate integrate.quad(f, a, b)[0]
0.6768400757156462
As we can see, the two results are fairly similar. Do note that the result due to MC estimators is bound to change but it is still a fairly close estimate to the integration function from scipy.
3. Importance Sampling
The formula for a MC estimator that we saw above was for an importance sampling estimator. What it means is that, instead of choosing random points over an interval with uniform probability, we try to sample points based on its expected contribution to the integral. This means that instead of a uniform distribution, we use a distribution \(p(x)\) of our choice that we hope makes the calculation more efficient. The intuition behind this is that if a particular point \(x_i\) is picked up with a higher probability, then we weigh it down by a factor of its probability \(p(x_i)\).