
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

Interpolation with Gaussian Processes

We are solving the following problem. Given the values of a function $f : [0, 1] \rightarrow \mathbb{R}$ at fixed points $0 \leq x_1 < \dots < x_n \leq 1$, what can we say about $f(x)$ for some $x$ that is distinct from $x_1, \dots, x_n$?

#Consider the following smooth function: 
def smoothfun(x):
    ans = np.sin(15*x) + 15*np.exp(-(x ** 3)/2) + 0.5*((x - 0.5) ** 2) + 5 * np.log(x + 0.1) + 7
    return ans

#Here is another example of a function: 
#def smoothfun(x):
#    ans = (np.sin(15*x) + 0.6*np.sin(60*x) + 0.4*np.sin(120*x) + 15*np.exp(-(x**3)/2) + 0.5*((x-0.5)**2) + 5*np.log(x+0.1) + 7)
#    return ans

#We observe this function at the following points: 
n = 5
#xx = np.linspace(0, 1, n) #equally spaced points covering [0, 1]
xx = np.linspace(0.2, 0.8, n) #equally spaced points in the interior [0.2, 0.8]
#xx = np.sort(np.random.rand(n)) #unequally spaced (random) points

#these are the values of the function at the observed points, which we will treat as our data
truth = np.array([smoothfun(x) for x in xx])
y = truth
#Below we plot the scatter plot of the observed data
plt.scatter(xx, truth, label = 'Truth', color = 'red')
plt.legend()
plt.show()

Interpolation using Brownian Motion

We will use the Brownian motion prior: $f(x) = \beta_0 + \tau B(x)$, where $\beta_0 \sim N(0, C)$ and $B(x)$ is Brownian motion. We will compute the posterior mean of $f(x)$ using the formula $\mathbf{k}^T K^{-1} (f(x_1), \dots, f(x_n))^T$, where the kernel is given by $K(u, v) = C + \tau^2 \min(u, v)$, $K$ is the matrix whose $(i, j)$-th entry is $K(x_i, x_j)$, and $\mathbf{k}$ is the vector whose $i$-th element is $K(x, x_i)$. This calculation requires specifying $\tau$ and $C$ ($C$ can be a large constant). The value of $\tau$ does not affect the posterior mean (as can be easily checked from the code below). We still choose a reasonable value of $\tau$ using the observation that

\begin{align*} f(x_i) - f(x_{i-1}) = \tau \left( B(x_i) - B(x_{i-1}) \right) \sim N\left(0, \tau^2 (x_i - x_{i-1})\right). \end{align*}

This means that

\begin{align*} \mathbb{E} \left[ \frac{1}{n-1} \sum_{i=2}^n \frac{\left( f(x_i) - f(x_{i-1}) \right)^2}{x_i - x_{i-1}} \right] = \tau^2. \end{align*}

This gives a decent estimate of $\tau$, which we compute as follows.

#crude estimate of tau
dx = np.diff(xx)
dy = np.diff(y)

tau2_hat = np.mean((dy**2) / dx)
tau_hat = np.sqrt(tau2_hat)

print(tau_hat)
3.943306705404783
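The identity above can be sanity-checked by simulation. Here is a minimal sketch (the observation points and the value of $\tau$ below are arbitrary choices for illustration) that simulates many paths of $\tau B(x)$ at a few fixed points and averages the estimator across paths:

```python
import numpy as np

rng = np.random.default_rng(0)
tau_true = 2.0                               # hypothetical true value of tau
pts = np.array([0.1, 0.3, 0.5, 0.7, 0.9])    # observation points

# simulate tau * B(x) at pts: Brownian increments are independent N(0, tau^2 dx)
n_paths = 200_000
dx_sim = np.diff(pts, prepend=0.0)
steps = rng.normal(0.0, tau_true * np.sqrt(dx_sim), size=(n_paths, len(pts)))
paths = np.cumsum(steps, axis=1)

# the estimator above: average of squared increments divided by the spacings
est = np.mean(np.diff(paths, axis=1) ** 2 / np.diff(pts), axis=1)
print(np.mean(est))  # should be close to tau_true**2 = 4
```

Averaged over many paths, the estimator concentrates around $\tau^2$, which is exactly the expectation computed in the display above.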

With this value of $\tau$ and a large value of $C$, we compute the posterior mean of $f(x)$ as follows. We also compute the posterior variance (and the posterior standard deviation). Instead of computing these at a fixed value of $x$, we compute them over a grid of values of $x$.
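For reference, the posterior variance computed in the code below is the usual Gaussian process conditional variance:

\begin{align*} \mathrm{Var}\left(f(x) \mid f(x_1), \dots, f(x_n)\right) = K(x, x) - \mathbf{k}^T K^{-1} \mathbf{k}, \end{align*}

where $\mathbf{k}$ is, as before, the vector whose $i$-th element is $K(x, x_i)$.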

# hyperparameters
tau = tau_hat
C = 1000

# kernel
def kernel_BM(s,t):
    return C + tau**2 * min(s,t)

# covariance matrix
K = np.array([[kernel_BM(xi,xj) for xj in xx] for xi in xx])
Kinv = np.linalg.inv(K)

# grid for prediction
grid = np.linspace(0,1,200)

post_mean_bm = np.zeros(len(grid))
post_var_bm = np.zeros(len(grid))

for i,x in enumerate(grid):

    kstar = np.array([kernel_BM(x,xi) for xi in xx])
    kxx = kernel_BM(x,x)

    post_mean_bm[i] = kstar @ Kinv @ truth
    post_var_bm[i] = kxx - kstar @ Kinv @ kstar

post_sd_bm = np.sqrt(post_var_bm)
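Since the observations are noiseless, the posterior mean should reproduce the data exactly at the observed points, and the posterior variance should be zero there. A small self-contained check of this (using made-up points and values, not the smoothfun data above):

```python
import numpy as np

tau_c, C_c = 1.0, 1000.0

def k_bm(s, t):
    return C_c + tau_c**2 * min(s, t)

# made-up observation points and values (hypothetical, just for the check)
xs = np.array([0.2, 0.35, 0.5, 0.65, 0.8])
ys = np.array([1.0, -0.5, 2.0, 0.3, 1.5])

Ks = np.array([[k_bm(a, b) for b in xs] for a in xs])
Ks_inv = np.linalg.inv(Ks)

# at each observed point the mean must match the data, the variance must vanish
for xi, yi in zip(xs, ys):
    kstar = np.array([k_bm(xi, xj) for xj in xs])
    post_mean_i = kstar @ Ks_inv @ ys
    post_var_i = k_bm(xi, xi) - kstar @ Ks_inv @ kstar
    assert abs(post_mean_i - yi) < 1e-5 and abs(post_var_i) < 1e-5
print("posterior mean reproduces the data; posterior variance vanishes there")
```

This is a consequence of $\mathbf{k}$ at $x = x_i$ being the $i$-th row of $K$, so $\mathbf{k}^T K^{-1}$ reduces to the $i$-th standard basis vector.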

Below we plot the posterior means at the grid points (along with bands of twice the posterior standard deviation). The posterior standard deviations give an idea of the uncertainty in the mean estimates.

truth_grid = np.array([smoothfun(x) for x in grid])
plt.figure(figsize=(12,5))
plt.plot(grid, truth_grid, label="True function", color="red")
plt.scatter(xx,y,color="black",label="observations")

plt.plot(grid, post_mean_bm, label="posterior mean")

plt.fill_between(grid,
                 post_mean_bm-2*post_sd_bm,
                 post_mean_bm+2*post_sd_bm,
                 alpha=0.3,
                 label="95% band")

plt.legend()
plt.show()

Below we only plot the posterior mean.

#Only plotting the posterior mean: 
plt.figure(figsize=(12,5))
plt.plot(grid, truth_grid, label="True function", color="red")
plt.scatter(xx,y,color="black",label="observations")
plt.plot(grid, post_mean_bm, label="posterior mean")
plt.legend()
plt.show()

Clearly the posterior mean linearly interpolates between the data points. Go back and change $x_1, \dots, x_n$ to random points in $[0, 1]$; this makes it easy to see that the estimates beyond $[x_1, x_n]$ are constant.
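The piecewise-linear behavior can be confirmed directly by comparing the Brownian motion posterior mean against `np.interp` (NumPy's piecewise-linear interpolation) on a grid inside $[x_1, x_n]$. A self-contained sketch with made-up data:

```python
import numpy as np

tau_c, C_c = 1.0, 1e4  # large C_c mimics the diffuse intercept prior

def k_bm(s, t):
    return C_c + tau_c**2 * min(s, t)

# made-up data for illustration
xs = np.array([0.2, 0.4, 0.6, 0.8])
ys = np.array([1.0, 3.0, 2.0, 4.0])

Ks_inv = np.linalg.inv(np.array([[k_bm(a, b) for b in xs] for a in xs]))

grid_c = np.linspace(xs[0], xs[-1], 50)  # stay inside [x_1, x_n]
pm = np.array([
    np.array([k_bm(x, xi) for xi in xs]) @ Ks_inv @ ys for x in grid_c
])

# np.interp performs piecewise-linear interpolation between the nodes
print(np.max(np.abs(pm - np.interp(grid_c, xs, ys))))  # numerically zero
```

The agreement is exact up to floating point: between consecutive nodes, $\mathbf{k}$ is affine in $x$ (each entry is $C + \tau^2\min(x, x_j)$), and the posterior mean interpolates the endpoints, so it must be the linear interpolant there.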

Interpolation using Integrated Brownian Motion

Now we use the prior $f(x) = \beta_0 + \beta_1 x + \tau I(x)$, where $I(x)$ is integrated Brownian motion. The code for computing the posterior mean of $f(x)$ is the same as before; the only change is in the formula for $K(u, v)$.

# hyperparameters
tau = 100.0
C = 100000.0

# covariance of integrated Brownian motion
def kernel_IBM(s, t):
    m = min(s, t)
    M = max(s, t)
    return (m**2) * (3*M - m) / 6.0

# full kernel for f(x) = beta0 + beta1 x + tau I(x)
def kernel_f(s, t):
    return C + C*s*t + tau**2 * kernel_IBM(s, t)

# covariance matrix
K = np.array([[kernel_f(xi, xj) for xj in xx] for xi in xx])

# tiny jitter just for numerical stability
K = K + 1e-10 * np.eye(len(xx))

Kinv = np.linalg.inv(K)

# grid for prediction
grid = np.linspace(0, 1, 200)

post_mean_ibm = np.zeros(len(grid))
post_var_ibm = np.zeros(len(grid))

for i, x in enumerate(grid):
    kstar = np.array([kernel_f(x, xi) for xi in xx])
    kxx = kernel_f(x, x)

    post_mean_ibm[i] = kstar @ Kinv @ truth
    post_var_ibm[i] = kxx - kstar @ Kinv @ kstar

# guard against tiny negative values from floating point error
post_var_ibm = np.maximum(post_var_ibm, 0)
post_sd_ibm = np.sqrt(post_var_ibm)
truth_grid = np.array([smoothfun(x) for x in grid])
plt.figure(figsize=(12,5))
plt.plot(grid, truth_grid, label="True function", color="red")
plt.scatter(xx,y,color="black",label="observations")

plt.plot(grid, post_mean_ibm, label="posterior mean")

plt.fill_between(grid,
                 post_mean_ibm-2*post_sd_ibm,
                 post_mean_ibm+2*post_sd_ibm,
                 alpha=0.3,
                 label="95% band")

plt.legend()
plt.show()
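As a quick check that the closed form used above really is the covariance of integrated Brownian motion, $\mathrm{Cov}(I(s), I(t)) = \int_0^s \int_0^t \min(u, v)\, dv\, du$, we can compare it against a Riemann-sum approximation of the double integral (a rough numerical sketch):

```python
import numpy as np

# closed form from the code above
def ibm_closed(s, t):
    m, M = min(s, t), max(s, t)
    return (m**2) * (3*M - m) / 6.0

# Cov(I(s), I(t)) = int_0^s int_0^t min(u, v) dv du, via a midpoint Riemann sum
def ibm_numeric(s, t, n=2000):
    u = (np.arange(n) + 0.5) * s / n
    v = (np.arange(n) + 0.5) * t / n
    return np.minimum.outer(u, v).sum() * (s / n) * (t / n)

for s, t in [(0.3, 0.7), (0.5, 0.5), (0.9, 0.2)]:
    assert abs(ibm_closed(s, t) - ibm_numeric(s, t)) < 1e-4
print("closed form matches the double integral")
```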
#Comparing the posterior means from the two priors: 
plt.figure(figsize=(12,5))
plt.plot(grid, truth_grid, label="True function", color="red")
plt.scatter(xx,y,color="black",label="observations")
plt.plot(grid, post_mean_bm, label = 'Brownian Motion', color = 'black')
plt.plot(grid, post_mean_ibm, label = 'Integrated Brownian Motion', color = 'blue')
plt.legend()
plt.show()

The interpolant corresponding to this Integrated Brownian Motion prior is smooth between the data points (unlike the Brownian Motion case). In fact, it corresponds to the natural cubic spline interpolant. Beyond the observed data (i.e., before $x_1$ and after $x_n$), the interpolant turns out to be linear.
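The cubic spline claim can be checked numerically: with a large $C$ (approximating diffuse priors on $\beta_0, \beta_1$), the posterior mean should agree with a natural cubic spline interpolant inside $[x_1, x_n]$. A sketch of this comparison, assuming SciPy is available and using made-up data:

```python
import numpy as np
from scipy.interpolate import CubicSpline

tau_c, C_c = 10.0, 1e4  # large C_c approximates diffuse priors on beta0, beta1

def k_ibm(s, t):
    m, M = min(s, t), max(s, t)
    return (m**2) * (3*M - m) / 6.0

def k_f(s, t):
    return C_c + C_c*s*t + tau_c**2 * k_ibm(s, t)

# made-up data for illustration
xs = np.array([0.2, 0.35, 0.5, 0.65, 0.8])
ys = np.array([1.0, -0.5, 2.0, 0.3, 1.5])

Ks_inv = np.linalg.inv(np.array([[k_f(a, b) for b in xs] for a in xs]))

grid_c = np.linspace(xs[0], xs[-1], 100)
pm = np.array([
    np.array([k_f(x, xi) for xi in xs]) @ Ks_inv @ ys for x in grid_c
])

# natural spline: zero second derivative at the two endpoints
spline = CubicSpline(xs, ys, bc_type='natural')
print(np.max(np.abs(pm - spline(grid_c))))  # should be small
```

The residual discrepancy comes from the finite value of $C$ (and floating-point error); it shrinks as $C$ grows.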