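with $p$ known and fixed (e.g., $p = 0.3$, as in the simulations) and $\mu_1, \mu_2$ unknown. The Gibbs sampler formulae generalize easily to more realistic cases (e.g., when $p$ is unknown or the variances differ from 1), as we will see later. The log-likelihood corresponding to (1) is:

$$\ell(\mu_1,\mu_2) = \sum_{i=1}^{n} \log\Big[\, p\,\phi(y_i-\mu_1) + (1-p)\,\phi(y_i-\mu_2) \Big],$$

where $\phi$ denotes the standard normal density.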
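As a small numerical illustration, the log-likelihood can be evaluated directly. The sketch below assumes that model (1) is the two-component mixture $p\,N(\mu_1,1) + (1-p)\,N(\mu_2,1)$ implied by the latent-variable form used in these notes. The function name `log_likelihood` and the default `p=0.3` are our own choices for illustration; the sum inside the logarithm is computed with the log-sum-exp trick to avoid underflow.

```python
import math

def log_likelihood(y, mu1, mu2, p=0.3):
    """Log-likelihood of p*N(mu1, 1) + (1-p)*N(mu2, 1) for data y (0 < p < 1)."""
    log_norm_const = -0.5 * math.log(2.0 * math.pi)
    total = 0.0
    for yi in y:
        # Log of each weighted component density, without the 1/sqrt(2*pi) factor.
        a = math.log(p) - 0.5 * (yi - mu1) ** 2
        b = math.log(1.0 - p) - 0.5 * (yi - mu2) ** 2
        # log(exp(a) + exp(b)) computed stably (log-sum-exp).
        m = max(a, b)
        total += m + math.log(math.exp(a - m) + math.exp(b - m)) + log_norm_const
    return total
```

For example, `log_likelihood(y, -2.0, 2.0)` can be compared across candidate values of $(\mu_1,\mu_2)$ for a fixed data list `y`; larger values indicate a better fit.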
This posterior cannot be evaluated in closed form, so numerical methods are needed. The Gibbs sampler is one computational tool that applies here. The standard way of using the Gibbs sampler in this problem proceeds via data augmentation, as explained below.
First observe that the model (1) can be rewritten in the following way:
$$z_i \overset{\text{i.i.d.}}{\sim} \text{Bernoulli}(p), \qquad y_i \mid z_i = 1 \sim N(\mu_1, 1), \qquad y_i \mid z_i = 0 \sim N(\mu_2, 1).$$
It should be clear that, under the above model, the marginal distribution of $y_i$ coincides with (1). The variables $z_1,\dots,z_n$ can be thought of as unobserved latent variables: $z_i$ indicates which of the two populations (with distributions $N(\mu_1,1)$ and $N(\mu_2,1)$ respectively) the observation $y_i$ comes from.
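The latent-variable description translates directly into a simulation recipe: first draw $z_i$, then draw $y_i$ from the component that $z_i$ selects. The following is a minimal sketch; the function name `simulate_mixture`, the seed, and the example values $\mu_1=-2$, $\mu_2=2$ are our own illustrative choices and are not taken from the notes.

```python
import random

def simulate_mixture(n, mu1, mu2, p=0.3, seed=0):
    """Draw (z_i, y_i), i = 1..n, from the augmented model."""
    rng = random.Random(seed)
    z, y = [], []
    for _ in range(n):
        # z_i ~ Bernoulli(p)
        zi = 1 if rng.random() < p else 0
        # y_i | z_i = 1 ~ N(mu1, 1),  y_i | z_i = 0 ~ N(mu2, 1)
        mean = mu1 if zi == 1 else mu2
        z.append(zi)
        y.append(rng.gauss(mean, 1.0))
    return z, y

z, y = simulate_mixture(10000, mu1=-2.0, mu2=2.0)
frac_ones = sum(z) / len(z)    # should be close to p
sample_mean = sum(y) / len(y)  # should be close to p*mu1 + (1-p)*mu2
```

Discarding `z` and keeping only `y` gives a sample from the marginal mixture, which is the sense in which the two formulations agree.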
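The Gibbs sampler is used to sample jointly from the posterior of $\mu_1,\mu_2,z_1,\dots,z_n$ given the data. This requires being able to sample from the full conditionals

$$\mu_1,\mu_2 \mid z, y \qquad \text{and} \qquad z \mid \mu_1,\mu_2,y,$$

where $y=(y_1,\dots,y_n)$ is the data and $z=(z_1,\dots,z_n)$. It is easy to see that these full conditionals can be written in closed form as follows. Given $\mu_1,\mu_2,y_1,\dots,y_n$, the variables $z_1,\dots,z_n$ are independent with

$$P(z_i = 1 \mid \mu_1,\mu_2,y) = \frac{p\,\phi(y_i-\mu_1)}{p\,\phi(y_i-\mu_1) + (1-p)\,\phi(y_i-\mu_2)},$$

where $\phi$ is the standard normal density.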
Based on these full conditional distributions, the Gibbs sampler can be easily implemented. We shall give the exact form of the algorithm in the next lecture. It also turns out that the EM algorithm for computing the MLE is similar to the Gibbs sampler. We shall also look at this in the next lecture.
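Ahead of the exact algorithm, here is a rough sketch of how the two conditional draws alternate. It rests on an assumption that is ours and not stated in these notes: independent $N(0,\tau^2)$ priors on $\mu_1$ and $\mu_2$ (with `prior_var` playing the role of $\tau^2$). Under that assumption, given $z$ and $y$, $\mu_1$ is normal with precision $n_1 + 1/\tau^2$ and mean $\sum_{i: z_i=1} y_i / (n_1 + 1/\tau^2)$, where $n_1 = \sum_i z_i$, and similarly for $\mu_2$ with the observations having $z_i = 0$. The function name `gibbs_mixture`, the starting values, and the default iteration counts are also illustrative choices.

```python
import math
import random

def gibbs_mixture(y, p=0.3, n_iter=2000, burn_in=500, prior_var=100.0, seed=0):
    """Gibbs sampler sketch for p*N(mu1, 1) + (1-p)*N(mu2, 1) with p known.

    Assumes (our choice, not from the notes) independent N(0, prior_var)
    priors on mu1 and mu2. Returns post-burn-in draws as (mu1, mu2) tuples.
    """
    rng = random.Random(seed)
    n = len(y)
    mu1, mu2 = min(y), max(y)  # crude starting values (arbitrary choice)
    log_p, log_q = math.log(p), math.log(1.0 - p)
    draws = []
    for t in range(n_iter):
        # Step 1: draw z_i ~ Bernoulli(w_i), where
        # w_i = p*phi(y_i-mu1) / (p*phi(y_i-mu1) + (1-p)*phi(y_i-mu2)).
        n1, sum1, sum2 = 0, 0.0, 0.0
        for yi in y:
            d = (log_q - 0.5 * (yi - mu2) ** 2) - (log_p - 0.5 * (yi - mu1) ** 2)
            # w_i = 1 / (1 + exp(d)), written to avoid overflow for large |d|.
            if d > 0:
                w = math.exp(-d) / (1.0 + math.exp(-d))
            else:
                w = 1.0 / (1.0 + math.exp(d))
            if rng.random() < w:
                n1 += 1
                sum1 += yi
            else:
                sum2 += yi
        # Step 2: draw mu1 and mu2 from their normal full conditionals.
        prec1 = n1 + 1.0 / prior_var
        prec2 = (n - n1) + 1.0 / prior_var
        mu1 = rng.gauss(sum1 / prec1, math.sqrt(1.0 / prec1))
        mu2 = rng.gauss(sum2 / prec2, math.sqrt(1.0 / prec2))
        if t >= burn_in:
            draws.append((mu1, mu2))
    return draws
```

Given a data list `y`, calling `draws = gibbs_mixture(y)` and averaging the first and second coordinates of `draws` gives Monte Carlo estimates of the posterior means of $\mu_1$ and $\mu_2$ under the assumed prior. The latent $z_i$ are redrawn in every sweep but only their count and the two group sums are kept, since that is all the $\mu$-step needs.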