$\displaystyle \sum\limits_{j=1}^m n_j = \sum\limits_{j=1}^m\sum\limits_{i=1}^n \frac{\alpha_j^k\phi(y_i|\theta_j^k)} {\sum\limits_{l=1}^m \alpha_l^k\phi(y_i|\theta_l^k)}=\sum\limits_{i=1}^n \sum\limits_{j=1}^m \frac{\alpha_j^k\phi(y_i|\theta_j^k)} {\sum\limits_{l=1}^m \alpha_l^k\phi(y_i|\theta_l^k)} =\sum\limits_{i=1}^n 1 = n$
解得
$\displaystyle\alpha_j = \frac{n_j}{n} = \frac{1}{n}\sum\limits_{i=1}^n \frac{\alpha_j^k\phi(y_i|\theta_j^k)} {\sum\limits_{l=1}^m \alpha_l^k\phi(y_i|\theta_l^k)}$
计算σ与μ与$\alpha$不同,它的方程组是所有$\alpha_j$之间联立的;而$\sigma,\mu$的方程组则是$\sigma_j$与$\mu_j$之间联立的。定义
$\displaystyle p_{ji} = \frac{\alpha_j^k\phi(y_i|\theta_j^k)} {\sum\limits_{l=1}^m \alpha_l^k\phi(y_i|\theta_l^k)}$
则对于$\sigma_j,\mu_j$,优化式为(比较$(6),(7)$式的区别)
$\begin{gather}\displaystyle\min\limits_{\sigma_j,\mu_j}\sum\limits_{i=1}^n p_{ji} \left(\log \sigma_j+\frac{(y_i-\mu_j)^2}{2\sigma_j^2} \right)\label{}\end{gather}$
对上式求导等于0,解得
$ \begin{aligned} &\mu_j = \frac{\sum\limits_{i=1}^np_{ji}y_i}{\sum\limits_{i=1}^np_{ji}} = \frac{\sum\limits_{i=1}^np_{ji}y_i}{n_j} = \frac{\sum\limits_{i=1}^np_{ji}y_i}{n\alpha_j}\\ &\sigma^2_j = \frac{\sum\limits_{i=1}^np_{ji}(y_i-\mu_j)^2}{\sum\limits_{i=1}^np_{ji}} = \frac{\sum\limits_{i=1}^np_{ji}(y_i-\mu_j)^2}{n_j} = \frac{\sum\limits_{i=1}^np_{ji}(y_i-\mu_j)^2}{n\alpha_j} \end{aligned} $
代码实现对于概率密度为$P(x) = −2x+2,x\in (0,1)$的随机变量,以下代码实现GMM对这一概率密度的的拟合。共10000个抽样,GMM混合了100个高斯分布。
#%%定义参数、函数、抽样 import numpy as np import matplotlib.pyplot as plt dis_num = 100 #用于拟合的分布数量 sample_num = 10000 #用于拟合的分布数量 alphas = np.random.rand(dis_num) alphas /= np.sum(alphas) mus = np.random.rand(dis_num) sigmas = np.random.rand(dis_num)**2#方差,不是标准差 samples = 1-(1-np.random.rand(sample_num))**0.5 #样本 C_pi = (2*np.pi)**0.5 dis_val = np.zeros([sample_num,dis_num]) #每个样本在每个分布成员上都有值,形成一个sample_num*dis_num的矩阵 pij = np.zeros([sample_num,dis_num]) #pij矩阵 def calc_dis_val(sample,alpha,mu,sigma,c_pi): return alpha*np.exp(-(sample[:,np.newaxis]-mu)**2/(2*sigma))/(c_pi*sigma**0.5) def calc_pij(dis_v): return dis_v / dis_v.sum(axis = 1)[:,np.newaxis] #%%优化 for i in range(1000): print(i) dis_val = calc_dis_val(samples,alphas,mus,sigmas,C_pi) pij = calc_pij(dis_val) nj = pij.sum(axis = 0) alphas_before = alphas alphas = nj / sample_num mus = (pij*samples[:,np.newaxis]).sum(axis=0)/nj sigmas = (pij*(samples[:,np.newaxis] - mus)**2 ).sum(axis=0)/nj a = np.linalg.norm(alphas_before - alphas) print(a) if a< 0.001: break #%%绘图 plt.rcParams['font.sans-serif']=['SimHei'] #用来正常显示中文标签 plt.rcParams['axes.unicode_minus']=False #用来正常显示负号 def get_dis_val(x,alpha,sigma,mu,c_pi): y = np.zeros([len(x)]) for a,s,m in zip(alpha,sigma,mu): y += a*np.exp(-(x-m)**2/(2*s))/(c_pi*s**0.5) return y def paint(alpha,sigma,mu,c_pi,samples): x = np.linspace(-1,2,500) y = get_dis_val(x,alpha,sigma,mu,c_pi) fig = plt.figure() ax = fig.add_subplot(111) ax.hist(samples,density = True,label = '抽样分布') ax.plot(x,y,label = "拟合的概率密度") ax.legend(loc = 'best') plt.show() paint(alphas,sigmas,mus,C_pi,samples)