Because the IHT algorithm requires \({\left\| {\bf{A}} \right\|_2} < 1\), it is sensitive to the scaling of \({\left\| {\bf{A}} \right\|_2}\): when this condition is violated, the iteration oscillates severely or even diverges. Moreover, IHT fixes the step size at 1, so it cannot exploit an adaptive step size to speed up convergence, and tuning the step size by hand to force convergence is clearly impractical.
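To see this scaling sensitivity concretely, here is a minimal NumPy sketch (all names are my own, not from the reference code): unit-step IHT on an unnormalized Gaussian matrix, whose spectral norm is far above 1, blows up, while the same problem rescaled so that \(\left\| {\bf{A}} \right\|_2 < 1\) stays stable and reduces the residual.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x, zero out the rest."""
    idx = np.argsort(np.abs(x))[-k:]
    out = np.zeros_like(x)
    out[idx] = x[idx]
    return out

def iht(y, A, k, iters=200):
    """Plain IHT with unit step size: x <- H_k(x + A'(y - A x))."""
    x = np.zeros(A.shape[1])
    with np.errstate(over='ignore', invalid='ignore'):
        for _ in range(iters):
            x = hard_threshold(x + A.T @ (y - A @ x), k)
    return x

rng = np.random.default_rng(0)
m, n, k = 64, 256, 5
A = rng.standard_normal((m, n))              # ||A||_2 is roughly sqrt(m)+sqrt(n), far above 1
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)

x_bad = iht(A @ x_true, A, k)                # condition violated: the iterates explode
A_s = A / (1.01 * np.linalg.norm(A, 2))      # rescale so that ||A_s||_2 < 1
y_s = A_s @ x_true
x_ok = iht(y_s, A_s, k)                      # stable: the residual is non-increasing
```

With \(\left\| {\bf{A}} \right\|_2 < 1\), each IHT step is a majorization-minimization step on \(\|y - {\bf{A}}x\|_2^2\), which is why the scaled run cannot diverge.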
For these reasons, Thomas Blumensath and co-workers went on to propose the normalized iterative hard thresholding (NIHT) algorithm. Its advantage is that it normalizes the step size via a Lipschitz-continuity condition, which makes the algorithm insensitive to any rescaling of the matrix \(\bf{A}\). In addition, a one-dimensional line search yields an adaptive step-size formula that sharply reduces the number of iterations. A negligible increase in per-iteration cost thus buys a large improvement in convergence speed.
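The adaptive step size comes from an exact one-dimensional line search: with the gradient \(g\) restricted to the current support \(\Gamma\), the step \(\mu = \|g_\Gamma\|^2 / \|{\bf{A}}_\Gamma g_\Gamma\|^2\) minimizes the residual \(\|y - {\bf{A}}(x + \mu g)\|_2\) when only the entries on \(\Gamma\) are moved. A small NumPy check of this fact (sizes and names are my own):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, k = 32, 128, 4
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
supp = rng.choice(n, size=k, replace=False)
x_true[supp] = rng.standard_normal(k)
y = A @ x_true

x0 = np.zeros(n)
x0[supp] = x_true[supp] + 0.5         # some k-sparse iterate with support gamma
gamma = np.flatnonzero(x0)
g = A.T @ (y - A @ x0)                # (negative) gradient of 0.5*||y - A x||^2
mu = np.dot(g[gamma], g[gamma]) / np.linalg.norm(A[:, gamma] @ g[gamma]) ** 2

def resid(step):
    """Residual after moving by `step` along g, restricted to gamma."""
    z = x0.copy()
    z[gamma] += step * g[gamma]
    return np.linalg.norm(y - A @ z)

# mu is the exact line-search minimizer, so no other step does better
assert resid(mu) <= min(resid(0.5 * mu), resid(2.0 * mu))
```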
The detailed workings of NIHT are not spelled out here; interested readers can consult reference [2]. The pseudocode from the original paper is reproduced below:
The NIHT program is as follows:
function [x_p] = cs_niht(y,A,K,itermax,error,par_c)
% Email: zhangyanfeng527@gmail.com
% Reference: Blumensath T, Davies M E. Normalized Iterative Hard Thresholding:
% Guaranteed Stability and Performance[J].
% IEEE Journal of Selected Topics in Signal Processing, 2010, 4(2):298-309.
% y: measurement (observation) vector
% A: measurement (sensing) matrix
% K: sparsity level
% itermax: max number of iterations
% error: error threshold
% par_c: shrinkage factor, 0 < par_c < 1
% The advantage of NIHT is its adaptive step size, so it no longer
% requires norm(A) < 1 the way IHT does.
x0 = zeros(size(A,2),1) ;              % initialization with the size of the original signal
g0 = A' * y ;
u = (norm(g0) / norm(A * g0))^2 ;      % exact line-search step size
x1 = x0 + u * g0 ;
[~,pos1] = sort(abs(x1),'descend') ;
x1(pos1(K + 1 : end)) = 0 ;
gamma0 = find(x1) ;
for times = 1 : itermax
    g1 = A' * (y - A * x1) ;           % gradient
    u = (norm(g1(gamma0)) / norm(A(:,gamma0) * g1(gamma0)))^2 ;  % step size on current support
    x_p = x1 + u * g1 ;
    [~,pos1] = sort(abs(x_p),'descend') ;
    x_p(pos1(K + 1 : end)) = 0 ;
    gamma1 = find(x_p) ;               % support set
    W = (norm(x_p - x1))^2 / (norm(A * (x_p - x1)))^2 ;
    if ~isequal(gamma0,gamma1) && u > W    % support changed: normalize the step size
        while u > W
            u = par_c * u ;            % shrink the step until u <= W
        end
        x_p = x1 + u * g1 ;
        [~,pos1] = sort(abs(x_p),'descend') ;
        x_p(pos1(K + 1 : end)) = 0 ;
        gamma1 = find(x_p) ;
    end
    if norm(y - A * x_p) < error
        break ;
    else
        x1 = x_p ;                     % update
        gamma0 = gamma1 ;
    end
end

3. CGIHT algorithm
Although NIHT improves on IHT in several respects, both IHT and NIHT are, at bottom, gradient-based algorithms that use only gradient-domain information, and plain gradient descent has an inherently limited convergence speed. To accelerate IHT further, Jeffrey D. Blanchard and co-authors proposed replacing the gradient in IHT with the conjugate gradient, and proved that CGIHT converges faster than IHT. Concretely, the convergence rate of the gradient-based IHT and NIHT is governed by
\[\frac{K - 1}{K + 1}\]
whereas CGIHT, using the conjugate gradient, has a convergence rate governed by
\[\frac{\sqrt K - 1}{\sqrt K + 1}\]
Here \(K\) is the condition number of the matrix \({{{\bf{A}}^{\mathop{\rm T}\nolimits} }{\bf{A}}}\) (not to be confused with the sparsity level, also written \(K\) in the code above). The second ratio is smaller than the first for every \(K > 1\), so CGIHT converges faster.
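A quick numerical check of this comparison (the matrix size here is arbitrary): for any condition number greater than 1, the conjugate-gradient factor is strictly smaller than the gradient-descent factor.

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((64, 32))                      # tall matrix, so A'A is invertible
kappa = np.linalg.cond(A) ** 2                         # 2-norm condition number of A'A
grad_rate = (kappa - 1) / (kappa + 1)                  # contraction factor, gradient descent
cg_rate = (np.sqrt(kappa) - 1) / (np.sqrt(kappa) + 1)  # contraction factor, conjugate gradient
print(f"kappa={kappa:.1f}, gradient rate={grad_rate:.4f}, CG rate={cg_rate:.4f}")
```

Since the iteration count to reach a fixed accuracy scales like \(1/\log(1/\text{rate})\), a smaller factor translates directly into fewer iterations.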
The principle behind CGIHT is fairly simple and is not detailed here; the pseudocode from the original paper is shown below:
The CGIHT program is as follows:
function [x_p] = cs_cgiht(y,A,s,itermax,error)
% Email: zhangyanfeng527@gmail.com
% Reference: Blanchard J D, Tanner J, Wei K. CGIHT: conjugate gradient iterative
% hard thresholding for compressed sensing and matrix completion[J].
% Information & Inference: A Journal of the IMA, 2015(4).
% y: measurement (observation) vector
% A: measurement (sensing) matrix
% s: sparsity level
% itermax: max number of iterations
% error: error threshold
x_0 = .2 * ones(size(A,2),1) ;      % initialization with the size of the original signal
g0 = A' * (y - A * x_0) ;
[~,pos0] = sort(abs(g0),'descend') ;
g0(pos0(s + 1 : end)) = 0 ;
d0 = g0 ;                           % initial search direction
u = (norm(g0) / norm(A * d0))^2 ;
x_1 = x_0 + u * d0 ;
[~,pos1] = sort(abs(x_1),'descend') ;
x_1(pos1(s + 1 : end)) = 0 ;
gamma1 = find(x_1) ;
%% restarted CGIHT algorithm
% tao = gamma1 ;
% for times = 1 : itermax
%     g1 = A' * (y - A * x_1) ;
%     if isequal(gamma0,tao)
%         u = (g1(gamma1)' * g1(gamma1)) / (g1(gamma1)' * A(:,gamma1)' * A(:,gamma1) * g1(gamma1)) ;
%         x_p = x_1 + u * g1 ;
%     else
%         belta = (norm(g1) / norm(g0))^2 ;
%         d1 = g1 + belta * d0 ;
%         u = (d1(gamma1)' * g1(gamma1)) / (d1(gamma1)' * A(:,gamma1)' * A(:,gamma1) * d1(gamma1)) ;
%         x_p = x_1 + u * d1 ;
%     end
%     [~,index] = sort(abs(x_p),'descend') ;
%     x_p(index(s + 1 : end)) = 0 ;
%     if norm(x_p - x_1)/norm(x_p) < error || norm(y - A * x_p) < error
%         break ;
%     else
%         gamma0 = gamma1 ;
%         gamma1 = index ;
%         x_1 = x_p ;
%     end
% end
%% CGIHT algorithm
for times = 1 : itermax
    g1 = A' * (y - A * x_1) ;
    belta = (norm(g1) / norm(g0))^2 ;   % Fletcher-Reeves beta
    d1 = g1 + belta * d0 ;
    u = (d1(gamma1)' * g1(gamma1)) / (d1(gamma1)' * A(:,gamma1)' * A(:,gamma1) * d1(gamma1)) ;
    x_p = x_1 + u * d1 ;
    [~,index] = sort(abs(x_p),'descend') ;
    x_p(index(s + 1 : end)) = 0 ;
    if norm(x_p - x_1)/norm(x_p) < error || norm(y - A * x_p) < error
        break ;
    else
        x_1 = x_p ;
        g0 = g1 ;
        d0 = d1 ;
        gamma1 = find(x_p) ;            % update the support set
    end
end
end

4. HTP algorithm