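For each training mixture, x(n) is processed to obtain xNL(n), and this nonlinearly processed far-end signal is then convolved with an RIR chosen at random from the 6 RIRs to generate the echo signal d(n). The SER is set to 3.5 dB, and white noise is added to the mixture at an SNR of 10 dB.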
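The mixing procedure described above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the hard-clipping nonlinearity is only a placeholder for whatever processing maps x(n) to xNL(n), both SER and SNR are assumed to be measured relative to the power of the near-end signal s(n), and the function names (`power`, `scale_to_ratio`, `make_mixture`) and the synthetic signals are hypothetical.

```python
import numpy as np


def power(sig):
    """Mean power of a signal."""
    return float(np.mean(np.square(sig)))


def scale_to_ratio(ref, sig, ratio_db):
    """Scale `sig` so that 10*log10(power(ref) / power(sig)) equals ratio_db."""
    gain = np.sqrt(power(ref) / (power(sig) * 10.0 ** (ratio_db / 10.0)))
    return gain * sig


def make_mixture(s, x, rirs, rng, ser_db=3.5, snr_db=10.0, clip=0.8):
    """Build one microphone mixture y(n) = s(n) + d(n) + v(n).

    `s` (near-end) and `x` (far-end) must have the same length.
    """
    # Placeholder nonlinearity (assumption): hard clipping at a fraction
    # of the far-end peak. The source only says x(n) is "processed".
    x_max = clip * np.max(np.abs(x))
    x_nl = np.clip(x, -x_max, x_max)

    # Pick one RIR at random and convolve to obtain the echo d(n).
    h = rirs[rng.integers(len(rirs))]
    d = np.convolve(x_nl, h)[: len(s)]

    # Set echo and noise levels relative to the near-end signal (assumption).
    d = scale_to_ratio(s, d, ser_db)
    v = scale_to_ratio(s, rng.standard_normal(len(s)), snr_db)
    return s + d + v, d, v


# Synthetic stand-ins for speech and for the 6 RIRs (hypothetical data).
rng = np.random.default_rng(0)
n = 16000
s = rng.standard_normal(n)
x = rng.standard_normal(n)
decay = np.exp(-np.arange(512) / 64.0)
rirs = [rng.standard_normal(512) * decay for _ in range(6)]

y, d, v = make_mixture(s, x, rirs, rng)
print("SER (dB):", 10 * np.log10(power(s) / power(d)))
print("SNR (dB):", 10 * np.log10(power(s) / power(v)))
```

Scaling the echo and the noise after convolution, rather than scaling the far-end signal beforehand, keeps the target ratios exact regardless of the gain of the selected RIR.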
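Figure 3 shows an echo removal example using the BLSTM-based method. The output of the BLSTM-based method closely resembles the clean near-end signal, which indicates that the method preserves the near-end signal well while suppressing both the background noise and the nonlinearly distorted echo.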
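We compare the proposed BLSTM method with DNN-based residual echo suppression (RES) [11]; the results are shown in Table 3. In our implementation of AES + DNN, the parameters of the AES and the DNN are set to the values given in [11]. The SNR = $\infty$ case, which is the condition evaluated in [11], shows that DNN-based RES can handle the nonlinear component of the echo and improve the performance of AES. When background noise is present, adding DNN-based RES to AES yields only a minor improvement in PESQ. The BLSTM-based method alone outperforms AES + DNN, with an improvement of about 5.4 dB in ERLE and 0.5 in PESQ. If we follow the approach proposed in [11] and add AES as a preprocessor to the BLSTM system, i.e., AES + BLSTM, the performance can be improved further. In addition, Table 3 shows that the proposed BLSTM method generalizes to untrained speakers.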
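Table 3: Average ERLE and PESQ values under double-talk, background noise, and nonlinear distortion at 3.5 dB SER. SNR = $\infty$ denotes the case without background noise.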
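For readers unfamiliar with the ERLE metric reported in Table 3, the sketch below computes it under the commonly used definition ERLE = 10 log10(E[y^2(n)] / E[ŝ^2(n)]), where y(n) is the microphone signal and ŝ(n) is the system output. This section does not restate the definition, so treat the formula, the function name `erle_db`, and the small `eps` regularizer as assumptions. ERLE is usually evaluated over segments without near-end speech, so that the output power reflects residual echo only.

```python
import numpy as np


def erle_db(mic, out, eps=1e-12):
    """Echo return loss enhancement in dB (assumed standard definition).

    mic: microphone signal y(n), restricted to the segment being scored.
    out: system output s_hat(n) over the same segment.
    eps: small constant that avoids division by zero on silent output.
    """
    mic = np.asarray(mic, dtype=float)
    out = np.asarray(out, dtype=float)
    return 10.0 * np.log10((np.mean(mic ** 2) + eps) / (np.mean(out ** 2) + eps))


# Toy check: attenuating the amplitude by 10x lowers the power by 100x.
mic = np.ones(1000)
out = 0.1 * mic
print(erle_db(mic, out))
```

A higher ERLE means more echo attenuation. It says nothing about near-end speech quality, which is why it is reported alongside PESQ.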
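4 Conclusion
We have proposed a BLSTM-based supervised acoustic echo cancellation method to address double-talk, background noise, and nonlinear distortion. The proposed method demonstrates the ability to remove acoustic echo and to generalize to untrained speakers. Future work will apply the method to other AEC problems, such as multi-channel communication.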
6 References
[1] J. Benesty, T. Gänsler, D. R. Morgan, M. M. Sondhi, S. L. Gay et al., Advances in network and acoustic echo cancellation. Springer, 2001.
[2] J. Benesty, C. Paleologu, T. Gänsler, and S. Ciochină, A perspective on stereophonic acoustic echo cancellation. Springer Science & Business Media, 2011, vol. 4.
[3] G. Enzner, H. Buchner, A. Favrot, and F. Kuech, "Acoustic echo control," in Academic Press Library in Signal Processing. Elsevier, 2014, vol. 4, pp. 807-877.
[4] D. Duttweiler, "A twelve-channel digital echo canceler," IEEE Transactions on Communications, vol. 26, no. 5, pp. 647-653, 1978.
[5] M. Hamidia and A. Amrouche, "A new robust double-talk detector based on the Stockwell transform for acoustic echo cancellation," Digital Signal Processing, vol. 60, pp. 99-112, 2017.
[6] V. Turbin, A. Gilloire, and P. Scalart, "Comparison of three post-filtering algorithms for residual acoustic echo reduction," in Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on, vol. 1. IEEE, 1997, pp. 307-310.
[7] F. Ykhlef and H. Ykhlef, "A post-filter for acoustic echo cancellation in frequency domain," in Complex Systems (WCCS), 2014 Second World Conference on. IEEE, 2014, pp. 446-450.
[8] F. Kuech and W. Kellermann, "Nonlinear residual echo suppression using a power filter model of the acoustic echo path," in Acoustics, Speech and Signal Processing, 2007. ICASSP 2007. IEEE International Conference on, vol. 1. IEEE, 2007, pp. 73-76.
[9] A. Schwarz, C. Hofmann, and W. Kellermann, "Spectral feature-based nonlinear residual echo suppression," in Applications of Signal Processing to Audio and Acoustics (WASPAA), 2013 IEEE Workshop on. IEEE, 2013, pp. 1-4.
[10] J. Malek and Z. Koldovský, "Hammerstein model-based nonlinear echo cancellation using a cascade of neural network and adaptive linear filter," in Acoustic Signal Enhancement (IWAENC), 2016 IEEE International Workshop on. IEEE, 2016, pp. 1-5.
[11] C. M. Lee, J. W. Shin, and N. S. Kim, "DNN-based residual echo suppression," in Sixteenth Annual Conference of the International Speech Communication Association, 2015.
[12] F. Yang, M. Wu, and J. Yang, "Stereophonic acoustic echo suppression based on Wiener filter in the short-time Fourier transform domain," IEEE Signal Processing Letters, vol. 19, no. 4, pp. 227-230, 2012.
[13] J. M. Portillo, "Deep Learning applied to Acoustic Echo Cancellation," Master's thesis, Aalborg University, 2017.
[14] D. L. Wang and J. Chen, "Supervised speech separation based on deep learning: an overview," arXiv preprint arXiv:1708.07524, 2017.
[15] Y. Wang, A. Narayanan, and D. L. Wang, "On training targets for supervised speech separation," IEEE/ACM Transactions on Audio, Speech and Language Processing (TASLP), vol. 22, no. 12, pp. 1849-1858, 2014.
[16] S. Hochreiter and J. Schmidhuber, "Long short-term memory," Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[17] H. Erdogan, J. R. Hershey, S. Watanabe, and J. Le Roux, "Phase-sensitive and recognition-boosted speech separation using deep recurrent neural networks," in Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on. IEEE, 2015, pp. 708-712.