Speech direction finding is an important speech preprocessing technique for human-computer interaction in applications such as audio and video conferencing, smart vehicles, and smart homes. Acquiring high-quality speech direction-finding information and achieving precise array direction finding in noisy acoustic environments have become the key problems to be solved first in voice communication applications, and the relevant theory and methods urgently need study. To this end, the project plans to carry out the following research on speech: 1) Speech acoustic features and their sparse representation. By analyzing speech features and spatio-temporal sparsity at valid time-frequency points, sparse decomposition matrices for the acoustic features of unvoiced and voiced sounds are constructed, and the sparse representations of different speech features are determined. 2) Configuration and pattern of distributed microphone arrays based on sparse acoustic features. Random array configurations are designed and analyzed, the key factors affecting array modeling and direction finding are identified, and an array pattern for sparse acoustic features, together with a precise mapping to speech source direction, is constructed. 3) Robust sparse DOA (direction of arrival) formation for distributed microphone arrays. Reliable time-frequency points, their spatio-temporal correspondence, and their signal-to-noise ratio features are determined, a sparse DOA estimate is constructed, and a maximum signal-to-noise-ratio mathematical model for sparse DOA direction finding is established. The project is expected to reveal the speech acoustic features and the mechanism of sparse array direction finding, to propose a robust direction finding method for distributed microphone arrays, and to realize sparse representation of speech acoustic features and precise direction finding with distributed microphone arrays.
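Speech direction finding is an important preprocessing technique for intelligent speech and a necessary foundation for conversational artificial intelligence devices, with broad application prospects in human-computer interaction, smart homes, online conferencing, and autonomous driving. High-quality speech acquisition and precise array direction finding are the key problems that speech direction finding urgently needs to solve. To this end, the project carried out research in three areas: speech acoustic features, distributed array configuration, and sparse direction finding. The main content includes speech acoustic feature extraction and sparse representation, the configuration mechanism of distributed arrays and a sparse beam reconstruction method, and a sparse direction finding strategy and measurement system. The main research results are as follows. First, a new speech sparsification method was proposed, namely cross-correlation block sparse Bayesian learning. The correlation and block-sparse acoustic features of speech signals were extracted; a speech block partitioning principle and quantitative expressions for block sparsity and for inter-block and intra-block correlation were constructed; sparse regularization was introduced to build a feature penalty matrix that fuses cross-correlation and block sparsity, along with a Bayesian sparsification model of speech; and the optimal sparse representation of the speech signal was obtained, achieving efficient speech sparsification. Second, new rules for distributed array configuration and a new total-variation sparse reconstruction mechanism were developed. A partitioned, multi-objective-constrained active calibration method based on speech acoustic features was proposed, which converts array calibration into a nonlinear mathematical problem with partitioned multi-objective constraints; the relationship between microphone diversity and configuration randomness was explored; and rules for rapid configuration of distributed arrays were formed. A total-variation sparsity constraint was introduced to move sparse reconstruction from the signal domain to the gradient domain, resolving the conflict between reconstruction accuracy and computation speed caused by sparsity that is too high or too low, and achieving optimal sparse reconstruction of the output beam of the distributed array. Finally, a new maximum signal-to-noise-ratio sparse direction finding strategy and a new distributed array direction finding system were built. Speech features and array signal-to-noise ratio were fused to construct a sparse DOA histogram and obtain the maximum signal-to-noise-ratio DOA estimate; the mapping between speech acoustic vectors and the direction space was derived; a mapping solution set between DOA estimates and the speech space was established; efficient matching between the speech space and the sparse DOA was achieved; and a 16-microphone distributed array sparse direction finding system was built. The results of this project not only propose new methods for high-quality speech acquisition, sparse signal processing, and sparse DOA estimation, but also break through key technologies in distributed array configuration and precise direction finding, providing a new alternative to traditional microphone arrays for intelligent speech, which is significant for further improving the intelligence of speech interaction devices.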
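The idea of fusing signal-to-noise ratio with per-point direction estimates in a DOA histogram can be illustrated with a minimal sketch. It assumes that a DOA estimate (in degrees) and an SNR value (in dB) are already available for each time-frequency point. The function name `snr_weighted_doa_histogram`, the 5 dB reliability threshold, the 5-degree bin width, and the linear-power weighting are all illustrative assumptions and are not taken from the project itself.

```python
import numpy as np


def snr_weighted_doa_histogram(doa_deg, snr_db, snr_floor_db=5.0, bin_width_deg=5.0):
    """Pick a dominant direction from per-point DOA estimates (hypothetical helper).

    doa_deg: DOA estimate in degrees for each time-frequency point.
    snr_db:  SNR in dB for the same points.
    Points below snr_floor_db are treated as unreliable and discarded.
    """
    doa_deg = np.asarray(doa_deg, dtype=float)
    snr_db = np.asarray(snr_db, dtype=float)
    if doa_deg.shape != snr_db.shape:
        raise ValueError("doa_deg and snr_db must have the same shape")

    # Keep only the time-frequency points considered reliable.
    reliable = snr_db >= snr_floor_db
    if not np.any(reliable):
        raise ValueError("no time-frequency point passes the SNR threshold")

    # Histogram over [0, 360] degrees, each point weighted by its linear SNR.
    edges = np.arange(0.0, 360.0 + bin_width_deg, bin_width_deg)
    weights = 10.0 ** (snr_db[reliable] / 10.0)
    hist, _ = np.histogram(np.mod(doa_deg[reliable], 360.0), bins=edges, weights=weights)

    # The bin holding the most SNR-weighted mass gives the direction estimate.
    peak = int(np.argmax(hist))
    centre_deg = 0.5 * (edges[peak] + edges[peak + 1])
    return centre_deg, hist, edges


# Synthetic example: four strong points near 42 degrees, weak clutter elsewhere.
doa = [42.0, 43.0, 44.0, 41.0, 200.0, 201.0, 310.0]
snr = [20.0, 18.0, 22.0, 19.0, 2.0, 1.0, 6.0]
estimate, hist, edges = snr_weighted_doa_histogram(doa, snr)
print(estimate)
```

Weighting by linear SNR rather than counting points lets a few clean time-frequency points outweigh many noisy ones. This is one plausible reading of a "maximum signal-to-noise-ratio" estimate, not a confirmed description of the project's model.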
Data updated: 2023-05-31