This project studies noise and short utterances, the two bottlenecks that currently limit the application of speaker recognition systems, in the setting of text-independent speaker recognition. Inspired by the way forensic experts carry out voiceprint identification, it aims to innovate in both theory and method across four areas: features, classifiers, data alignment, and subspace modeling. The research covers: (1) novel noise-robust auditory-perception features, to improve the noise robustness of the features; (2) multi-resolution time-frequency features, to provide information at several time-frequency resolutions; (3) multi-feature fusion, so that different features complement one another; (4) a noise-masking kernel method, to improve the noise robustness of the models; (5) multi-scale kernels, to strengthen the models' multi-scale matching ability; (6) multiple kernel learning, to optimize the selection and combination of kernel functions; (7) multi-level data alignment, which uses speech recognition to explicitly align identical content at different levels; and (8) multi-level subspaces, which use tensor analysis to implicitly share similar content across levels. Together, these results are intended to improve the accuracy and robustness of speaker recognition under noisy and short-utterance conditions, and to address the core technologies and key problems that stand in the way of both theoretical research and practical application.
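Noise and short utterances are currently the two main bottlenecks preventing speaker recognition technology from reaching practical use, and this project targets both. At the feature level, it adopts noise-robust auditory-perception features and multi-resolution features. At the classifier level, it adopts noise-masking kernels and multi-scale kernels. For comparison, it uses speech recognition to explicitly align identical content at different levels, and uses subspace techniques to implicitly share similar content across levels, so that the model can adjust automatically to the amount of available data. These results improve the accuracy and robustness of speaker recognition systems under noisy and short-utterance conditions, and address core technologies and key problems in both theoretical research and practical application. Three patents from this project have been transferred into practical use, and some of the results have been deployed in an operational system at a domestic organization, meeting a major national need.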
Data last updated: 2023-05-31
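Research on New Methods for Speaker Recognition and Spoken Language Identification Based on Auditory Perception Models
Research on Speaker Recognition in Conversational Speech Based on Factor Analysis
Research on Uyghur Speaker Recognition Based on Telephone Speech
Research on the Mechanism by Which Speakers Counteract Noise and on Noise-Adaptive Intelligibility Enhancement for Narrowband Speech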