Research on the Recognition Mechanisms of Frequency-Compressed Speech Signals

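Basic Information
Grant number: 61571213
Project category: General Program
Funding amount: 57.00
Principal investigator: 陈霏
Subject classification:
Host institution: Southern University of Science and Technology (南方科技大学)
Year approved: 2015
Year concluded: 2019
Project period: 2016-01-01 - 2019-12-31
Project status: Concluded
Project participants: 程庆沙, Lena Wong, Yi Hu, 朱淑丰, 彭诚, 张丹
Keywords: speech enhancement for hearing aids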
Final Report Abstract

Hearing impairment is currently the most common sensory deficit in humans. Frequency-compression technology has recently attracted substantial research interest: it compresses the frequency band critical for speech recognition into the residual low-frequency range that hearing-impaired patients can still perceive, so as to support speech recognition. Compared with traditional hearing aids and cochlear implants, hearing devices based on frequency compression offer two advantages: they convey the high-frequency content of speech, and they are low-cost. However, a number of important questions about the recognition of frequency-compressed speech have not yet been studied. The outcomes of this project are of great significance for understanding the mechanisms underlying frequency-compressed speech perception and for designing next-generation, low-cost, frequency-compression-based assistive hearing devices optimized for Mandarin speech perception. They will benefit the hearing rehabilitation, speech communication, and quality of life of a large number of hearing-impaired patients.

This project aims to systematically study the factors that affect the understanding of frequency-compressed speech. More specifically, we will 1) investigate how acoustic features carried by the speech signal affect the recognition rate of frequency-compressed speech, examine how background noise and dynamic-range compression of the speech signal influence the understanding of frequency-compressed speech, and explore the contributions of speech-enhancement preprocessing and adaptive dynamic-range compression to the recognition rate; 2) establish an objective intelligibility index to predict the intelligibility of frequency-compressed speech; 3) study the language-specific (Mandarin vs. English) differences in the intelligibility of frequency-compressed speech; and 4) evaluate the effect of auditory training on the understanding of frequency-compressed speech.

This project aims to publish 6 articles in international peer-reviewed journals and 12 conference papers, apply for 1 national invention patent, and co-organize 2 international academic activities (e.g., a workshop and a special session at an international conference). In addition, the project will co-supervise 1 postdoctoral research fellow, 2 Ph.D. students, and 3 master's students.

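Hearing impairment ranks first among sensorineural functional disorders. Frequency-compression technology is a current focus of assistive hearing device development: it makes full use of the residual low-frequency hearing of hearing-impaired patients by compressing the speech signal into the low-frequency range to support speech recognition. Compared with traditional hearing aids and cochlear implants, hearing devices based on frequency compression have the advantages of effectively conveying the high-frequency components of speech and of low cost. However, the recognition mechanisms of frequency-compressed speech have not yet been studied in depth. The outcomes of this project are of great significance for fully understanding these mechanisms and for designing new low-cost assistive hearing devices optimized for Chinese speech recognition, and they will help improve the speech communication and quality of life of hearing-impaired patients. Specifically, the project will study 1) the influence of important acoustic parameters carried by the speech signal, and of other factors (noise and dynamic-range compression), on the recognition rate of frequency-compressed speech; 2) a model for objectively estimating the recognition rate; 3) language-specific (Chinese vs. English) differences in the recognition rate; and 4) the contribution of auditory training to improving the recognition rate. The project expects to publish 6 international journal articles and 12 conference papers, apply for 1 national invention patent, co-supervise 1 postdoctoral fellow, 2 Ph.D. students, and 3 master's students, and co-organize 2 international academic exchange activities.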

Project Abstract

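Hearing impairment ranks first among sensorineural functional disorders. Frequency-compression technology is a current focus of assistive hearing device development: it makes full use of the residual low-frequency hearing of hearing-impaired patients by compressing the speech signal into the low-frequency range to support speech recognition. Compared with traditional hearing aids and cochlear implants, hearing devices based on frequency compression have the advantages of effectively conveying the high-frequency components of speech and of low cost. However, the recognition mechanisms of frequency-compressed speech have not yet been studied in depth. The outcomes of this project are of great significance for fully understanding these mechanisms and for designing new low-cost assistive hearing devices optimized for Chinese speech recognition, and they will help improve the speech communication and quality of life of hearing-impaired patients.

This project studied 1) the influence of important acoustic parameters carried by the speech signal, and of other factors (noise and dynamic-range compression), on the recognition rate of frequency-compressed speech; 2) a model for objectively estimating the recognition rate; 3) language-specific (Chinese vs. English) differences in the recognition rate; and 4) the contribution of auditory training to improving the recognition rate.

The results show that the final (vowel) segments of Mandarin syllables play an important role in the recognition rate of frequency-compressed speech, and that preserving the final segments yields a relatively high recognition rate. The initial-final transition segments and the high-energy segments also have a considerable influence on the recognition rate. Current single-channel speech noise-reduction algorithms do not improve speech recognition under low-frequency residual hearing. Speech recognition under low-frequency residual hearing derives mainly from the contribution of the finals, and it can be improved by compressing the transition band of the first and second formants into the low-frequency range. For models of Mandarin frequency-compressed speech recognition, the fundamental-frequency (F0) contour is an important acoustic cue and is no longer redundant acoustic information. Among the first three formants of Chinese speech, the second formant carries more speech-intelligibility information. Important features related to the recognition rate can be extracted from auditory evoked EEG signals. Auditory training helps listeners become familiar with the sound quality of the processed speech and improves the recognition rate.

The project published 20 international journal articles and 18 international conference papers, applied for 2 national invention patents, trained 2 postdoctoral fellows, 2 Ph.D. students, and 3 master's students, and co-organized 2 international academic exchange activities.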

Project Outputs
No outputs of this type are listed.

Data last updated: 2023-05-31

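Other Grants Held by 陈霏

Grant number: 61501323
Year approved: 2015
Funding amount: 23.00
Project category: Young Scientists Fund

Grant number: 31401240
Year approved: 2014
Funding amount: 25.00
Project category: Young Scientists Fund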

Similar NSFC Grants

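1

Research on Robust Speech Emotion Recognition Based on Compressed Sensing

Grant number: 61203257
Year approved: 2012
Principal investigator: 张石清
Subject classification: F0605
Funding amount: 24.00
Project category: Young Scientists Fund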
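2

Noise-Robust Speech Recognition Based on Frequency-Warped Wavelets and DZCPA Features

Grant number: 60472094
Year approved: 2004
Principal investigator: 张雪英
Subject classification: F0111
Funding amount: 18.00
Project category: General Program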
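3

Research on Speech Signal Modeling and Coding Techniques Based on Compressed Sensing

Grant number: 61072125
Year approved: 2010
Principal investigator: 陈砚圃
Subject classification: F0111
Funding amount: 30.00
Project category: General Program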
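4

Human Auditory Mechanisms and Their Application in Speech Compression Coding

Grant number: 68972018
Year approved: 1989
Principal investigator: 钱亚生
Subject classification: F0101
Funding amount: 4.00
Project category: General Program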