This project addresses the segmentation, clustering, and recognition of sequential patterns under “weak labels”, that is, sparse, unordered, or multi-modal annotations. Hidden Markov models (HMMs) and their variants have been highly successful in supervised recognition of sequential patterns with full sequential transcriptions; building on that, we study new algorithms for semi-supervised learning of HMMs using nonnegative matrix factorization (NMF). First, dimension reduction in NMF and hidden-state estimation in HMMs are integrated by tying (sharing) their parameters. NMF then provides a “top-down” global view by decomposing the whole data set into recurring parts, while the HMM provides a “bottom-up” local view of each sequence by modeling its sequential structure and generative process. Combining the two should guide learning toward a good fit of the underlying data structure, especially when supervision is scarce. Second, weak labels, sequential data, NMF, and HMMs will be optimized in a joint framework, so that the vector-space representations of weak labels interact effectively with the dynamic graphical models of the sequential data. Finally, the proposed models and algorithms will be applied to new areas such as computational modeling of language acquisition and the recognition and understanding of scenes and activities in video.
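The abstract gives no algorithmic detail, so the following is only a minimal sketch of the NMF building block it refers to: the standard multiplicative-update rule for squared-error NMF, which factors a nonnegative data matrix V into recurring parts W and their activations H. It is not the project's joint NMF-HMM model; the function names (`nmf`, `frobenius_error`, and the helpers) and the toy data are assumptions introduced purely for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius distance between V and the product W H."""
    wh = matmul(w, h)
    return sum((x - y) ** 2 for rv, rw in zip(v, wh) for x, y in zip(rv, rw))


def nmf(v, rank, n_iter=200, seed=0, eps=1e-9):
    """Generic multiplicative-update NMF (illustrative, not the project's method).

    v is an n x m nonnegative matrix; returns W (n x rank) and H (rank x m).
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive initialisation keeps every later update nonnegative.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(n)]
    return w, h


# Toy data (hypothetical): 4 features x 6 observations built from two parts.
W_TRUE = [[1, 0], [2, 0], [0, 1], [0, 3]]
H_TRUE = [[1, 0, 2, 1, 0, 3], [0, 2, 1, 0, 1, 1]]
V = matmul(W_TRUE, H_TRUE)

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In a tied NMF-HMM setting of the kind the abstract describes, one would expect the columns of W to play a role similar to state-dependent emission parameters; that mapping is an interpretation and is not spelled out in the source.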
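To address the modeling of sequential data under weak labels, this project studied new semi-supervised learning methods for nonnegative matrix factorization (NMF), hidden Markov models (HMM), and deep neural networks (DNN). Combining the dimension reduction of NMF with HMM state estimation improved unsupervised and semi-supervised learning of HMMs. Combining sparse low-rank NMF with DNNs enabled effective interaction between weakly labeled or unsupervised vector representations and the dynamic graphical models of sequential data. Deep recurrent and deep convolutional neural networks were improved for small-sample conditions, so that they cope effectively with machine learning scenarios that lack labels. The models and algorithms were applied to computational modeling of language acquisition, speech enhancement under unknown noise, quality improvement of bone-conducted speech, voice conversion, and camouflaged object detection, where they outperformed previous methods under semi-supervised or unsupervised learning.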
Data last updated: 2023-05-31