As a core technology of human-computer interaction (HCI), early expression detection (EED) is of great significance for the development of artificial intelligence. Despite its importance, only a few EED methods exist, and current EED models suffer from critical issues: they cannot handle sequential data, and they do not consider the non-linear structure of the data distribution. Yet in the big data era data usually arrive sequentially, and expression data often lie in a non-linear feature space. To address these problems, this project devises two EED frameworks for sequential data with a non-linear distribution structure. Each framework has its own advantage (one is flexible, the other has a unified objective function), so users can choose whichever suits their needs. The project studies the key technologies of online learning for EED. The research contents are: 1) building a web-based facial expression database with a large volume of videos and varied backgrounds and settings; 2) designing an online multi-feature fusion model that efficiently and effectively finds a good representation of the expression from sequential input data, then, given this representation, proposing an online multi-instance learning method for EED to construct a flexible EED framework, with the kernel method incorporated to account for the non-linear structure of the data distribution; 3) developing an online learning approach for an RNN-LSTM (recurrent neural network with long short-term memory) based on the RankNet scheme, integrating feature extraction and detection into a unified EED framework. This project provides new solutions for EED research in the big data era and fundamental techniques for improving emotional communication in HCI applications.
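As a core technology of human-computer interaction, early facial expression detection is of great significance for the development of artificial intelligence. Because of its complexity, very few research results exist in China or abroad, and existing methods cannot be applied to data-stream scenarios and do not consider the non-linear distribution structure of the data. To address these problems, this project builds two early expression detection frameworks based on online learning, each with its own advantages. The specific research contents are: 1) building a large-scale expression video database from the Internet, to remedy the small number of videos and the uniform recording environments of current expression databases; 2) designing an online multi-source feature fusion algorithm that mines expression information efficiently and thoroughly in data-stream scenarios; on this basis, proposing an online multi-instance learning method to construct a flexible early expression detection framework, and incorporating the kernel method so that the model can capture the non-linear distribution structure of the data; 3) developing a recurrent neural network based on online learning that draws on the RankNet principle to resolve the inconsistency of optimization objectives caused by designing the feature extraction and early detection models separately, yielding a unified early expression detection framework. This research will provide new solutions for early expression detection in the big data era and better technical support for intelligent human-computer interaction.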
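This project studied early expression detection in data-stream scenarios. Centered on online learning mechanisms and combining sparse representation theory with multi-instance learning, it built models for constructing the monotonic early detection function in early expression detection, completed the planned tasks, and proposed a series of algorithms, as follows:
(1) Existing databases are generally small or recorded in a fixed, uniform environment, and so cannot meet the needs of applications with complex variation in expressions and subjects. To address this, a large-scale Internet-based facial expression video database was created.
(2) An online multi-view subspace learning algorithm based on arbitrary sparse structures was proposed. It obtains a fused feature representation of expression videos while reducing dimensionality, and updates the model parameters online according to online feedback.
(3) An unsupervised early expression detection model based on multi-instance learning and a semi-supervised model based on graph regularization were built; they capture the non-linear distribution structure of video data and support online model updates.
(4) Because feature extraction and early detection modeling are separate and independent in existing models, an early expression detection model based on online learning of recurrent neural networks was built, making the optimization objectives consistent.
The project offers a useful reference for solving early expression detection problems, enriches and develops the theoretical framework of early expression detection research, and provides better technical support for early expression detection tasks in various applications.
The project published 8 papers in total (6 as first author, 2 as corresponding author), of which 6 are SCI-indexed (4 in CAS Zone 1 TOP journals and 2 in CAS Zone 2 journals; the CAS Zone 1 TOP SCI journals include IEEE Transactions on Neural Networks and Learning Systems, IEEE Transactions on Cybernetics, and International Journal of Neural Systems). A total of 10 Chinese invention patent applications were filed and accepted (3 as first inventor), of which 3 have been granted (2 as first inventor). Three software copyrights were applied for and granted.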
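The abstracts describe a monotonic early detection function and a RankNet-based objective but give no formulas. As a rough illustration only, the sketch below assumes a linear scorer and trains it online with the standard RankNet pairwise loss, so that a longer prefix of an expression sequence scores at least as high as a shorter prefix. The feature vectors, the function names (`score`, `pair_loss`, `online_step`) and the learning rate are all hypothetical; this is not the project's actual model, which uses an RNN-LSTM.

```python
import math

def score(w, x):
    """Linear detector score for one partial-sequence feature vector."""
    return sum(wi * xi for wi, xi in zip(w, x))

def pair_loss(s_long, s_short):
    """RankNet pairwise loss: -log sigmoid(s_long - s_short)."""
    return math.log1p(math.exp(-(s_long - s_short)))

def online_step(w, x_long, x_short, lr=0.1):
    """One stochastic gradient step on a single (longer, shorter) prefix pair."""
    d = score(w, x_long) - score(w, x_short)
    g = -1.0 / (1.0 + math.exp(d))  # derivative of the loss with respect to d
    return [wi - lr * g * (a - b) for wi, a, b in zip(w, x_long, x_short)]

# Hypothetical prefix features: [expression intensity, fraction of sequence seen]
quarter, half, full = [0.2, 0.25], [0.6, 0.5], [1.0, 1.0]

# Pairs arrive one at a time, as in a data stream: (longer prefix, shorter prefix)
stream = [(half, quarter), (full, half)]

w = [0.0, 0.0]
for _ in range(50):
    for x_long, x_short in stream:
        w = online_step(w, x_long, x_short)

# Scores for the three prefixes, shortest first
print([round(score(w, x), 3) for x in (quarter, half, full)])
```

Each update touches only the current pair, which is what makes the scheme usable on streaming data. In a unified framework, the linear scorer would be replaced by the network output and the same pairwise loss back-propagated through the feature extractor.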
Data updated: 2023-05-31
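Nonlinear analysis and calculation method for the coefficient of earth pressure at rest of coarse-grained soil
Evaluation of theme parks based on public sentiment orientation: a case study of Volga Manor in Harbin
Quantitative ultrasonic imaging detection of cracks based on the full-mode total focusing method
Application of collaborative-representation-based graph embedding discriminant analysis to face recognition
A new inductive method for microblog rumor detection based on graph convolutional networks
Research on speech-driven facial expression animation based on statistical learning
Facial feature localization and expression analysis based on hypergraphs and deep learning
Vision-perception-based learning of facial expression motion control and line drawing generation
Research on expression-invariant 3D face recognition based on face reconstruction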