A multi-label image contains multiple visual semantic objects that usually differ in scale, position, pose, and category, and recognizing such images is an important research direction. Deep learning algorithms lead the field, but they are currently best suited to single-label image recognition. Recent work applies them to multi-label images by chaining a traditional object-region extraction method in front of the deep learning stage. That approach does not exploit the feature maps the network itself produces, so recognition is inefficient and the results are not good enough. This project therefore studies multi-label image recognition based on deep image feature maps. The goal is an integrated multi-label image classification model that performs multi-label feature-map learning, analysis, and classification within one unified framework, with no separate object pre-extraction step. First, visual saliency analysis is performed on the multi-scale feature maps inherent to the deep learning process, yielding multiple salient regions. Then, a comprehensive semantic reconstruction is built from the local feature maps of each region combined with spatial pyramid coding. Finally, deep learning classification is carried out. In addition, multiple loss functions constrain and optimize both the feature-map analysis and the classification stage, regularizing the training of the deep neural network. The results are of theoretical significance and have broad application prospects in computer vision and related fields.
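The saliency and spatial pyramid steps described above can be illustrated with a minimal sketch. It is not the project's implementation: the abstract does not specify how salient regions are chosen or how the pyramid is pooled, so the threshold rule, the max pooling, and the names `salient_box` and `spatial_pyramid_max_pool` are all assumptions made for illustration. The feature map is assumed to be a single channel of non-negative activations.

```python
from typing import List, Sequence, Tuple

FeatureMap = List[List[float]]   # one channel, rows x cols, non-negative values
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)


def salient_box(fmap: FeatureMap, ratio: float = 0.5) -> Box:
    """Bounding box of all cells whose activation is >= ratio * max activation.

    Assumption: a simple threshold stands in for the saliency analysis.
    """
    peak = max(max(row) for row in fmap)
    rows, cols = [], []
    for r, row in enumerate(fmap):
        for c, value in enumerate(row):
            if value >= ratio * peak:
                rows.append(r)
                cols.append(c)
    return min(rows), min(cols), max(rows) + 1, max(cols) + 1


def spatial_pyramid_max_pool(
    fmap: FeatureMap, box: Box, levels: Sequence[int] = (1, 2)
) -> List[float]:
    """Max-pool the region `box` over an n x n grid for each n in `levels`.

    The output length is sum(n * n for n in levels), independent of box size.
    """
    top, left, bottom, right = box
    height, width = bottom - top, right - left
    pooled: List[float] = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end keeps every bin non-empty.
            r0 = top + (i * height) // n
            r1 = top + ceil_div((i + 1) * height, n)
            for j in range(n):
                c0 = left + (j * width) // n
                c1 = left + ceil_div((j + 1) * width, n)
                pooled.append(
                    max(fmap[r][c] for r in range(r0, r1) for c in range(c0, c1))
                )
    return pooled


example_map = [
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 3.0, 4.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
region = salient_box(example_map)                      # (1, 1, 3, 3)
code = spatial_pyramid_max_pool(example_map, region)   # [4.0, 1.0, 2.0, 3.0, 4.0]
```

Because the pooled vector has a fixed length regardless of the region's size, regions of different scales can feed the same classifier, which is the property that makes pyramid coding a natural fit after region selection.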
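Building on a deep learning framework and its strong single-label image classification performance, this project proposes a new multi-label image recognition model. The project first performs saliency analysis on multi-scale deep image feature maps to obtain salient regions for multiple objects, thereby applying saliency detection to the multi-label image classification task. Whereas general object detection methods generate a large number of candidate regions, multi-scale saliency detection on feature maps largely avoids dense object detection over the whole image and so improves runtime efficiency. Once the salient regions are obtained, the project applies spatial pyramid coding for further comprehensive semantic reconstruction, giving a fuller and more integrated representation of the image content. We then propose minimizing multiple loss functions to jointly optimize the shared part of the deep convolutional neural network. Multi-task deep learning of this kind has two benefits. First, it optimizes the network parameters efficiently while preventing the overfitting that arises from fitting a single image attribute. Second, it yields multiple image attributes from a single network pipeline, so multi-label image recognition can draw on several aspects at once.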
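The idea of minimizing several loss functions over a shared network can be sketched as a weighted sum of per-task losses. The summary does not say which losses or weights the project uses, so the choices below (a multi-label binary cross-entropy on logits, a mean squared error for a hypothetical auxiliary attribute, and the weights `1.0` and `0.5`) are assumptions for illustration only.

```python
import math
from typing import Sequence


def multilabel_bce_with_logits(
    logits: Sequence[float], targets: Sequence[float]
) -> float:
    """Mean binary cross-entropy over labels, one independent sigmoid per label.

    Uses the numerically stable form max(z, 0) - z*y + log(1 + exp(-|z|)).
    """
    total = 0.0
    for z, y in zip(logits, targets):
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


def mean_squared_error(pred: Sequence[float], target: Sequence[float]) -> float:
    """Mean squared error, standing in for a hypothetical auxiliary task loss."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def joint_loss(losses: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of task losses; its gradient would update the shared layers."""
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    return sum(w * l for w, l in zip(weights, losses))


# Two labels predicted with zero logits; the image truly carries only the first.
classification = multilabel_bce_with_logits([0.0, 0.0], [1.0, 0.0])
# Hypothetical auxiliary output compared with its target.
auxiliary = mean_squared_error([0.5, 1.0], [0.0, 1.0])
total = joint_loss([classification, auxiliary], weights=[1.0, 0.5])
```

In a real training loop the total would be backpropagated through the shared convolutional layers, so each task's gradient acts as a regularizer on the others. The relative weights are a tuning choice that this sketch does not address.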
Data last updated: 2023-05-31
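Genome-wide association analysis of leaf orientation value in maize
CFRP strengthening of rib-to-deck fatigue cracks in orthotropic steel bridge decks
Hardware Trojans: research progress on key issues and new trends
SSVEP-based direct brain control of robot direction and speed
Calculation method for the shear capacity of steel plate-concrete composite coupling beams with small span-to-depth ratios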
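Research on image recognition based on a hierarchical deep network mixture model
Research on multi-view recognition of sensitive images of online violence
Research on a new deep hashing method for images based on a multi-label semantic ontology
Research on recaptured image detection based on deep feature learning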