Image classification and recognition underpin higher-level image understanding and interpretation, and they play a significant role in most computer vision systems. However, compared with the human visual system (HVS), computers remain much weaker at classifying and recognizing images and cannot yet meet the needs of many practical applications. Dictionary learning for sparse representation has biological support (e.g., the sparse coding mechanism of the HVS) and is well suited to designing more general and powerful image descriptors and more discriminative image classifiers. Inspired by the multiple-stage recognition pathway in the visual cortex, we will propose a novel framework of dictionary deep learning that perceives visual information at different levels through multiple-stage dictionary learning. In each stage of dictionary deep learning, we will design a latent dictionary learning method that adaptively and automatically builds the relationship between dictionary atoms and class labels, enhancing the representation and discrimination ability of the dictionaries. In the final stage of image classification, we will learn a fusion of multiple-measurement discriminative dictionaries to exploit the discriminative information embedded in the multiple measurements. Drawing on the visual cortex and deep learning, we aim to mimic multiple-stage human visual perception through hierarchical, deep learning of latent discriminative dictionaries. The resulting image classification and recognition model is intended to narrow the gap in recognition performance between computers and the HVS. The research is also expected to deepen the understanding of the HVS and has important research value and social significance.
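Image classification and recognition are the cornerstone of higher-level image understanding and analysis and play an important role in almost all computer vision systems. Compared with human vision, however, the image classification and recognition ability of computers still lags behind and cannot fully meet the needs of practical applications. Dictionary learning is consistent with the mechanism of human vision and also facilitates the design of more discriminative classifiers and of image descriptors with greater generality and discriminative power. In view of this, and following the hierarchical perception mechanism of the visual cortex model, this project will propose a dictionary deep learning classification framework capable of hierarchical perception of visual information. At each level of dictionary deep learning, we will learn latent discriminative dictionaries that automatically establish the association between dictionary atoms and classes, enhancing the representation and discrimination ability of the dictionaries. In the final classification stage, we will propose fusion learning of multiple-measurement discriminative dictionaries to make effective use of the multiple-measurement information in the data. By drawing on the visual cortex model and deep learning, we will attempt to simulate human visual perception through deep, multi-level learning of latent discriminative dictionaries, producing new image classifiers and recognizers with higher performance and narrowing the gap between computers and humans in pattern recognition. The project will greatly deepen the understanding of the brain's visual system and has important academic research value and social significance.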
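Image classification and recognition are the cornerstone of higher-level image understanding and analysis and play an important role in almost all computer vision systems. Compared with human vision, however, the image classification and recognition ability of computers still lags behind and cannot fully meet the needs of practical applications. Dictionary learning is consistent with the mechanism of human vision and also facilitates the design of more discriminative classifiers and of image descriptors with greater generality and discriminative power. The project team studied deep multi-level image classification algorithms inspired by the visual cortex model. In the direction of deep multi-level learning, the team studied a large-margin deep learning framework, locally adaptive deep learning, and a bottom-up multi-layer dictionary learning framework. In the direction of discriminative dictionary learning that mines the relationship between dictionary atoms and classes, the team studied a discriminative dictionary learning method with a preset mixture of class-shared and class-specific atoms, algorithms that adaptively estimate the atom-class relationship, and analysis-synthesis dictionary, semi-supervised dictionary, and robust dictionary learning models. In the direction of multiple-measurement dictionary fusion learning, the team studied a fusion algorithm for multi-feature joint dictionary representation learning, a model for multi-feature joint dictionary collaborative representation, and a model that uses the multiple-kernel idea to exploit the discriminative information in the data more effectively. These models yield image classifiers and recognizers with higher performance and narrow the gap between computers and humans in pattern recognition. The project has deepened the understanding of the brain's visual system and has important academic research value and social significance. The project team carried out in-depth research on the project content and published 31 academic papers: 9 papers in SCI journals ranked in Zone 2 by the Chinese Academy of Sciences, including Pattern Recognition, IEEE Trans. Multimedia, and Neurocomputing; 4 papers in other SCI journals; 1 book chapter; and 17 international and domestic conference papers, including AAAI 2017, AAAI 2016, ICML 2016, and ICME 2015. The team also applied for 4 related invention patents.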
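To make the idea of tying dictionary atoms to class labels more concrete, the following is a minimal single-stage sketch in Python with NumPy. It is not the project's method: it uses a fixed dictionary built from normalized training samples, a hand-set atom-to-class assignment (`atom_labels`), plain ISTA for the sparse coding step, and classification by per-class reconstruction residual. The toy data, the function names, and all parameter values (`lam`, `n_iter`, the subspace dimensions) are assumptions chosen for illustration only.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista_sparse_code(D, x, lam=0.01, n_iter=300):
    """Approximately minimize 0.5 * ||x - D a||^2 + lam * ||a||_1 with ISTA."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)
        a = soft_threshold(a - grad / L, lam / L)
    return a


def classify_by_residual(D, atom_labels, x, lam=0.01):
    """Code x over all atoms, then pick the class whose atoms reconstruct it best."""
    a = ista_sparse_code(D, x, lam)
    classes = np.unique(atom_labels)
    residuals = []
    for c in classes:
        mask = atom_labels == c
        residuals.append(np.linalg.norm(x - D[:, mask] @ a[mask]))
    return classes[int(np.argmin(residuals))]


# Toy data (an assumption for illustration): each class lies near its own
# random low-dimensional subspace of a 20-dimensional feature space.
rng = np.random.default_rng(0)
dim, sub_dim = 20, 3
bases = [rng.standard_normal((dim, sub_dim)) for _ in range(2)]


def sample(basis, n):
    """Draw n unit-norm samples near the subspace spanned by `basis`."""
    X = basis @ rng.standard_normal((sub_dim, n))
    X = X + 0.01 * rng.standard_normal((dim, n))
    return X / np.linalg.norm(X, axis=0, keepdims=True)


# Dictionary: 15 normalized training samples per class used directly as atoms.
D = np.hstack([sample(b, 15) for b in bases])
atom_labels = np.repeat([0, 1], 15)

X_test = np.hstack([sample(b, 10) for b in bases])
y_test = np.repeat([0, 1], 10)

y_pred = np.array(
    [classify_by_residual(D, atom_labels, X_test[:, i]) for i in range(X_test.shape[1])]
)
accuracy = float(np.mean(y_pred == y_test))
print("toy accuracy:", accuracy)
```

In this sketch the atom-class relationship is fixed in advance by `atom_labels`. The latent dictionary learning described above would instead estimate that relationship from data, and the multi-stage framework would stack several such coding stages.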
Data last updated: 2023-05-31
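Human action recognition in depth image sequences based on sparse representation and dictionary learning
Research on joint image classification and object localization based on deep-feature semantic-aware visual dictionary learning
Research on deep dictionary learning based on robust discriminative constraints and its application to face recognition
Research on fine-grained image classification and recognition methods based on deep learning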