Generic object recognition is one of the core tasks of visual scene understanding. In recent years, thanks to the development of deep learning and the abundance of Internet-scale data, significant progress has been made. Current mainstream methods even surpass the human visual system under some closed-world settings.
To tackle large-scale object recognition in the open world, this project takes weakly supervised learning and transfer learning as its theoretical foundation. It will explore the mechanisms for representing the complex correlations among massive numbers of object categories, study efficient and scalable classification theory for large-scale recognition, and establish a framework for visual knowledge mining and transfer that generalizes across scenes.
Specifically, this project will address the following research issues. First, to characterize complex cross-category correlations, we will study a hierarchical class representation model that conforms to the human perception mechanism, and propose a progressive recognition method with semantic attribute embedding. Second, we will analyze the statistical properties of the long-tail distribution of data, study a cross-category zero-shot learning method based on a shared attribute dictionary, and build an open-world recognition framework with category scalability under realistic settings. Third, we will study visual knowledge mining methods that exploit the interaction between objects and scenes, construct a visual object concept database with self-update and dynamic evolution mechanisms, and finally achieve large-scale object recognition guided by cross-scene knowledge transfer.
This project is expected to achieve both theoretical innovation and technological breakthroughs, and to promote the practical application of visual object recognition.
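Generic object recognition is one of the core tasks of visual scene understanding. In recent years it has advanced markedly thanks to the development of deep learning and the abundance of Internet-scale data, and current mainstream methods even exceed the recognition ability of the human visual system on some closed-scene datasets.
This project targets large-scale object recognition in open scenes. Building on weakly supervised learning and transfer learning theory, it explores the mechanisms for representing the complex correlations among massive numbers of object categories, studies efficient and scalable classification theory for large-scale recognition, and establishes a framework of visual knowledge mining and transfer methods that generalizes across scenes.
The research content is as follows. For complex inter-class correlations, study a hierarchical category representation model that conforms to the human perception mechanism and propose a progressive recognition method with semantic attribute embedding. Analyze the statistical properties of the long-tail distribution of data, study a cross-category zero-shot learning method that shares attribute primitives, and build an open recognition framework with category scalability in realistic scenes. Study visual knowledge mining methods based on scene-object interaction, construct a visual object concept base with autonomous update and dynamic evolution mechanisms, and achieve large-scale object recognition guided by cross-scene knowledge transfer.
The project is expected to deliver theoretical innovation and technological breakthroughs and to promote the practical adoption of visual object recognition.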
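The abstract names cross-category zero-shot learning over shared attributes as a research direction but does not specify a method. As a rough illustration of the general idea only, the Python sketch below learns a linear map from visual features to attribute vectors on seen classes, then labels samples of unseen classes by the closest attribute signature. Everything in it is an assumption made for illustration: the ridge-regression map, the cosine matching rule, the synthetic data, and the names `fit_attribute_map`, `predict_classes`, and `demo` are hypothetical and are not taken from the project.

```python
import numpy as np


def fit_attribute_map(features, attributes, lam=0.1):
    """Ridge regression: find W so that features @ W approximates attributes."""
    dim = features.shape[1]
    gram = features.T @ features + lam * np.eye(dim)
    return np.linalg.solve(gram, features.T @ attributes)


def predict_classes(features, weight, signatures):
    """Pick, per sample, the class signature most cosine-similar to the predicted attributes."""
    predicted = features @ weight
    predicted = predicted / (np.linalg.norm(predicted, axis=1, keepdims=True) + 1e-12)
    unit = signatures / (np.linalg.norm(signatures, axis=1, keepdims=True) + 1e-12)
    return np.argmax(predicted @ unit.T, axis=1)


def demo(seed=0, per_class=50, noise=0.05):
    """Train on four seen classes, test on two unseen ones (synthetic data)."""
    rng = np.random.default_rng(seed)
    # Hypothetical binary attribute signatures, one row per class.
    seen = np.array([[1, 1, 0, 0], [0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 0]], dtype=float)
    unseen = np.array([[1, 0, 1, 0], [0, 1, 0, 1]], dtype=float)
    # Assumed generative model: features are a noisy linear mix of attributes.
    mixing = rng.normal(size=(4, 8))

    def sample(signatures):
        labels = np.repeat(np.arange(len(signatures)), per_class)
        clean = signatures[labels] @ mixing
        return clean + noise * rng.normal(size=clean.shape), labels

    x_train, y_train = sample(seen)
    x_test, y_test = sample(unseen)
    weight = fit_attribute_map(x_train, seen[y_train])
    predictions = predict_classes(x_test, weight, unseen)
    return float(np.mean(predictions == y_test))


if __name__ == "__main__":
    print(f"accuracy on unseen classes: {demo():.2f}")
```

The unseen classes contribute no training images; only their attribute signatures are used at test time, which is what lets the category set grow without retraining the feature-to-attribute map.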
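This project carried out in-depth research on the key problem of large-scale object recognition in open scenes, and made important progress in modeling hierarchical category correlations, building a scalable incremental recognition framework, and mining knowledge from scene-object interaction. The main work is as follows:
(1) Addressing the intrinsic formation mechanism of hierarchical category correlations, and using attributes as the link between different object categories, we proposed attribute-knowledge-guided visual feature learning and hierarchical classification methods, explicitly decoupled the structured classification rules among visual categories, and designed a multi-functional deep hashing framework that jointly embeds high-level categories and mid-level attributes, significantly improving the learning efficiency and accuracy of binary-coded features;
(2) Addressing an open recognition framework with incremental category expansion, and using knowledge as a complement to data to guide model learning, we proposed a series of scalable, transferable, incremental learning methods for the incremental recognition of unknown classes, established a mapping between the visual data space and the semantic knowledge space, and achieved the transfer of visual classification knowledge from the known domain to the unknown domain;
(3) Addressing visual knowledge mining from scene-object interaction, and taking the construction of structured scene graphs as the cornerstone to jointly characterize the multi-dimensional visual concept elements in an image (entities, attributes, relations, etc.), we proposed foundational methods such as scene object detection with contextual relation reasoning and hierarchical visual scene graph generation inspired by human perception mechanisms, which strongly supported the construction of a unified progressive scene understanding framework of "object --> scene --> language".
Building on this work, 28 papers were published or accepted in mainstream international journals and conferences during the project period (including 12 CCF-A papers). Of these, 9 are journal papers (including 1 in IEEE Trans. on PAMI, 2 in IJCV, and 1 in IEEE Trans. on Image Processing) and 19 are conference papers (including 4 at IEEE CVPR, 4 at ICCV, and 2 at ECCV), and the work received 1 Best Paper Award at the IEEE CVPR2021 CLVision Workshop. The published papers have been cited 606 times on Google Scholar, with the most-cited paper at 172 citations. The results establish, fairly systematically, a series of theories and methods for large-scale object recognition in open scenes, and have attracted fairly broad attention and follow-up work from international peers. One national invention patent was applied for and granted, and one software copyright was registered. With the above work as the algorithmic core, the team won second place in the IEEE ICCV2019 WIDER video person search challenge and first place in the CVPR2020 CLVision incremental object recognition challenge. The results effectively supported technology development in the research group's related industrialization projects and promoted the wide application of large-scale object recognition and scene understanding.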
Data last updated: 2023-05-31
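Research on Efficient and Robust Large-Scale Object Recognition Methods Based on Sparse Representation
Research on Object Recognition Methods for Indoor Point Cloud Scenes Based on Visual Saliency
Research on Deep-Learning-Based Person Re-identification with Spatio-Temporal Information Fusion in Open Scenes
Research on Object Recognition Methods for Point Cloud Scenes Based on Primitive Shapes and Their Topological Structure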