With the rapid development of deep learning, methods based on convolutional neural networks (CNNs) keep setting new accuracy records for depth map prediction from monocular images and have become the mainstream approach in this field. However, CNN-based approaches generally suffer from three main problems: (1) structural information in depth maps is neglected, so the generated depth maps often look unrealistic; (2) the limited size of depth-annotated training sets leads to poor generalization of the prediction model; (3) estimated depth maps are hard to refine through user interaction. In this project, we adopt the mechanism of Generative Adversarial Networks (GANs) as a potential solution to these problems, forming a more complete framework for learning-based depth estimation. First, we cascade a discriminator network with the depth estimation network and train it with ground-truth and estimated depth maps as positive and negative samples, respectively, so that it learns high-level dependencies among the pixels of a real depth map. This information is then fed back to the depth estimation network through adversarial training, which forces it to generate depth maps that satisfy these dependencies. Second, we first train two GANs, each using an independent depth cue, on the annotated training set, and then use them to predict depth maps for a large pool of unannotated images. The predicted depth maps with the highest confidence (those for which the discriminator network outputs a high probability that the generated depth map is real) from one GAN are added to the annotated training pool and used to incrementally fine-tune the other GAN. As this "pseudo-annotated training data" accumulates, the two GANs improve simultaneously.
Finally, we propose an efficient interactive depth map refinement approach that lets users correct inaccurate predicted depth maps and supports incremental learning so that the network learns user intentions.
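The exchange of pseudo-annotated samples between the two GANs can be illustrated with a minimal Python sketch. It is not the project's implementation: the function names, the `top_k` parameter, and the convention that a predictor returns a `(depth_map, confidence)` pair (with confidence taken from the discriminator output) are all assumptions made for illustration, and the fine-tuning step itself is omitted.

```python
def select_pseudo_labels(predict, unlabeled, top_k):
    """Return the top_k (image, depth) pairs ranked by confidence.

    `predict` is a hypothetical callable: image -> (depth_map, confidence),
    where confidence stands in for the discriminator's "real" probability.
    """
    scored = [(image, *predict(image)) for image in unlabeled]
    # Highest confidence first.
    scored.sort(key=lambda item: item[2], reverse=True)
    return [(image, depth) for image, depth, _ in scored[:top_k]]


def co_training_round(predict_a, predict_b, pool_a, pool_b, unlabeled, top_k):
    """One exchange round: A's confident outputs extend B's pool, and vice versa.

    Returns the two enlarged training pools; fine-tuning each model on its
    new pool would follow and is not shown here.
    """
    new_for_b = select_pseudo_labels(predict_a, unlabeled, top_k)
    new_for_a = select_pseudo_labels(predict_b, unlabeled, top_k)
    return pool_a + new_for_a, pool_b + new_for_b
```

Each model is taught only by the other model's most confident predictions, which is what lets the two independent depth cues complement each other rather than reinforce their own errors.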
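With the rapid development of deep learning, models based on convolutional networks keep setting new accuracy records for monocular image depth estimation and are becoming the mainstream approach in the field. However, these methods have three main problems: they ignore the structure of the depth map itself, so estimated depth maps lack realism; annotated training sets are poorly representative, so models generalize weakly; and depth maps are hard to modify once generated. This project introduces the idea of generative adversarial networks to address each of these problems and form a fairly complete depth estimation solution. First, a discriminative model is built that takes real and estimated depth maps as positive and negative samples, respectively, to learn the plausibility rules of real depth maps; these rules are fed back to the depth estimation model through adversarial training so that it produces depth maps that conform to them. Next, two independent depth estimation networks each estimate depth for a large number of unannotated images, and each selects its high-confidence results as annotated samples for training the other network, thereby accumulating training samples and improving generalization. Finally, the discriminator network is applied to local depth regions to prompt the user to sparsely annotate the depth of low-confidence parts; an optimization function designed from the outputs of the generative and adversarial networks propagates the sparse annotations to the whole image, and the network is trained incrementally so that it gradually learns to generate depth maps that match the user's intent.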
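As a sketch only, the adversarial feedback between the discriminator and the depth estimation network can be written in the standard conditional GAN form. The notation is an assumption introduced here and does not come from the project: G is the depth estimation network, D the discriminator (assumed to see the input image as well as the depth map), x an input image, y its ground-truth depth map, L_depth a per-pixel regression loss, and lambda a weighting factor.

```latex
\min_G \max_D \;
\mathbb{E}_{(x,y)}\big[\log D(x, y)\big]
+ \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
+ \lambda\, \mathbb{E}_{(x,y)}\big[\mathcal{L}_{\mathrm{depth}}\big(G(x), y\big)\big]
```

The first two terms train D to separate real from estimated depth maps, and the same terms push G toward outputs that D accepts as real; the last term keeps G close to the annotated depth.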
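With the rapid development of deep learning, convolutional neural network models have gradually become the mainstream approach for estimating 3D information from images and videos. However, such methods have important problems: they ignore the structure of the depth map itself, so estimated depth maps lack realism; and in 3D reconstruction of non-rigid objects, surface deformation is hard to describe and complexity is high. Inspired by generative adversarial networks, this project studied how to use them to address the problems of models based on convolutional neural networks. The main research content includes: 1) building an adversarial network model whose feedback to the depth estimation model enables it to produce realistic depth maps; 2) using a data-driven approach to obtain the latent geometric representation of object surface points, thereby reducing the difficulty of describing non-rigid deformation; 3) proposing a mask-aware deep reinforcement learning method to improve human detection results; 4) studying the effectiveness of generative adversarial networks in mounting adversarial attacks on face recognition; 5) studying other applications of deep convolutional neural networks in image segmentation, object detection, and medical imaging. The research output includes 6 journal papers indexed by SCI, 2 international conference papers indexed by EI, and 1 granted national invention patent.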
Data last updated: 2023-05-31
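A Survey of User Alignment Techniques Across Social Networks
Research on Direct Brain Control of Robot Direction and Speed Based on SSVEP
An Equilibrium Traffic Flow Assignment Model for Congested Road Networks
Evaluation of Theme Parks Based on Public Sentiment Orientation: A Case Study of Volga Manor in Harbin
Evaluation of Passenger Evacuation Capacity in Urban Rail Transit Stations Under Fire Conditions
Research on Scene Classification of Very High Spatial Resolution Remote Sensing Images Based on Deep Convolutional Generative Adversarial Networks
Research on Enhancement of Images of High-Speed Rotor Motion Based on Generative Adversarial Networks
Research on Cloud Removal from Hyperspectral Remote Sensing Images Based on Generative Adversarial Networks
Research on Super-Resolution Reconstruction of 3D-MRI Images Based on Deep Convolutional Generative Adversarial Networks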