3D video has attracted considerable attention because of merits such as realism and telepresence. Highly efficient coding is key to its maturity and adoption, and it is an active frontier of international research. Conventional 3D video coding methods focus mainly on compression efficiency and seldom address the overall performance of the compressed stream at the same time, including spatio-temporal-view random access and interactivity; semantic and content-based editing, retrieval, and manipulation; transmission flexibility and robustness over networks; and user experience and visual comfort. As 3D display technologies mature, 3D display devices diversify, video applications become networked and intelligent, and user demands become individualized, conventional 3D video coding methods can hardly meet the practical requirements of network transmission and user applications. This project analyzes the visual and semantic features of 3D video together with human perception theory in order to develop transmission-oriented theories and methods for 3D video representation and coding. The key problems to be studied are: semantic modeling and hybrid video content segmentation according to visual interest and perceptual importance; hybrid and differential coding schemes for different kinds of visual data that optimize overall coding performance; and perception-optimized bitstream reassembly and optimization techniques. Breakthroughs are expected in the general solution, in novel coding theories, and in key algorithms, which should help promote the maturity of 3D video technology and its wide application in 3D film, 3D television, machine vision, telemedicine, military, aerospace, and other fields.
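3D video has drawn wide attention and high expectations because of advantages such as realism and a sense of presence. High-performance coding is the key to its maturity and adoption and is a current international research hotspot and frontier. Conventional methods mainly pursue efficient compression and rarely give joint consideration to the overall performance of the compressed bitstream, including multidimensional random access and interactivity across time, space, and view; semantic and content-based editing, retrieval, and manipulation; adaptivity and robustness in network transmission; and quality of user experience and viewing comfort. As 3D display technology matures and display terminals multiply, as video applications become networked and intelligent, and as user needs become individualized, conventional coding methods will struggle to meet the requirements of practical applications and network transmission. This project is transmission-oriented and driven by visual semantic features and perceptual analysis. It explores new theories and methods for 3D video representation and coding, and addresses the following key problems: statistical modeling and hierarchical video content segmentation and representation that fuse visual semantic features with perceptual importance; differential coding strategies oriented to visual interest; and bitstream reassembly and optimization based on perception and comfort. It aims to propose solutions, innovative theories, and key algorithms that optimize overall performance, and to promote the early maturity of 3D video technology and its wide application in 3D film and television, machine vision, telemedicine, aerospace, military, and other fields.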
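3D video is an important trend in video technology, with broad application prospects in 3D television, virtual reality, machine vision, telemedicine, and other fields. Existing 3D video coding methods produce compressed bitstreams whose overall performance is unsatisfactory and which struggle to meet the network transmission and user requirements of practical applications. To address this shortcoming, the project carried out in-depth research on hierarchical video content segmentation and representation that fuse visual semantics with perceptual importance, on differential coding strategies oriented to visual interest, and on bitstream reassembly and optimization based on perception and comfort. It made solid progress, with several innovations, on key problems including semantic analysis based on visual feature modeling, image and video segmentation methods, region classification and differential coding strategies, stereo matching, and the shape coding of related visual objects. The main results are: 1) in video semantic analysis and content segmentation, innovative theories and key algorithms for image segmentation based on statistical modeling, image segmentation based on superpixels and graph cuts, and image segmentation based on deep learning; 2) in differential coding based on visual interest, key algorithms for stereo matching based on segmentation and adaptive support windows, visual object shape coding based on contour and chain-code representation, and visual object shape coding based on spatio-temporal prediction; 3) in compression combined with inpainting theory, a study of deep-learning-based image inpainting methods and key inpainting algorithms built on convolutional network and generative adversarial network architectures; 4) in image quality assessment and perception-driven bitstream optimization and transmission strategies, an embedded rate-distortion estimation method that combines image quality assessment with perceptual optimization, and a bitstream truncation and reassembly method based on rate-distortion optimization. The research has resulted in 17 papers published or accepted in domestic and international academic journals such as IEEE TCSVT, IEEE TMM, and Neurocomputing, including 9 SCI journal papers, 3 papers in top-tier domestic core journals and EI-indexed journals, and 5 EI-indexed international conference papers. Eight invention patents were filed, of which 3 have been granted, and 2 software copyrights were granted. Some of the results have been transferred to applications in fields such as smart power and intelligent driving, and the work received one Third Prize of the Zhejiang Provincial Science and Technology Progress Award and one Ningbo Excellent Natural Science Paper Award. Further research results have been completed and are awaiting publication.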
Data last updated: 2023-05-31
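Research on Crime Prediction Algorithms Based on Multimodal Feature Fusion
An Efficient Parallel Radiation Diffusion Algorithm Based on Multi-Block Structured Grids for Inertial Confinement Fusion Implosion
Multi-Space Interactive Collaborative Filtering Recommendation
Classified Prediction of Bus Passenger Flow with a Multi-Source Data-Driven CNN-GRU Model
A Deep Learning Model for Predicting Milling Cutter Wear State
Representation, Coding, and Transmission of Immersive 3D Panoramic Video
Free-Viewpoint Video Coding and Transmission Based on 3D Visual Attention
QoE-Driven Perceptual Coding of 3D Video Based on Content Analysis
Research on Coded Transmission and Video Reconstruction for WMSNs Based on Compressed Sensing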