Text recognition in complex scenes is currently one of the most active topics in the text recognition field. One bottleneck limiting the accuracy of scene text recognition is the lack of depth information. Because almost no existing scene text image dataset contains depth information, nearly all scene text recognition methods are confined to 2-dimensional scene space, which results in relatively low recognition accuracy. This project proposes to use depth information to study complex scene text recognition in both 2-dimensional and 3-dimensional scene space. Specifically, the research covers three aspects: (1) collecting 3-dimensional scene text images that carry depth information, and designing a 3-dimensional scene text recognition method that exploits this depth information; (2) estimating the depth of 2-dimensional scene images with example-based learning, using the known depth of 3-dimensional scenes; (3) combining scene depth to restore the 3-dimensional trajectory of characters in 2-dimensional scenes, so that character distortion can be rectified and the characters recognized. The research explores depth-based methods for 3-dimensional scene text recognition and verifies the effectiveness of depth information for 2-dimensional scene text recognition. It is significant for the development of text recognition and computer vision. To date, the applicant has published 7 SCI/EI-indexed papers in fields related to text recognition, providing a sound foundation for this research.
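This project studies text localization and text recognition in complex scenes. In such scenes, text feature extraction is easily disturbed and unstable, which hampers both the localization and the recognition of text regions. The research covers the following: 1) exploiting the periodicity of text regions to extract effective features that describe the text regions of scene images, so as to improve text localization; 2) transforming text features with a subspace smoothing method to improve text recognition accuracy; 3) using a generative model and a limited number of samples to model the probability distribution of text features, and using a discriminative model to produce class-specific features, so as to reduce inter-class and intra-class interference in text recognition. The results of this research have been published in 3 SCI papers, and 1 patent has been granted.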
Data last updated: 2023-05-31
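Small UAV remote sensing image registration with inlier maximization and redundant point control
Geographic identification and type classification of multidimensional deprivation in the residential environment: a case study of the main urban area of Zhengzhou
Research on named entity recognition based on fine-grained word representation
Application of collaborative-representation-based graph embedding discriminant analysis to face recognition
A new principle of parameter-identification pilot protection for cable lines with mid-section shunt reactors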
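Research on natural scene text detection based on text symmetry and scene context information
Research on extraction and recognition of scene text and overlaid text based on graph models
Research on localization and recognition of Uyghur text in complex scene images
Research on deep-learning-based human action recognition in complex scenes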