Research on Complex Scene Text Recognition Based on Depth Information (基于深度信息的复杂场景文字识别研究)

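Basic Information

Grant number: 61501192

Project type: Young Scientists Fund

Funding amount: 19.00

Principal investigator: 李南希

Discipline classification:

Host institution: South China Normal University (华南师范大学)

Approval year: 2015

Completion year: 2018

Project period: 2016-01-01 to 2018-12-31

Project status: Completed

Project participants: 梅晓勇, 梁瑾, 韩后, 左宪枝, 陈雨

Keywords: text recognition; character classification; handwriting recognition; image text; text segmentation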

Final Report Abstract

Text recognition in complex scenes is one of the active research topics in the text recognition field. One of the current bottlenecks limiting the accuracy of scene text recognition is the lack of depth information. Because almost none of the existing scene text image datasets contain depth information, nearly all scene text recognition methods are restricted to 2-dimensional scene space, which results in relatively low recognition accuracy. This project proposes to use depth information to study complex scene text recognition in both 2-dimensional and 3-dimensional scene space. Specifically, the research covers three aspects: (1) collecting 3-dimensional scene text images with depth information, and designing a 3-dimensional scene text recognition method that exploits this depth information; (2) estimating the depth of a 2-dimensional scene image from known 3-dimensional scene depth, using example-based learning; (3) combining scene depth to restore the 3-dimensional trajectory of characters in a 2-dimensional scene, in order to rectify character distortion and recognize the characters. The project explores depth-based methods for 3-dimensional scene text recognition and verifies the effectiveness of depth information for 2-dimensional scene text recognition. The research is significant for the development of text recognition, computer vision, and related fields. To date, the applicant has published 7 SCI/EI-indexed papers in fields related to text recognition, which provides a solid foundation for this work.

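Project Abstract

This project studies text localization and text recognition in complex scenes. In such scenes, text feature extraction is easily disturbed and unstable, which hinders the localization and recognition of text regions. The research covers the following: 1) exploiting the periodicity of text regions to extract effective features that describe the text regions of a scene image, so as to improve text localization; 2) transforming text features with a subspace smoothing method to improve recognition accuracy; 3) using a generative model and a limited number of samples to model the probability distribution of text features, and using a discriminative model to produce features for each text class, so as to reduce inter-class and intra-class interference in text recognition. The results of this research have been published in 3 SCI papers, and 1 patent has been granted.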

Project Outcomes

No outcomes of this type are available.

Data last updated: 2023-05-31

Other Related Literature

DOI: 10.3778/j.issn.1002-8331.1911-0012

Published: 2020

DOI: 10.6041/j.issn.1000-1298.2022.07.022

Published: 2022

DOI: 10.3724/SP.J.1089.2019.17435

Published: 2019

DOI: 10.7544/issn1000-1239.2019.20190386

Published: 2019

DOI: 10.16798/j.issn.1003-0530.2020.01.008

Published: 2020

Other Grants Held by 李南希

Similar NSFC Grants

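Research on Natural Scene Text Detection Based on Text Symmetry and Scene Context Information (基于文字对称性与场景上下文信息的自然场景文字检测研究)

Grant number: 61702160

Approval year: 2017

Principal investigator: 巫义锐

Discipline classification: F0605

Funding amount: 25.00

Project type: Young Scientists Fund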

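Research on Extraction and Recognition of Scene Text and Overlaid Text Based on Graphical Models (基于图模型的场景文字与叠加文字提取识别技术研究)

Grant number: 61271434

Approval year: 2012

Principal investigator: 王伟强

Discipline classification: F0116

Funding amount: 76.00

Project type: General Program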

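Research on Localization and Recognition of Uyghur Text in Complex Scene Images (复杂场景图像中维吾尔文字的定位与识别技术研究)

Grant number: 61562058

Approval year: 2015

Principal investigator: 许亚美

Discipline classification: F0605

Funding amount: 37.00

Project type: Regional Science Fund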

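Research on Human Action Recognition in Complex Scenes Based on Deep Learning (基于深度学习的复杂场景下人体行为识别研究)

Grant number: 61503141

Approval year: 2015

Principal investigator: 吴秋霞

Discipline classification: F0605

Funding amount: 22.00

Project type: Young Scientists Fund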

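Other Related Literature

A Left Ventricle Image Segmentation Algorithm for Weak Edge Information (针对弱边缘信息的左心室图像分割算法)

River Recognition Method for Remote Sensing Images of Cold and Arid Regions Based on Improved LinkNet (基于改进LinkNet的寒旱区遥感图像河流识别方法)

Single-Image Dehazing Method Using a Joint Information Entropy and Fidelity Metric Function (信息熵-保真度联合度量函数的单幅图像去雾方法)

A Convolutional Neural Network Based Method for Generating Reference Images for JPEG Image Steganalysis (基于卷积神经网络的JPEG图像隐写分析参照图像生成方法)

TVBN-ResNeXt: An End-to-End Spatiotemporal Two-Stream Fusion Network for Action Video Classification (TVBN-ResNeXt:解决动作视频分类的端到端时空双流融合网络)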
