Online handwriting input is a natural and convenient input method that has received considerable attention and is widely used. However, little work has been done on online handwritten Uyghur recognition. Based on an analysis of the distinctive shapes and writing styles of Uyghur characters and words, this project studies a lexicon-driven approach to online handwritten Uyghur word recognition that integrates segmentation and recognition. Word recognition is formulated as an optimization problem: finding the best match between a lexicon entry and the handwritten word image. In the first step, delayed strokes are removed from the handwritten word, and potential breakpoints are detected at concavities and ligatures through temporal and shape analysis of the stroke trajectory. The delayed strokes are then reattached to their corresponding segments, yielding a sequence of primitive segments, and adjacent segments are combined to form a candidate segmentation lattice. In the second step, character recognition scores, geometric information, and lexicon information are combined in the path-matching procedure of the lexicon-driven word recognizer. A confidence transformation converts the classifier's similarity scores into probabilities, which makes the weighting parameters easier to tune. Dynamic matching between the characters of each lexicon entry and the segments of the input word is used to rank the lexicon entries and select the best match. Research on recognition techniques for online handwritten Uyghur is significant for the development of information technology and culture in ethnic minority regions.
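The dynamic matching between lexicon entries and primitive segments described above can be illustrated with a small dynamic-programming sketch. This is not the project's implementation: the function names, the `max_merge` limit, and the toy score table are assumptions made for illustration, and the score here uses only a character log-probability, whereas the project also combines geometric information.

```python
import math
from typing import Callable, List, Tuple

NEG_INF = float("-inf")

# Hypothetical scorer signature: log-probability that segments
# [start, end) together form the character `ch`.
Scorer = Callable[[int, int, str], float]


def match_word(word: str, n_segments: int, score: Scorer, max_merge: int = 3) -> float:
    """Best total log-probability of aligning `word` to all primitive segments.

    best[k][j] is the best score for matching the first k characters of
    `word` to the first j primitive segments; each character consumes
    between 1 and `max_merge` consecutive segments.
    """
    m = len(word)
    best = [[NEG_INF] * (n_segments + 1) for _ in range(m + 1)]
    best[0][0] = 0.0
    for k in range(1, m + 1):
        for j in range(1, n_segments + 1):
            for s in range(1, min(max_merge, j) + 1):
                prev = best[k - 1][j - s]
                if prev == NEG_INF:
                    continue  # no valid alignment reaches this state
                cand = prev + score(j - s, j, word[k - 1])
                if cand > best[k][j]:
                    best[k][j] = cand
    return best[m][n_segments]


def rank_lexicon(lexicon: List[str], n_segments: int, score: Scorer,
                 max_merge: int = 3) -> List[Tuple[str, float]]:
    """Score every lexicon entry and sort from best to worst match."""
    scored = [(w, match_word(w, n_segments, score, max_merge)) for w in lexicon]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy example: 3 primitive segments and made-up character probabilities.
TOY_TABLE = {
    (0, 1, "a"): math.log(0.9),
    (1, 3, "b"): math.log(0.8),
    (0, 2, "a"): math.log(0.3),
    (2, 3, "b"): math.log(0.5),
}


def toy_score(start: int, end: int, ch: str) -> float:
    # Unlisted (segment span, character) pairs get a small floor probability.
    return TOY_TABLE.get((start, end, ch), math.log(1e-6))


ranking = rank_lexicon(["ba", "abc", "ab", "abcd"], 3, toy_score)
```

In this toy setup, an entry with more characters than there are segments cannot be aligned at all and receives a score of negative infinity, so it sorts last.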
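As a natural and convenient input method, online handwriting input has received considerable attention and is widely used. Research on online handwritten Uyghur recognition, however, remains very scarce. By analyzing the structure and writing characteristics of Uyghur letters and words, this project studies a lexicon-driven framework and method for online handwritten Uyghur word recognition that integrates segmentation and recognition. The system formulates word recognition as an optimization problem of matching lexicon entries against the handwritten word image. First, after the delayed strokes of the word are removed, the shape of the main stroke trajectory is analyzed to find potential over-segmentation points; the resulting primitive blocks are then merged with their corresponding delayed strokes to obtain a sequence of primitive letter segments. Adjacent primitive segments are combined to form a candidate segmentation lattice. Then, using a lexicon-driven approach, letter recognition information, geometric information, and lexicon information are jointly incorporated into the path-matching process of the word recognition system. Here, a confidence transformation converts the classifier outputs into probabilities, which makes parameter tuning easier, and a dynamic programming algorithm selects the optimal matching path during word recognition to produce the best recognition result. The results of this research help advance informatization in ethnic minority regions.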
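The abstract does not say which confidence transformation is used. As a minimal sketch, assuming a softmax with a tunable scale parameter (the name `alpha` and the function name are hypothetical), raw classifier similarity scores could be mapped to probabilities like this:

```python
import math
from typing import List


def softmax_confidence(scores: List[float], alpha: float = 1.0) -> List[float]:
    """Map raw similarity scores (higher = more similar) to probabilities.

    `alpha` is an assumed scale parameter that controls how sharply the
    probabilities concentrate on the top-scoring class.
    """
    scaled = [alpha * s for s in scores]
    top = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(v - top) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax_confidence([2.0, 1.0, 0.0], alpha=1.0)
```

Once every score source is expressed as a probability, the sources can be combined on a common log-probability scale, which is one reason such a transformation can make weight tuning easier.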
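Handwriting input is an intelligent and efficient input method. Taking the characteristics of online handwritten words into account, this project studied word recognition with integrated segmentation and recognition in depth and made a preliminary study of holistic word recognition. The integrated approach comprises preprocessing, letter recognition, recognition-based segmentation, and path search modules. For preprocessing, a fast and effective slant correction algorithm for handwritten words was proposed, based on the centroids of block trajectories. For letter recognition, deep learning (a convolutional neural network) was applied for the first time to the recognition of 128 letters. For segmentation, a word segmentation method based on local trajectory information was proposed: the online trajectory is over-segmented according to local information, and the resulting blocks are then merged under various conditions to form letters. A dynamic programming algorithm performs the optimal path search during word recognition. In addition, the project used, for the first time, a recurrent neural network (RNN) combined with connectionist temporal classification (CTC), establishing a preliminary end-to-end online handwritten word recognition system. To mitigate the shortage of data in handwriting recognition research, an online handwriting data augmentation algorithm that randomly varies the length of handwriting trajectories was proposed. Three experimental platforms were built: one for Uyghur letter recognition, one for Uyghur word segmentation, and one for Uyghur word recognition.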
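The summary does not describe how the RNN and CTC outputs are decoded. A minimal sketch of the standard best-path (greedy) CTC decoding rule, assuming per-frame class probabilities with the blank label at index 0 (both assumptions, as is the function name), looks like this:

```python
from typing import List, Optional, Sequence


def ctc_greedy_decode(frame_probs: Sequence[Sequence[float]], blank: int = 0) -> List[int]:
    """Best-path CTC decoding.

    Take the most probable label in each frame, collapse consecutive
    repeats, then drop blanks. A blank between two identical labels
    keeps them as two separate output symbols.
    """
    decoded: List[int] = []
    prev: Optional[int] = None
    for probs in frame_probs:
        label = max(range(len(probs)), key=probs.__getitem__)
        if label != prev and label != blank:
            decoded.append(label)
        prev = label
    return decoded


# Toy frames over 3 classes (0 = blank); per-frame argmax is 1,1,0,1,2,2,0.
toy_frames = [
    [0.1, 0.8, 0.1],
    [0.2, 0.7, 0.1],
    [0.9, 0.05, 0.05],
    [0.1, 0.6, 0.3],
    [0.1, 0.2, 0.7],
    [0.2, 0.1, 0.7],
    [0.8, 0.1, 0.1],
]
labels = ctc_greedy_decode(toy_frames)
```

Greedy decoding ignores any lexicon; a lexicon-constrained or beam-search decoder would be a natural alternative, but the source does not say which one the project used.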
Data last updated: 2023-05-31
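Research on an online handwritten Uyghur basic database and recognition methods
Research on Uyghur online handwritten word recognition technology based on information fusion
Research on online handwritten Xinjiang Uyghur character recognition
Research on key technologies for Uyghur handwritten signature recognition and verification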