In recent years, semantic integration (SI) has become a research hotspot in electronic markets and office automation because semantic documents are scalable and reusable. However, SI often suffers from two defects, semantic loss and the lack of automatic semantic processing, which cause ambiguity in semantic understanding among semantic entities and prevent semantics from being exchanged freely across scenarios. Semantic loss is the loss of, or change in, the meanings of words or of semantic relations when documents are exchanged between different contexts; it leads semantic entities to misinterpret semantic documents. Ensuring that different entities in different contexts understand the same semantics consistently is therefore an important problem. Lack of automatic processing means that a semantic document generated in one context cannot be analyzed, understood, and processed in other contexts, so automating semantic processing is also a challenging research question. To this end, this project proposes Tabdoc, a cross-context semantic document exchange method, as a new strategy for achieving semantic interoperability. First, we analyze the problems in existing semantic document representations, establish a context-independent representation method based on context-free grammar, and design a semantic document editor that supports autonomous document design and convenient semantic embedding in any context. Second, to meet cross-context requirements and reduce semantic loss, we design a semantic extraction algorithm that guarantees complete extraction of semantics. Finally, we propose a rule-based semantic document comprehension algorithm that achieves correct semantic understanding while keeping the semantic information complete and consistent.
This research will provide an important reference for future large-scale semantic analysis technology.
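Semantic integration (SI) has recently become a research hotspot in fields such as electronic markets and office automation because semantic documents are scalable and reusable. However, SI is often accompanied by defects such as semantic loss and the inability to process semantics automatically, which cause ambiguity in semantic understanding between semantic entities and prevent semantics from being exchanged freely across scenarios. How to ensure consistent semantic understanding among entities in different contexts, and how to automate semantic processing, is therefore a very important problem. To this end, this project proposes a cross-context semantic document exchange method (Tabdoc) as a new strategy for achieving semantic interoperability. First, we analyze the problems of existing semantic document representation methods, establish an environment-independent semantic document representation method based on context-free grammar, and design a semantic document editor that is usable in any context and convenient for document design and semantic embedding. Second, to meet cross-context requirements and reduce semantic loss, we design a semantic extraction algorithm that guarantees complete semantic extraction. Finally, to achieve correct semantic understanding while keeping semantic information complete and consistent in meaning, we design a semantic document comprehension algorithm based on rule reasoning. This project provides an important reference for future large-scale semantic analysis technology.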
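At present, Internet platforms are evolving from isolated, decentralized forms toward collaborative, integrated ones. This evolution is driven by advances in technologies such as information fusion and integrated interconnection. Cross-scenario semantically consistent interaction is the process by which semantic information represented, edited, and sent in one context can be read and understood not only by people and computers in the same context but also by information receivers in different contexts. This project studied in depth the key problems of cross-scenario semantic interaction, including the representation of semantic information, semantic classification, information extraction, and comprehension, achieving its aim of improving existing methods and promoting practical application. The main research contents are: (1) a cross-scenario semantic collaborative interaction method that, through a new semantic document representation method and a new semantic comprehension strategy, achieves consistent, unambiguous semantic interoperability between heterogeneous information systems across contexts; (2) a semantic information classification method based on semantic disambiguation and a strong-association analysis mechanism, which addresses the interference of polysemous words and synonyms with information classification during information exchange between contexts; (3) a joint extraction method for semantic components based on graph neural networks and deep long short-term memory networks, which addresses problems in existing semantic component extraction methods such as "insufficient consideration of information at different granularities" and "mutual interference between different information extraction modules". This progress improves the existing capability for cross-scenario semantically consistent interaction. Based on these results, the project team published five academic papers in total: four in major journals and one at an academic conference.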
Data last updated: 2023-05-31
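A Survey of Cross-Social-Network User Alignment Techniques
Calculation Method for the Shear Capacity of Steel Plate-Concrete Composite Coupling Beams with Small Span-to-Depth Ratios
Theme Park Evaluation Based on Public Sentiment Orientation: A Case Study of Volga Manor in Harbin
A Task Scheduling Method for Cloud Workflow Security
Optimization of the Extraction Process of Total Flavonoids from Vine Tea by Response Surface Methodology
Joint Scene Geometry-Semantic Understanding and Association for Cross-Camera Tracking
Research on Cross-Scene Pedestrian Association Methods
Research on Cross-Scene Large-Scale Crowd Analysis Methods
Research on Cross-View Person Trajectory Tracking Methods Based on Complex Scene Cognition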