Stochastic grammars are a probabilistic extension of formal grammars. They are widely used in natural language processing and, over the last ten years, have increasingly been extended and applied to problems in computer vision. This project aims to extend stochastic grammars into general-purpose statistical relational models, thereby unifying the existing extensions of stochastic grammars that were designed for different data types and facilitating their application to additional data types and AI problems. Compared with existing statistical relational models, the proposed grammar-based models may achieve a better trade-off between expressiveness and computational complexity, which may offer a novel direction for addressing this central problem in statistical relational learning. Building on this work, the project further plans to study the application of stochastic grammars to traditional non-relational statistical modeling, with the hope of overcoming the representational limitations of classical probabilistic graphical models. The main subjects of this project are: general-purpose relational and non-relational statistical models based on stochastic grammars; the relationship between these models and existing statistical models in terms of expressiveness; fast and effective inference algorithms, together with unsupervised and weakly supervised learning algorithms, for these models; and, based on these general-purpose models, preliminary research on novel applications of stochastic grammars to new domains and problems.
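Stochastic grammars are a probabilistic extension of formal grammars. They are widely used in natural language processing and have also found many applications in other fields such as computer vision and probabilistic modeling. We extended stochastic grammars into a general-purpose statistical relational model, the stochastic And-Or grammar, which unifies the extensions of stochastic grammars that earlier work designed for different data types (for example language, images, and events). We studied its inference complexity and clarified how its expressiveness relates to that of existing classical models. Building on this general model, we proposed a grammar-based statistical model for non-relational vector data, the latent dependency forest model, and studied its inference and learning algorithms. We also studied in depth the maximum a posteriori (MAP) inference problem for sum-product networks, another non-relational statistical model based on stochastic grammars. In addition, we systematically studied automatic learning algorithms for stochastic grammar models, focusing on unsupervised learning, and proposed several new methods from generative, discriminative, and combined perspectives. Finally, we studied the combination of emerging neural network methods with traditional stochastic grammars and proposed new models such as latent vector grammars.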
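As a concrete illustration of what "a probabilistic extension of formal grammars" means, the sketch below attaches a probability to each production of a tiny context-free grammar and computes the total probability of a sentence with the standard inside (CKY-style) algorithm. It is a minimal sketch and not one of the project's models: the grammar, its probabilities, and the names `BINARY`, `LEXICAL`, and `inside_probability` are all made up for illustration, and the grammar is assumed to be in Chomsky normal form.

```python
from collections import defaultdict

# Toy stochastic grammar in Chomsky normal form.
# All rules and probabilities are hypothetical, chosen only for illustration.
BINARY = {  # (parent, left child, right child) -> rule probability
    ("S", "NP", "VP"): 1.0,
    ("VP", "V", "NP"): 0.7,
}
LEXICAL = {  # (parent, word) -> rule probability
    ("NP", "she"): 0.5,
    ("NP", "fish"): 0.5,
    ("VP", "eats"): 0.3,
    ("V", "eats"): 1.0,
}


def inside_probability(words, start="S"):
    """Return the probability that `start` derives `words` under the toy grammar."""
    n = len(words)
    # chart[(i, j)][A] = probability that nonterminal A derives words[i:j]
    chart = defaultdict(lambda: defaultdict(float))

    # Spans of length 1 are filled from the lexical rules.
    for i, w in enumerate(words):
        for (parent, word), p in LEXICAL.items():
            if word == w:
                chart[(i, i + 1)][parent] += p

    # Longer spans sum over every split point and every binary rule.
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (parent, left, right), p in BINARY.items():
                    chart[(i, j)][parent] += (
                        p * chart[(i, k)][left] * chart[(k, j)][right]
                    )

    return chart[(0, n)][start]
```

Under these assumed probabilities, `inside_probability("she eats fish".split())` multiplies the probabilities of the rules in the single available parse, and a word sequence the grammar cannot derive receives probability zero.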
Data last updated: 2023-05-31