Graphical models are a powerful tool for representing conditional independence relationships among multiple variables. They are applied in areas including bioinformatics, image processing, social science, control theory, and marketing analysis. Structure learning of graphical models has been studied intensively but remains an open challenge. In this study, we address structure learning problems for graphical models. First, we will propose two classes of distribution-free conditional independence tests, so that structure learning algorithms built on them can be used in more general settings. Second, we will develop a structure learning method for undirected graphical models that copes with the dimensionality limitation of conditional independence tests.
In the first part, we focus on conditional independence tests, which are a basic component of constraint-based structure learning methods. To infer conditional independence properties from observed data, classical methods depend heavily on distributional assumptions, which limits their scope of application. Many recent biological datasets show nonlinear and non-Gaussian relationships among variables, and a more general conditional independence test is needed for such problems. For this reason, we investigate new conditional independence tests based on information-theoretic criteria such as mutual information and conditional mutual information. These tests make no assumptions about the probability distributions of the variables or the functional form of the association. We will also propose a new procedure for reducing the confounding effects of the conditioning variables on the conditional association under study.
In the second part, we explore the application of the newly proposed conditional independence tests in different structure learning algorithms. Both directed and undirected graphical models are considered. Because of the "curse of dimensionality", conditional independence tests are limited in the number of variables they can handle. To compensate for this weakness, we will develop a hybrid method for structure learning of graphical models that combines the results of independence tests with the optimization of a score function. The advantage of the proposed score function is that it is distribution-free.
The last part of this study is devoted to applying our algorithms to real-world problems, e.g., learning gene regulatory networks and finding signal transduction pathways. We are interested both in validating existing hypotheses proposed by biologists and in looking for promising new correlations between variables by exploring biological datasets.
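To make the idea of a distribution-free conditional independence test concrete, the following is a minimal sketch and not the tests proposed in this study, whose details are not given here. It assumes discrete variables, uses a plug-in estimate of the conditional mutual information I(X; Y | Z), and calibrates it with a permutation scheme that shuffles X within each stratum of Z. The function names and parameters (`conditional_mutual_information`, `permutation_ci_test`, `n_permutations`, `seed`) are hypothetical and chosen for illustration only.

```python
import math
import random
from collections import Counter, defaultdict


def conditional_mutual_information(x, y, z):
    """Plug-in estimate of I(X; Y | Z) in nats for discrete samples."""
    n = len(x)
    n_xyz = Counter(zip(x, y, z))
    n_xz = Counter(zip(x, z))
    n_yz = Counter(zip(y, z))
    n_z = Counter(z)
    cmi = 0.0
    for (xi, yi, zi), c in n_xyz.items():
        # p(x,y,z) * log[p(x,y,z) p(z) / (p(x,z) p(y,z))];
        # the 1/n factors cancel inside the logarithm.
        ratio = c * n_z[zi] / (n_xz[(xi, zi)] * n_yz[(yi, zi)])
        cmi += (c / n) * math.log(ratio)
    return cmi


def permutation_ci_test(x, y, z, n_permutations=200, seed=0):
    """Return (estimated CMI, permutation p-value) for H0: X independent of Y given Z.

    Assumption of this sketch: X is shuffled only among samples that share
    the same value of Z, which keeps the X-Z and Y-Z relationships intact.
    """
    rng = random.Random(seed)
    observed = conditional_mutual_information(x, y, z)
    strata = defaultdict(list)  # sample indices grouped by value of z
    for i, zi in enumerate(z):
        strata[zi].append(i)
    exceed = 0
    for _ in range(n_permutations):
        x_perm = list(x)
        for idx in strata.values():
            shuffled = [x[i] for i in idx]
            rng.shuffle(shuffled)
            for i, v in zip(idx, shuffled):
                x_perm[i] = v
        if conditional_mutual_information(x_perm, y, z) >= observed:
            exceed += 1
    # Add-one correction so the p-value is never exactly zero.
    return observed, (exceed + 1) / (n_permutations + 1)


def simulate(n=400, seed=1):
    """Toy binary data: x and y_indep depend on z only; y_dep also depends on x."""
    rng = random.Random(seed)
    z = [rng.randint(0, 1) for _ in range(n)]
    x = [zi if rng.random() < 0.8 else 1 - zi for zi in z]
    y_indep = [zi if rng.random() < 0.8 else 1 - zi for zi in z]
    y_dep = [xi if rng.random() < 0.8 else 1 - xi for xi in x]
    return x, y_indep, y_dep, z


if __name__ == "__main__":
    x, y_indep, y_dep, z = simulate()
    print("X vs Y_indep given Z:", permutation_ci_test(x, y_indep, z))
    print("X vs Y_dep   given Z:", permutation_ci_test(x, y_dep, z))
```

The permutation step is what removes the need for a parametric null distribution. The plug-in estimator, however, needs enough samples in every stratum of Z, and this requirement grows quickly with the size of the conditioning set, which illustrates the dimensionality limitation discussed above.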
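Probabilistic graphical models are an effective theoretical framework for representing and reasoning about uncertain knowledge. They are a powerful tool for modeling multiple variables that also makes the relationships among those variables visible, and they have broad prospects for practical application. The main research content of this project is as follows. First, we study structure learning algorithms for probabilistic graphical models that place no assumptions or restrictions on the probability distribution of the input data. Second, still without distributional assumptions, we study methods for improving the ability of structure learning algorithms to handle high-dimensional data, focusing on Markov networks. Finally, we apply the structure learning algorithms proposed in this project to the construction of gene regulatory networks and cellular signal transduction pathways. Our research goals cover three aspects. First, starting from distribution-free conditional independence tests, we aim to give constraint-based structure learning methods a wider range of applicability. Second, building on the previous step, we propose structure learning methods that can handle higher-dimensional Markov networks by designing a distribution-free score function. Third, we apply the algorithms from this project to real biological data in order to validate existing hypotheses about network structure or to discover new interactions between variables.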
Data last updated: 2023-05-31
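Research on view-invariant human action modeling and recognition based on probabilistic graphical models
Structure space and learning methods for high-dimensional graphical models
Research on the coupling mechanism of quality of service for distributed data streams based on probabilistic graphical models
Complex behavior recognition based on probabilistic graphical models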