With the development of SNP microarray technology, genome-wide association studies (GWAS) have become an important means of identifying the pathogenic loci of complex diseases. However, the low reproducibility of positive findings and the difficulty of interpreting association results remain daunting challenges. This project will exploit the computational power of a supercomputing platform to carry out genome-wide epistasis analysis on SNP microarray data for complex diseases, integrating prior information such as biological pathways and molecular interaction networks. First, we will mine the essential characteristics of complex disease data in combination with the genetic patterns of reliable reference data, and then use dimensionality reduction and clustering methods to remove noisy data and to impute missing data and rare SNP genotypes. Second, to select tag SNPs that adequately represent both rare and common SNPs, we will design a multilocus correlation measure with clear biological meaning, use it to select tag SNPs heuristically, and build a mathematical model to evaluate the performance of the selection. Third, to study multilocus interactions and identify loci with weak pathogenic effects in complex diseases, we will build a multilevel fusion analysis model based on biological pathways and molecular interaction networks, and design parallel algorithms that make full use of the supercomputing platform. Finally, we will use Gene Ontology annotation to interpret the results of the genome-wide epistasis analysis and further reveal the pathogenic mechanisms of complex diseases.
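With the development of SNP microarray technology, genome-wide association studies have become an important means of identifying the pathogenic loci of complex diseases, but improving the reproducibility of association studies and strengthening the explanatory power of their results remain bottleneck problems for the field. This project focuses on SNP data analysis methods and analysis models for complex diseases. Exploiting the computational power of a supercomputing platform and building on the analysis of complex disease SNP data, it integrates prior information such as biological pathways and molecular interaction networks to carry out genome-wide epistasis research: combining the genetic patterns of reliable reference data, it mines the essential characteristics of complex disease data and uses dimensionality reduction and clustering methods together to remove noisy data and to infer missing data and rare SNP genotypes; it designs a multilocus association measure with clear biological meaning, heuristically constructs tag SNP subsets, and builds a mathematical model for performance evaluation, so that tag SNPs can be selected effectively from the whole genome; and it builds a multilevel fusion analysis model based on biological pathways and molecular interaction networks and designs parallel algorithms on the supercomputing platform to study high-order multilocus epistasis, and thereby identify multiple weak-effect pathogenic loci of complex diseases. Gene Ontology annotation and related information are used to interpret the results and further reveal the pathogenic mechanisms of complex diseases.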
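The heuristic tag SNP selection step can be illustrated with a minimal sketch. The abstract does not define the project's multilocus association measure, so this example substitutes ordinary pairwise r² (squared Pearson correlation of 0/1/2 genotype codes) and a greedy set-cover heuristic. The function names, the 0.8 threshold, and the toy genotypes are all assumptions made for illustration, not the project's method.

```python
from itertools import combinations


def r_squared(x, y):
    """Squared Pearson correlation between two genotype vectors coded 0/1/2."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a monomorphic SNP carries no correlation information
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return (sxy * sxy) / (sxx * syy)


def greedy_tag_snps(genotypes, threshold=0.8):
    """Greedily pick tag SNPs so every SNP has r^2 >= threshold with some tag.

    genotypes: dict mapping SNP id -> list of genotype codes (one per sample).
    The pairwise r^2 used here is a stand-in (assumption) for a multilocus measure.
    """
    ids = list(genotypes)
    # covers[s] is the set of SNPs that s can represent, including itself.
    covers = {s: {s} for s in ids}
    for a, b in combinations(ids, 2):
        if r_squared(genotypes[a], genotypes[b]) >= threshold:
            covers[a].add(b)
            covers[b].add(a)

    untagged = set(ids)
    tags = []
    while untagged:
        # Choose the SNP that represents the most still-untagged SNPs.
        best = max(ids, key=lambda s: len(covers[s] & untagged))
        tags.append(best)
        untagged -= covers[best]
    return tags


# Toy data (hypothetical): rs2 duplicates rs1, rs3 is its mirror image,
# and rs4 is only weakly correlated with the others.
example = {
    "rs1": [0, 1, 2, 0, 1, 2],
    "rs2": [0, 1, 2, 0, 1, 2],
    "rs3": [2, 1, 0, 2, 1, 0],
    "rs4": [0, 0, 1, 1, 2, 2],
}
print(greedy_tag_snps(example))
```

In this toy set one tag represents rs1, rs2 and rs3, while rs4 needs its own tag. A multilocus measure of the kind the project describes would replace `r_squared` while leaving the greedy loop unchanged.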
Data last updated: 2023-05-31
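Genome-wide association study of complex diseases based on mining and analysis of high-order SNP interactions
Research on mining disease-associated SNP loci and on an SNP functional annotation system
Collective characteristics of time-delay complex systems and their application in data mining
Construction and analysis of genome-wide SNP interaction networks for complex diseases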