Human genome research, which began with decoding the 1D primary genomic sequence (e.g. the Human Genome Project and the ENCODE Project) and continued through delineating 2D epigenomic profiles for hundreds of cell types (e.g. the Roadmap Epigenome Mapping Project), has entered the era of dissecting 4D chromatin architecture: the 3D spatial contact structure and its temporal dynamics (e.g. the 4D Nucleome Project). Chromatin is the complex of genomic DNA and proteins that makes up the chromosomes within the cell nucleus. The organization of genomic material into chromatin is presumed to play an important role in regulating gene expression. However, the precise relationship between spatial genome organization and the expression of resident genes in health and disease remains unclear. To understand 3D genome architecture and its relationship to gene regulation, new technologies based on high-throughput next-generation sequencing, such as Hi-C and ChIA-PET, have emerged that allow in-depth investigation of 3D chromatin interactions genome-wide. In this proposed study, we aim to solve specific technical problems that have hindered progress in this rapidly developing field. Drawing on the strengths of our research team, we will focus on (1) developing a new BL-Hi-C technology that substantially reduces the noise of the conventional Hi-C method while increasing the resolution and the enrichment for active promoter/enhancer DNA loops; (2) developing the related data analysis algorithms and further improving the statistical models for ChIA-PET and Hi-C/BL-Hi-C; (3) developing stochastic-process-based data analysis tools for comparing differential changes and dynamics; (4) developing an integrative 4D genome data analysis pipeline/platform and user-friendly visualization tools.
In addition to exploring modern machine learning with existing public data and benchmarking mathematical models with simulations, we also plan to apply and validate our new methods in human blood cells, both normal and leukemic (K562, Hi-60, NM4, Kusumi, SKNO).
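Human understanding of the genome began with decoding the 1D genomic sequence (HGP, ENCODE), advanced through the 2D genome that revealed epigenetic modification signals (Epigenetic Roadmap), and has now entered a new era of exploring the 4D genome (the 3D structure of chromatin and its dynamic changes, e.g. 4D Nucleome). To study the 3D structure of the genome and its function, high-throughput sequencing technologies such as Hi-C and ChIA-PET have been developed, but several key technical problems remain. Drawing on the strengths of our group, in this study we will therefore: (1) develop BL-Hi-C technology to reduce the experimental noise of Hi-C, and integrate epigenetic signals to raise the resolution of chromatin interaction studies; (2) develop the accompanying data processing algorithms and improve the statistical models for ChIA-PET/Hi-C/BL-Hi-C data; (3) develop computational tools, based on stochastic processes, for comparing time-series 3D chromatin structure data; (4) integrate existing tools to build a complete platform for 4D genome data analysis and visualization. Finally, we plan to evaluate the effectiveness of this suite of methods in human blood and leukemia cell models.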
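To study the 3D structure of the genome and its function, high-throughput sequencing technologies such as Hi-C and ChIA-PET have been developed, but they still face key technical problems such as high background noise and insufficient resolution. In this study, targeting gene promoter regions, we developed two experimental techniques, BL-Hi-C and CAPTURE, which reduce the noise of 4D genome sequencing and improve the resolution of chromatin interactions. We developed an accompanying data processing algorithm, Hi-CDB; the chromatin domain boundaries it identifies are enriched for cell-type-specific transcription factors and cell-type-specific gene expression. For comparing chromatin conformation between states, we took into account the spatial dependence between neighboring positions along the DNA sequence, and therefore used a spatial Poisson process to find chromatin regions where both a given position and its neighboring positions show significant changes in chromatin interaction; this is the new algorithm FIND. In summary, we established a world-class 4D genome sequencing technology system. Using this system, we studied early embryonic development, muscle tissue development, and hematological tumors at three levels (3D chromatin conformation, epigenetic state, and gene transcription) and revealed how changes in 3D chromatin conformation regulate cell fate determination.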
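The idea of testing a bin pair together with its neighbors, rather than in isolation, can be illustrated with a toy sketch. This is not the FIND algorithm itself: it replaces the spatial Poisson process with a much simpler stand-in, namely summing counts over a square window and applying the standard conditional (binomial) test for equality of two Poisson rates. It assumes two symmetric contact matrices already normalized to equal sequencing depth; the function names, the window radius, and the threshold are all hypothetical.

```python
from math import comb


def neighborhood_sum(matrix, i, j, radius):
    """Sum contact counts in the square window centred on bin pair (i, j)."""
    n = len(matrix)
    total = 0
    for r in range(max(0, i - radius), min(n, i + radius + 1)):
        for c in range(max(0, j - radius), min(n, j + radius + 1)):
            total += matrix[r][c]
    return total


def two_sided_p(k, n):
    """Exact two-sided p-value for k successes in n trials with p = 0.5.

    If counts a and b are Poisson with equal rates, then a given a + b = n
    is Binomial(n, 0.5). Outcome probabilities are proportional to
    comb(n, x), so the test can be done in exact integer arithmetic.
    """
    if n == 0:
        return 1.0
    ref = comb(n, k)
    tail = sum(comb(n, x) for x in range(n + 1) if comb(n, x) <= ref)
    return tail / 2**n


def differential_pairs(mat_a, mat_b, radius=1, alpha=0.01):
    """Return (i, j, sum_a, sum_b, p) for upper-triangle bin pairs whose
    neighborhood-summed counts differ between the two conditions."""
    n = len(mat_a)
    hits = []
    for i in range(n):
        for j in range(i, n):
            a = neighborhood_sum(mat_a, i, j, radius)
            b = neighborhood_sum(mat_b, i, j, radius)
            p = two_sided_p(a, a + b)
            if p < alpha:
                hits.append((i, j, a, b, p))
    return hits
```

Pooling over the window means an isolated single-bin fluctuation contributes little, while a change shared by a position and its neighbors accumulates evidence. A real analysis would also need multiple-testing correction and a distance-dependent background model, both omitted here.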
Data last updated: 2023-05-31