Large-scale cancer genomics projects, such as The Cancer Genome Atlas (TCGA), have revealed a complex landscape of somatic mutations across multiple cancer types. A major goal of these projects is to characterize somatic mutations and discover cancer drivers, thereby providing important clues for uncovering diagnostic or therapeutic targets. However, distinguishing the few somatic driver mutations from the majority of passenger mutations remains a major challenge. Combining other functional features with mutation data is an effective approach to predicting cancer driver genes. Protein post-translational modification (PTM) is one such functional feature and is known to play critical roles in the development of cancer. In this proposal, we plan to systematically analyze the effects of somatic mutations on protein post-translational modifications and to identify important drivers responsible for tumorigenesis. From the published literature, we will collect a complete set of PTM sites and construct prediction models with deep learning algorithms. To identify driver mutations that significantly alter protein modifications, we will further develop a hierarchical Bayesian model for statistical testing. In addition, to obtain a complete understanding of PTM variation in cancer tissue, we will establish a regression model based on the random forest algorithm to identify somatic mutations that significantly alter the expression levels of PTM-related enzymatic systems. By combining these computational pipelines, we expect this project to benefit the discovery of novel treatments for cancer patients.
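Large-scale cancer genome sequencing projects have revealed the landscape of somatic mutations across many cancer types. Identifying new cancer driver genes from somatic mutations has become a focus of current research. However, finding the few driver mutations among the large number of passenger mutations remains a research bottleneck. Integrating other regulatory information to predict cancer driver mutations is a key entry point for solving this problem. As a molecular mechanism that plays an important role in cancer initiation and progression, protein post-translational modification (PTM) can effectively assist the identification of driver mutations. In this application, the applicant will design PTM-focused algorithms to identify driver genes. Through a literature survey, the applicant will collect a complete set of PTM sites and build prediction models using deep learning algorithms. Combined with somatic mutation data, the applicant will develop an algorithm, based on a hierarchical Bayesian model, for identifying significant PTM-related mutations. In addition, to study variation in the PTM system as a whole, the applicant will establish a random forest regression pipeline for analyzing mutations in modifying enzymes. By integrating these methods, this application will identify PTM-related cancer variants at the omics level and provide guidance for the subsequent development of new cancer treatments.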
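To address the problem of finding cancer driver mutations among the large number of somatic passenger mutations, our group started from protein post-translational modifications (PTMs), designed related algorithms, and ultimately achieved the identification of driver genes. In this project, the group first collected site data for 33 types of protein modifications, including nitration, nitrosylation, and SUMOylation, and used these data to build several PTM databases. Based on the collected site data, the group then used deep learning algorithms to develop several efficient PTM site prediction tools, including DeepNitro, DeepSUMO, and DeepPhagy. To explore the relationship between PTMs and disease-associated mutations, the group collected and analyzed TCGA cancer patient data and type 2 diabetes GWAS datasets, developed a hierarchical Bayesian algorithm for identifying significant PTM-related mutations, built the online analysis tool PTMsnp, and identified a series of cancer-related mutations. To further investigate the downstream functions of these mutations, the group also built several complete databases for analyzing cancer biomacromolecule data, including RMVar, BBCancer, and SPENCER. With the support of this project, the group published 9 SCI papers in journals including Autophagy, Nucleic Acids Research, and GigaScience, 5 of them in journals with IF > 10.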
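The idea behind a hierarchical Bayesian test for PTM-related mutations can be illustrated with a much-simplified stand-in. The sketch below is not the PTMsnp algorithm, whose details are not given here. It assumes a beta-binomial model in which each gene has its own rate of mutations falling on PTM sites, and all genes share a Beta(a, b) prior estimated from the pooled data by the method of moments. The gene names, counts, and function names are hypothetical and serve only as an illustration.

```python
from statistics import mean, pvariance


def fit_beta_prior(counts):
    """Estimate a shared Beta(a, b) prior from per-gene (k, n) counts.

    k = mutations on PTM sites, n = all somatic mutations in the gene.
    Uses a simple method-of-moments fit on the raw per-gene rates
    (an assumption of this sketch, not a full Bayesian fit).
    """
    rates = [k / n for k, n in counts if n > 0]
    m = mean(rates)
    v = pvariance(rates)
    if v <= 0 or v >= m * (1 - m):
        # Degenerate variance: fall back to a weak prior centred on the mean.
        return 2 * m, 2 * (1 - m)
    strength = m * (1 - m) / v - 1  # this is a + b
    return m * strength, (1 - m) * strength


def posterior_mean(k, n, a, b):
    """Posterior mean of a gene's PTM-site mutation rate under Beta(a, b)."""
    return (a + k) / (a + b + n)


# Hypothetical toy data: gene -> (PTM-site mutations, total mutations).
toy_counts = {
    "GENE_A": (1, 2),     # high raw rate, but very little evidence
    "GENE_B": (30, 100),  # high raw rate with strong evidence
    "GENE_C": (5, 100),
    "GENE_D": (10, 100),
}

if __name__ == "__main__":
    a, b = fit_beta_prior(list(toy_counts.values()))
    for gene, (k, n) in toy_counts.items():
        print(gene, round(k / n, 3), round(posterior_mean(k, n, a, b), 3))
```

In this toy setting, a gene with only two observed mutations is pulled strongly toward the shared mean, while a gene with a hundred mutations keeps an estimate close to its raw rate. This shrinkage is the property that makes hierarchical models attractive for ranking sparsely mutated genes.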
Data last updated: 2023-05-31
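Computational identification and analysis of phosphorylation-related protein post-translational modification crosstalk and kinases
Exploratory study on the analysis of multiple protein post-translational modifications in kinases
Functional nucleic acid-based fluorescence imaging analysis of protein post-translational modifications
Targeted proteomics CE-MS methods for post-translational modifications of key cancer proteins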