Mass spectrometry (MS) is one of the key analytical techniques in metabolomics research. The most common metabolite annotation approach in untargeted metabolomics is similarity searching against MS reference databases. Untargeted metabolomics based on high-resolution mass spectrometry (HRMS) collects more and more information, yet the mass spectral databases of measured reference metabolites are far from covering the complex metabolome. The identification of unknown metabolites is therefore the main bottleneck in metabolomics. This project aims to develop a deep annotation method for untargeted metabolomics based on bioinformatics and HRMS, and to apply it. First, computational (in silico) simulation will be used to generate a predicted reference metabolite library from bioinformatics pathway information and from the experimental ultra-high-performance liquid chromatography-multistage mass spectrometry (UHPLC-MSn) database that our team created from 2000 authentic metabolite standards under a standard operating procedure. In silico MS fragmentation will be predicted from fragmentation rules and experimental MS/MS data. A quantitative structure-retention relationship (QSRR) model will be built to predict chromatographic retention time. The resulting in silico UHPLC-high-resolution multistage mass spectrometry (UHPLC-HRMSn) database will substantially enlarge the coverage of mass spectral databases. Next, a deep metabolome annotation method will be developed. Data acquisition and deconvolution algorithms will be developed to generate information-rich fragment spectra with high resolution and mass accuracy. Metabolites with similar fragmentation patterns tend to be chemically similar, so experimental networks based on MS fragment similarity or metabolite-metabolite correlation will be constructed to obtain tentative structural information by aligning unknowns to knowns. Untargeted metabolomics data will be annotated with this method, using existing databases together with the newly developed in silico UHPLC-HRMSn database. Finally, the established method will be applied to investigate global metabolic changes in diabetes, especially metabolic variations in host-microbiota interaction pathways. Host-microbiota co-metabolites closely related to pathological conditions will be identified. The results will benefit the prevention and treatment of disease.
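The idea of linking unknowns to knowns through MS fragment similarity can be illustrated with a small sketch. This is not the project's algorithm: it assumes centroided spectra given as (m/z, intensity) pairs, a greedy peak-matching cosine score, and hypothetical parameters `tol` (m/z matching tolerance in Da) and `threshold` (minimum score for a network edge). The function names are illustrative only.

```python
import math

def cosine_similarity(spec_a, spec_b, tol=0.01):
    """Greedy cosine score between two centroided spectra.

    Each spectrum is a list of (m/z, intensity) pairs. Two peaks can be
    paired when their m/z values differ by at most `tol` Da, and every
    peak is used at most once.
    """
    # Collect all peak pairs within tolerance, keyed by intensity product.
    candidates = []
    for i, (mz_a, int_a) in enumerate(spec_a):
        for j, (mz_b, int_b) in enumerate(spec_b):
            if abs(mz_a - mz_b) <= tol:
                candidates.append((int_a * int_b, i, j))
    # Accept the strongest pairs first (greedy assignment).
    candidates.sort(reverse=True)
    used_a, used_b = set(), set()
    dot = 0.0
    for product, i, j in candidates:
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            dot += product
    norm_a = math.sqrt(sum(inten * inten for _, inten in spec_a))
    norm_b = math.sqrt(sum(inten * inten for _, inten in spec_b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def build_network(spectra, threshold=0.7, tol=0.01):
    """Return edges (name_a, name_b, score) for pairs scoring >= threshold.

    `spectra` maps a feature name to its spectrum.
    """
    names = sorted(spectra)
    edges = []
    for x in range(len(names)):
        for y in range(x + 1, len(names)):
            score = cosine_similarity(spectra[names[x]], spectra[names[y]], tol)
            if score >= threshold:
                edges.append((names[x], names[y], score))
    return edges
```

An unknown feature that shares an edge with an annotated standard inherits a tentative structural hypothesis from it. Real workflows usually also weight peaks by m/z and allow precursor-mass-shifted matches, which this sketch omits.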
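This project addresses a current bottleneck in untargeted metabolomics based on high-resolution mass spectrometry: the information in the data is underused and unknown metabolites are difficult to annotate. It develops and applies a new deep annotation method for the metabolome based on bioinformatics and high-resolution mass spectrometry. First, building on the experimental UHPLC-MSn database of 2000 metabolite standards previously developed by the research team, and combining it with existing knowledge bases, an in silico UHPLC-HRMSn database will be constructed to expand metabolite coverage. Second, deep metabolome annotation methods will be studied. The project plans to develop data acquisition methods that efficiently obtain structure-rich data with high resolution and high mass accuracy; to study data analysis methods that reliably extract precursor-product ion correspondences (MS1-MS2) and the evolutionary relationships among ions (MSn); and to develop methods for obtaining metabolite substructure or chemical class information, so as to improve the efficiency and success rate of identification when experimental data are searched against databases. Finally, the new method will be applied in a demonstration study of the relationship between gut microbiota-host co-metabolites and diabetes, seeking gut microbiota-host co-metabolites closely related to pathological conditions and laying a scientific foundation for disease prevention and treatment.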
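With the rapid development of high-resolution mass spectrometry, efficient interpretation and use of high-resolution MS data has become especially urgent. Two key scientific problems need to be solved: how to effectively expand the metabolite coverage of mass spectral databases, and how to efficiently acquire and use MS data with high resolution and high mass accuracy to achieve deep annotation of untargeted metabolomics data, in particular structural annotation of unknown metabolites. To address the low data utilization and difficult metabolite annotation in HRMS-based metabolomics, this project took biological samples as its object of study and developed and applied new deep annotation methods for the metabolome based on bioinformatics and high-resolution mass spectrometry. Based on experimental data for more than one thousand metabolites and combined with machine learning algorithms, a liquid chromatography retention time prediction model was built, achieving accurate prediction of metabolite retention times. Taking the important metabolites hydroxycinnamic acid amides and glycosides as examples, the set of metabolites that may exist was predicted from known metabolic reactions, and a deep annotation method for important metabolites based on in silico UHPLC-MSn was developed. This method makes full use of pathway information and characteristic MS fragmentation patterns, annotates metabolites efficiently, and facilitates the discovery of new metabolites. A large-scale annotation method based on the modified metabolome was developed, achieving large-scale annotation of the urine metabolome. The new methods were applied in demonstration studies of gut microbiota-host co-metabolites and of early warning for populations at high risk of diabetes. Unlike existing research strategies, this project worked on two fronts, the construction of in silico metabolite databases and the acquisition and use of structure-rich MS data with high resolution and high mass accuracy, which greatly improved the efficiency of metabolite identification and thereby achieved deep metabolome annotation. The project resulted in 7 journal papers, 7 Chinese invention patent applications (3 of them granted), 1 software copyright, and 2 graduated doctoral students.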
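The step of predicting the set of metabolites that may exist from known metabolic reactions can be sketched as follows. This is a simplified illustration rather than the project's method: it assumes single-step modifications represented only by their monoisotopic mass shifts, [M+H]+ ions, and a hypothetical `ppm` tolerance. The function names and the choice of ferulic acid (C10H10O4, monoisotopic mass 194.05791 Da) as a precursor are illustrative assumptions, and a mass match alone is only a candidate that still needs MSn and retention-time evidence.

```python
PROTON = 1.007276  # proton mass in Da, used to form [M+H]+

# Monoisotopic mass shifts (Da) of a few common biotransformations.
MASS_SHIFTS = {
    "methylation": 14.01565,       # +CH2
    "hydroxylation": 15.99491,     # +O
    "sulfation": 79.95682,         # +SO3
    "glucosylation": 162.05282,    # +C6H10O5
    "glucuronidation": 176.03209,  # +C6H8O6
}

def predict_candidates(name, neutral_mass):
    """Return {label: [M+H]+ m/z} for each single-step modification."""
    return {
        f"{name} + {reaction}": neutral_mass + shift + PROTON
        for reaction, shift in MASS_SHIFTS.items()
    }

def match_feature(observed_mz, candidates, ppm=5.0):
    """Return (label, ppm_error) pairs within `ppm` of the observed m/z.

    Results are sorted by absolute mass error, smallest first.
    """
    hits = []
    for label, mz in candidates.items():
        error = (observed_mz - mz) / mz * 1e6
        if abs(error) <= ppm:
            hits.append((label, error))
    return sorted(hits, key=lambda hit: abs(hit[1]))

# Illustrative usage with a hypothetical observed feature.
candidates = predict_candidates("ferulic acid", 194.05791)
hits = match_feature(357.1180, candidates)
```

Each hit would then be checked against predicted fragmentation and predicted retention time before an annotation is proposed.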
Data last updated: 2023-05-31
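The DeoR-family transcription factor PsrB regulates prodigiosin synthesis in Serratia marcescens
High-resolution three-dimensional imaging for wideband MIMO radar based on Kronecker compressed sensing
Calculation method for the shear capacity of steel plate-concrete composite coupling beams with small span-to-depth ratios
Combined transcriptome and metabolome analysis of the mechanism of anthocyanin changes in Acer rubrum leaves
Numerical simulation and experimental study of pressure pulsation characteristics in a double-suction centrifugal pump
Deep learning analysis techniques for proteomics mass spectrometry data
Establishment of a new deep-coverage lipidomics method based on two-dimensional liquid chromatography-mass spectrometry and its application in diabetes research
Metabolomics study of honey and its main adulterants based on direct mass spectrometry
A new method for single-cell-precision microbiome metabolic analysis based on droplet-mass spectrometry coupling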