The falling cost of sequencing is accelerating the accumulation of biological prior knowledge and making sequence-based whole genome prediction (WGP) feasible in livestock breeding. Recent evidence shows that prediction accuracy can be slightly improved by incorporating biological prior knowledge into WGP models. However, further advantages of sequence-based WGP remain to be uncovered, especially with respect to innovation in prediction models and strategies for incorporating biological prior knowledge. Building on our previous research, this project will first propose a novel sequence-based WGP model that, unlike previous WGP models, fits gene effects rather than SNP marker effects. Biological prior knowledge drawn from multiple bioinformatics databases will then be incorporated into the new model. In parallel, a research population in which key individuals have their genomes sequenced will be constructed, and the selection of key individuals, the sequencing plan, and the genotype imputation strategy for this population will be optimized on a high-performance computing platform. Finally, the performance of the new WGP model will be validated thoroughly in the constructed research population and in other datasets. The main purpose of this project is to propose a novel sequence-based WGP model that can efficiently incorporate biological prior knowledge; the results will also provide a useful foundation for designing sequence-based WGP breeding schemes in livestock.
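The steady decline in sequencing costs has accelerated the accumulation of biological prior information and made it possible to apply sequence-based whole genome prediction (WGP) in livestock and poultry breeding. Recent studies indicate that using biological prior information helps improve the accuracy of sequence-based WGP, but systematic research is still lacking on theoretical innovation in prediction models and on strategies for using biological priors. Building on our earlier work, this project will first move beyond the theoretical limitation of classical WGP models, which rely on SNP chip data and take the marker as the basic unit, and build a new WGP model based on sequence data with the gene as the basic unit. It will then integrate biological prior information from multiple bioinformatics databases and apply it to the new model. At the same time, a high-performance computing platform will be used to compare comprehensively the schemes for selecting key individuals, sequencing, and genotype imputation in livestock and poultry populations, and an actual research population will be constructed. Finally, the new WGP model will be applied to that research population and to other datasets to verify its practical performance. The results will not only provide a new method for sequence-based WGP but also a theoretical basis for planning its application in livestock and poultry populations.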
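Genomic selection (GS) based on whole-genome sequence data is currently a research hotspot in animal and plant breeding worldwide. The key issues are how to reduce the cost of obtaining population-level sequence data and how to make effective use of the steadily accumulating biological prior information to improve the accuracy of GS. With the support of this grant, we carried out a series of studies on strategies for obtaining whole-genome sequences in livestock and poultry populations, compared genomic selection models based on chip and sequencing data, and developed new genomic prediction models that integrate biological prior information. The main research content and results are summarized as follows: (1) We compared genotype imputation strategies for livestock and poultry populations, optimized the imputation scheme, and improved imputation quality. Obtaining sequence data for the whole population through genotype imputation reduced the cost of acquiring population-level whole-genome sequence data. (2) We compared genomic selection models based on chip and sequencing data and examined how chip versus sequencing data and different prediction models affect the accuracy of genomic selection in combined populations. (3) We established a scheme for extracting, screening, and transforming biological prior information, built a mapping between biological prior information and sequence data, and integrated the prior information into the new genomic prediction model, which effectively improved the model's ability to predict complex traits; this was validated in multiple datasets. (4) We developed genomic selection methods based on biological prior information and, using a prior-based marker screening approach, systematically examined how factors such as the construction of different kinship matrices and nonlinear kernel prediction models affect genetic evaluation; the new methods improved the accuracy of genomic selection. The project produced 17 published research papers and 5 software copyrights, and trained 2 doctoral students and 3 master's students.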
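As a rough illustration of what treating the gene, rather than the SNP marker, as the basic unit could look like in practice, the sketch below collapses SNP dosages into one weighted score per gene using a SNP-to-gene mapping. This is not the project's model: the function name `gene_level_scores`, the additive weighted-sum rule, and the `snp_weights` argument (standing in for biological prior information) are all assumptions made for illustration.

```python
from collections import defaultdict


def gene_level_scores(genotypes, snp_to_gene, snp_weights=None):
    """Collapse SNP dosages (0/1/2) into one weighted score per gene.

    genotypes:   list of rows, one per animal, each a list of SNP dosages
    snp_to_gene: gene label for each SNP column (None = no gene assigned)
    snp_weights: optional per-SNP weights, a hypothetical stand-in for
                 biological prior information; defaults to 1.0 for every SNP
    """
    n_snps = len(snp_to_gene)
    if snp_weights is None:
        snp_weights = [1.0] * n_snps
    if len(snp_weights) != n_snps:
        raise ValueError("snp_weights must have one entry per SNP")

    # Group SNP column indices by gene, keeping first-seen gene order.
    columns = defaultdict(list)
    for j, gene in enumerate(snp_to_gene):
        if gene is not None:
            columns[gene].append(j)
    genes = list(columns)

    scores = []
    for row in genotypes:
        if len(row) != n_snps:
            raise ValueError("each genotype row must have one entry per SNP")
        # Weighted sum of dosages over the SNPs mapped to each gene.
        scores.append(
            [sum(snp_weights[j] * row[j] for j in columns[g]) for g in genes]
        )
    return genes, scores


if __name__ == "__main__":
    # Toy data: 2 animals, 4 SNPs; the third SNP maps to no gene.
    genes, scores = gene_level_scores(
        [[0, 1, 2, 1], [2, 2, 0, 0]],
        ["GENE_A", "GENE_A", None, "GENE_B"],
    )
    print(genes, scores)
```

The resulting per-gene score matrix could then be passed to any regression or mixed-model routine in place of the raw SNP matrix; how gene effects are actually parameterized and estimated in the project is not specified in the abstract.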
Data last updated: 2023-05-31
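Genome-wide association analysis of leaf orientation value in maize
Prediction of urban domestic water demand based on the LASSO-SVMR model
Research on a crime prediction algorithm based on multimodal information feature fusion
Quantitative ultrasonic imaging detection of cracks based on the multi-mode total focusing method
Classification and prediction of bus passenger flow with a multi-source data-driven CNN-GRU model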
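Development of a genome-wide association analysis model based on integrated omics prior knowledge
Establishing a new genomic selection method for Chinese Holstein cattle based on whole-genome sequence information
Development and application of a new method for assembling whole-genome sequences using gene products
Research on combining ability analysis methods based on whole genome prediction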