Regularization Methods for Biological Big Data Analysis and Their Applications

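Basic information

Grant number: 61672214

Project type: General Program

Funding amount: 63.00

Principal investigator: 廖波

Discipline classification:

Host institution: 海南师范大学 (Hainan Normal University)

Approval year: 2016

Completion year: 2020

Project period: 2017-01-01 - 2020-12-31

Project status: Completed

Project participants: 刘军万, 林红利, 马亿旿, 彭友松, 梁莹, 付祥政, 李维彪, 刘妮妮, 侯英惠

Keywords:

structured sparsity; manifold learning; regularization; cloud computing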

Final report abstract

With the development of high-throughput technologies, large amounts of biological data have become available, making the early diagnosis and treatment of complex diseases possible. However, improving the reproducibility of association studies of complex diseases, enhancing the interpretability of results, and making full use of the computational advantages of cloud platforms remain the main concerns for biological big data mining methods. In this study, data mining is carried out on an FPGA-based cloud platform and covers the analysis of CNV data, miRNA data, and protein-protein interaction data. The contributions of this study are as follows. First, to provide theoretical guidance for optimizing experiments, regularization is applied to identify the most representative sample data. Second, existing biological data expression patterns are combined to mine the essential features associated with complex diseases, and dimensionality reduction is achieved with a structured sparsity norm. Third, to make full use of the data, a manifold-learning-based regularization term is added to the nonnegative matrix factorization optimization problem for clustering, which further improves interpretability. Fourth, for classification tasks with small sample sizes, a non-parametric sparse-representation-based classifier is devised. Together, the theoretical models and their practical biological significance provide new insight into the pathogenesis of complex diseases and a molecular-level scientific basis for treatment and drug design.
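The abstract mentions dimensionality reduction with a structured sparsity norm but does not say which norm is used. As a minimal sketch, assuming the ℓ2,1 norm (a common choice for row-wise feature selection, not confirmed by the source), the snippet below computes the norm of a weight matrix and its proximal operator, which shrinks whole rows toward zero. The names `l21_norm` and `prox_l21` are illustrative only.

```python
import numpy as np


def l21_norm(W):
    """Sum of the Euclidean norms of the rows of W (the l2,1 norm)."""
    return float(np.sum(np.linalg.norm(W, axis=1)))


def prox_l21(W, t):
    """Proximal operator of t * ||W||_{2,1}: row-wise soft thresholding.

    Each row is scaled by max(0, 1 - t / ||row||_2), so rows whose norm
    is at most t become exactly zero (the feature is dropped).
    """
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows (assumed tolerance).
    scale = np.maximum(0.0, 1.0 - t / np.maximum(norms, 1e-12))
    return W * scale
```

In a proximal gradient scheme, `prox_l21` would be applied after each gradient step on the data-fitting term; rows that end up exactly zero correspond to features discarded by the model.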

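With the development of high-throughput chip technology, massive biological data have made the early diagnosis and treatment of complex diseases possible. However, how to improve the reproducibility of complex disease association studies, how to enhance the interpretability of research results, and how to make full use of the computational advantages of cloud platforms are key problems in biological big data analysis. This project proposes to carry out big data mining for complex diseases on a cloud platform based on FPGA technology. Taking the analysis of CNV data, miRNA data, and protein interaction data as its basis, it conducts association studies of complex diseases by building corresponding optimization models: using regularization to find the most representative sample data, which provides theoretical guidance for optimizing experiments; combining existing biological data expression patterns to mine the essential features related to complex diseases and using structured sparsity methods for dimensionality reduction; making full use of the distribution characteristics of the data to design a clustering method based on manifold regularization and nonnegative matrix factorization, which improves the interpretability of association studies; and proposing a parameter-free sparse representation classification method for small-sample data, which improves the scalability of classification. Combining theoretical models with practical biological significance provides a molecular-level scientific basis for revealing the mechanisms of onset and progression of complex diseases and for clinical diagnosis, treatment, and drug design.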
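The clustering method described above combines manifold regularization with nonnegative matrix factorization, but the abstract gives no formulation. The sketch below therefore assumes the widely used graph-regularized NMF objective, ||X - U Vᵀ||²_F + λ·Tr(Vᵀ L V) with graph Laplacian L = D - W, solved by multiplicative updates. The k-nearest-neighbour graph, the parameter names (`lam`, `rank`, `k`), and the function names are all assumptions for illustration, not the project's actual model.

```python
import numpy as np


def knn_affinity(X, k=5):
    """Symmetric 0/1 k-nearest-neighbour affinity between the columns (samples) of X."""
    n = X.shape[1]
    sq = np.sum(X ** 2, axis=0)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X.T @ X)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    return np.maximum(W, W.T)                         # symmetrize


def gnmf_objective(X, U, V, W, lam):
    """Reconstruction error plus the graph (manifold) smoothness penalty."""
    L = np.diag(W.sum(axis=1)) - W
    return np.linalg.norm(X - U @ V.T, "fro") ** 2 + lam * np.trace(V.T @ L @ V)


def gnmf(X, rank, W, lam=1.0, n_iter=200, eps=1e-10, seed=0):
    """Graph-regularized NMF by multiplicative updates (assumed formulation).

    X is features x samples and must be nonnegative; returns the basis U,
    the sample embedding V, and the objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    U = rng.random((m, rank)) + eps
    V = rng.random((n, rank)) + eps
    D = np.diag(W.sum(axis=1))
    history = []
    for _ in range(n_iter):
        U *= (X @ V) / (U @ (V.T @ V) + eps)
        V *= (X.T @ U + lam * (W @ V)) / (V @ (U.T @ U) + lam * (D @ V) + eps)
        history.append(gnmf_objective(X, U, V, W, lam))
    return U, V, history
```

A cluster label for each sample can then be read off as `V.argmax(axis=1)`, or `V` can be passed to a separate clustering step; the nonnegative factors are what make the result easier to interpret than an unconstrained embedding.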

Project abstract

Project outcomes

No outcomes of this type are available

Data last updated: 2023-05-31

Other related literature

DOI: 10.14050/j.cnki.1672-9250.2017.02.014

Publication year: 2017

DOI: 10.13199/j.cnki.cst.2020.07.010

Publication year: 2020

DOI: 10.7498/aps.70.20202116

Publication year: 2021

DOI: 10.12011/setp2020-2080

Publication year: 2022

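Other grants held by 廖波

Grant number: 51404210

Approval year: 2014

Funding amount: 25.00

Project type: Young Scientists Fund

Grant number: 60973082

Approval year: 2009

Funding amount: 29.00

Project type: General Program

Grant number: 81700893

Approval year: 2017

Funding amount: 21.00

Project type: Young Scientists Fund

Grant number: 60306013

Approval year: 2003

Funding amount: 7.00

Project type: Young Scientists Fund

Grant number: 61873076

Approval year: 2018

Funding amount: 66.00

Project type: General Program

Grant number: 11171369

Approval year: 2011

Funding amount: 40.00

Project type: General Program

Grant number: 11926412

Approval year: 2019

Funding amount: 20.00

Project type: Tianyuan Fund for Mathematics

Grant number: 61370171

Approval year: 2013

Funding amount: 79.00

Project type: General Program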

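Similar NSFC grants

Research on regularized statistical methods for high-dimensional big data

Grant number: 71701223

Approval year: 2017

Principal investigator: 杨虎

Discipline classification: G0105

Funding amount: 19.00

Project type: Young Scientists Fund

Theory and methods of big data analysis for management decision-making

Grant number: 92046021

Approval year: 2020

Principal investigator: 陈松蹊

Discipline classification: G0105

Funding amount: 130.00

Project type: Major Research Plan

Research on task scheduling optimization methods for big data analysis systems

Grant number: 61672215

Approval year: 2016

Principal investigator: 李智勇

Discipline classification: F06

Funding amount: 64.00

Project type: General Program

Big data analysis methods and decision support for complex intelligence

Grant number: U1435220

Approval year: 2014

Principal investigator: 胡晓惠

Discipline classification: F0607

Funding amount: 500.00

Project type: Joint Funds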

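Other related literature

Research on university computer teaching in a domestic-substitution environment

Zoning of small watersheds with rocky desertification in Guangxi counties based on comprehensive management and hydrological models

Construction roadmap and engineering practice of intelligent coal mines

Molecular dynamics simulation of the shear-thinning behavior of non-Newtonian fluids

Analysis of China's export economic gains and the foreign capital penetration rate of exports: a national income perspective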

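Other grants held by 廖波

Testing mechanism and technology of conductive composite films based on horizontal lateral pressure tests of vertical shaft linings in deep thick topsoil

Research on proteome information analysis and application algorithms

Role of the IL-37-Mex3B-TSLP axis in the pathogenesis of eosinophilic chronic rhinosinusitis with nasal polyps

Research on efficient site-specific MEMS thermoelectric cooling technology for systems-on-chip

Research on obesity-disease associations based on multi-source biological network fusion

Research on tumor gene expression profile data analysis and application algorithms

Symposium on big data and the dissemination of mathematical culture

Large-scale SNP data mining and its application to complex disease analysis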
