Dimensionality reduction (DR) is a crucial preprocessing step for many tasks. It can remove noise from the observed data, reduce the time and space costs of subsequent algorithms, and improve the performance of the final learning system. Thanks to the strong representation power of deep models, DR methods based on deep learning have outperformed those based on shallow models. However, conventional deep DR models are mostly built from multinomial logistic regression models, which have weak representation power, and they have many parameters, so they cannot be applied directly to small-sample data. To address this issue, this project studies kernel-based deep learning models for DR on small-sample data. The work is divided into two parts. From the model perspective, we first study deep learning models based on Gaussian processes and then extend them to a deep kernel machine framework using sparse kernel machines; the improved generalization ability is expected to yield better DR performance on small-sample data. From the data perspective, we combine semi-supervised and self-taught learning techniques with the deep kernel machine framework, so that knowledge transferred from other data helps solve the DR problem on small-sample data. The outcomes extend the existing deep learning framework and provide a series of DR models that handle small-sample data efficiently, so the project has both theoretical and practical value.
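Dimensionality reduction is widely used in data preprocessing: it removes noise from the raw data, lowers the time and space costs of subsequent algorithms, and improves the performance of the final learning system. Drawing on the strong representation power of deep models, DR models based on deep learning have shown better performance than shallow DR models. However, conventional deep DR models are mostly built from multinomial logistic regression models with weak expressive power and have a large number of parameters, so they cannot be applied directly to small-sample data. This project therefore studies deep kernel machine algorithms for dimensionality reduction. The research has two parts. At the model level, we first study deep learning algorithms based on Gaussian processes and then, building on sparse kernel machine algorithms, propose a deep kernel machine framework for dimensionality reduction; the improved generalization ability strengthens the model's performance on small-sample DR tasks. At the data level, we combine semi-supervised and self-taught learning techniques with the results of the model-level research and use external data to solve the small-sample DR problem. The results of the project provide effective solutions to the dimensionality reduction problem and extend the existing deep learning framework, and thus have significant theoretical and practical value.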
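The emergence of large volumes of high-dimensional data poses great challenges for data analysis and data mining. This project studied deep kernel machine algorithms for dimensionality reduction. With its support, we obtained results in DR theory and algorithms, which were validated and successfully applied to problems such as hyperspectral remote sensing image classification and history matching in reservoir simulation. The project team also explored new areas such as image super-resolution, laying a foundation for follow-up research. The main work of the project is as follows: (1) we proposed a supervised DR model that fuses Gaussian processes with deep autoencoders; (2) we proposed a shared kernel learning algorithm based on the shared latent variable model and, building on it, a class of algorithms that mix nonparametric models with parametric deep learning models; (3) for problems involving high-dimensional data, such as hyperspectral remote sensing image classification and history matching in reservoir simulation, we tested and applied the proposed algorithms and obtained breakthrough results; (4) we proposed a new spatially aware collaborative representation model and a smooth sparse representation model; (5) the concrete outputs of the project include training 1 doctoral student, 2 master's students, and 5 undergraduates, and publishing 5 papers related to the project, of which 4 are in international SCI-indexed journals and 1 is an EI-indexed international conference paper; a further 2 SCI journal papers and 1 EI conference paper are under submission.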
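The abstracts above do not specify the project's models in detail, so the following is only an illustrative sketch of the general idea of a kernel method used for dimensionality reduction when there are fewer samples than features. It uses plain kernel PCA with an RBF kernel as a stand-in technique; it is not the project's Gaussian-process or deep kernel machine model. The function names, the `gamma` value, and the synthetic data are all assumptions made for the example.

```python
import numpy as np


def rbf_kernel(X, gamma):
    """Pairwise RBF kernel matrix exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    return np.exp(-gamma * np.maximum(d2, 0.0))


def kernel_pca(X, n_components=2, gamma=1.0):
    """Project the training samples onto the leading kernel principal components."""
    n = X.shape[0]
    K = rbf_kernel(X, gamma)
    # Center the kernel matrix in feature space.
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one
    # eigh returns eigenvalues in ascending order; keep the largest ones.
    eigvals, eigvecs = np.linalg.eigh(Kc)
    idx = np.argsort(eigvals)[::-1][:n_components]
    lam = np.maximum(eigvals[idx], 1e-12)
    # Training-sample projections are eigenvectors scaled by sqrt(eigenvalue).
    return eigvecs[:, idx] * np.sqrt(lam)


# Hypothetical small-sample setting: 30 samples, 100 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(30, 100))
Z = kernel_pca(X, n_components=2, gamma=0.01)
```

The kernel matrix here is only 30 x 30 regardless of the feature dimension, which is one reason kernel methods are often considered for small-sample, high-dimensional data.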
Data last updated: 2023-05-31
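On the Impact of the Big Data Environment on the Development of Information Science
High-Resolution Three-Dimensional Imaging for Wideband MIMO Radar Based on Kronecker Compressed Sensing
Small UAV Remote Sensing Image Registration with Inlier Maximization and Redundant Point Control
Theme Park Evaluation Based on Public Sentiment Orientation: A Case Study of Volga Manor in Harbin
Fatigue Performance and Improvement Methods of Bolted U-Rib Steel Box Girders Considering Butt-Joint Misalignment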
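Research on Nonlinear Dimensionality Reduction Algorithms Based on Manifold Learning for Complex Data
Research on Dimensionality Reduction Algorithms Based on Probabilistic Graphical Models
Statistical Dimensionality Reduction and Algorithms for High-Dimensional Data in Biometric Recognition
Research on Semi-Supervised Clustering Methods for Ensemble Dimensionality Reduction of High-Dimensional Data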