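Multi-Manifold Learning Based on the Similarities Between Neighboring Local Tangent Spaces (基于邻近局部切空间相似性的多流形学习研究)

Basic Information

Grant number: 61202285

Project category: Young Scientists Fund

Funding amount: 22.00

Principal investigator: 邵超

Discipline classification:

Host institution: 河南财经政法大学

Approval year: 2012

Completion year: 2015

Project period: 2013-01-01 - 2015-12-31

Project status: Completed

Project participants: 张斌, 张啸剑, 郑娅峰, 万春红, 刘永超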

Keywords: spectral clustering; model selection criterion; local tangent space; robustness; multi-manifold learning

Final Report Abstract

To learn the intrinsic low-dimensional geometric structure of high-dimensional multi-manifold datasets, researchers have proposed both supervised and unsupervised multi-manifold learning algorithms. These algorithms classify or cluster multiple manifolds fairly effectively, but they remain limited in several respects: they may fail to display the intrinsic geometric structure of each manifold, their robustness and generalization are relatively poor, and they may not partition all the manifolds precisely, so they cannot reliably determine the number of manifolds in the data. To address these problems, and relying on the locally Euclidean nature of manifolds, this project studies supervised and unsupervised multi-manifold learning algorithms based on the similarities between neighboring local tangent spaces. The work covers four topics: a shortest-path algorithm suited to multi-manifold structure; generalization based on the similarities between neighboring local tangent spaces; robustness based on the mean (expected) distance; and a spectral clustering algorithm based on the similarities between neighboring local tangent spaces.

On the one hand, computing shortest paths suited to the multi-manifold structure allows the intrinsic geometric structure of each manifold to be displayed in the low-dimensional embedding space, and using mean distances between data points, which suppress noise to a certain extent, improves robustness. On the other hand, the similarities between neighboring local tangent spaces make it possible to determine more precisely which manifold a new data point belongs to, which improves generalization, and to construct a suitable neighborhood graph, so that all the manifolds can be partitioned precisely and their number determined effectively. The expected results should further advance both the theory and the applications of multi-manifold learning.
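The abstract rests on comparing the local tangent spaces of neighboring points. The sketch below is only an illustration of that idea, not the project's algorithm: it estimates each tangent space by PCA on a point's nearest neighbors and scores two tangent spaces by the product of the cosines of their principal angles. The function names `local_tangent_basis` and `tangent_similarity`, the parameters `k` (neighborhood size) and `d` (intrinsic dimension), and the choice of similarity score are all assumptions made for this example.

```python
import numpy as np

def local_tangent_basis(X, i, k, d):
    """Estimate an orthonormal basis (D x d) of the tangent space at X[i].

    Assumption: the tangent space is approximated by PCA on the point
    and its k nearest neighbors (Euclidean distance).
    """
    dists = np.linalg.norm(X - X[i], axis=1)
    nbrs = np.argsort(dists)[:k + 1]          # the point itself plus k neighbors
    centered = X[nbrs] - X[nbrs].mean(axis=0)
    # Rows of vt are the principal directions of the centered neighborhood.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:d].T

def tangent_similarity(B1, B2):
    """Similarity in [0, 1] between two subspaces given by orthonormal bases.

    Assumption: similarity is the product of the cosines of the principal
    angles (1 for identical subspaces, 0 if any direction is orthogonal).
    """
    cosines = np.linalg.svd(B1.T @ B2, compute_uv=False)
    return float(np.prod(np.clip(cosines, 0.0, 1.0)))
```

On two well-separated, perpendicular line segments in the plane, this score should come out close to 1 for two points on the same segment and close to 0 for points on different segments, which is the kind of signal that could be used to decide whether neighboring points lie on the same manifold.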

Project Abstract

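With the arrival of the big-data era, dimensionality reduction and data visualization have become increasingly important. In recent years it has been found that many real-world high-dimensional datasets may be sampled from multiple low-dimensional nonlinear manifolds, and manifold learning algorithms have accordingly been extended to multi-manifold learning algorithms.

Most current supervised multi-manifold learning algorithms adjust the pairwise distances between data points according to their class labels. They classify multi-manifold data fairly effectively, but they distort the intrinsic geometric structure of some manifolds, and their robustness and generalization are relatively poor. To address this, the project proposed a supervised multi-manifold learning algorithm based on isometric mapping (Isomap). The algorithm uses a shortest-path algorithm suited to multiple manifolds to obtain shortest-path distances that still correctly approximate the corresponding geodesic distances in the multi-manifold setting, so it can display the intrinsic geometric structure of each manifold and is relatively robust. In addition, it uses the similarity between neighboring local tangent spaces on the same manifold to determine accurately which manifold a new data point lies on, which gives it strong generalization ability.

The kernel Isomap algorithm is well known to generalize well, but it cannot be applied directly to the classification of multi-manifold data. The project therefore proposed a kernel Isomap algorithm for multi-manifold classification. By computing the shortest-path distances between multi-manifold data points and using the similarity between neighboring local tangent spaces on the same manifold, it can not only determine accurately which manifold a new data point lies on but also compute that point's low-dimensional embedding fairly accurately. The algorithm can thus classify multi-manifold data while retaining its good generalization ability.

Traditional manifold learning algorithms are sensitive both to the neighborhood size parameter and to noise in the data, which makes them insufficiently robust for real high-dimensional data. The project therefore proposed an incremental method for selecting the neighborhood size parameter, based on weighted principal component analysis (PCA) and the Bayesian information criterion (BIC). Relying on the locally Euclidean nature of manifolds, the method uses BIC to detect the number of clusters formed by the reconstruction errors of all neighborhoods in the neighborhood graph (each error is obtained by weighted PCA and serves as a measure of the linearity of its neighborhood), and can thereby select a suitable neighborhood size incrementally. The method needs no additional parameters and does not require running a time-consuming manifold learning algorithm, so it is efficient. The project also used the relatively robust self-organizing map (SOM) for manifold learning and visualization: the selection of the winning neuron and the learning rule are localized according to the locally Euclidean nature of the manifold, and the network grows in step with the training samples. This yields good learning and visualization results and lower sensitivity to the neighborhood size parameter and to noise.

The main reason traditional manifold learning algorithms lack robustness is that their neighborhood graphs are built from Euclidean distance, which is only a linear measure. The project therefore used the commute time distance, which is more robust and reflects the nonlinear geometric structure of the data, to build neighborhood graphs that are dense yet unlikely to contain "short-circuit" edges. This displays the intrinsic geometric structure of each manifold while also giving better clustering results and higher robustness.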
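For readers unfamiliar with the commute time distance, the following is a minimal sketch of the standard textbook computation on a weighted graph, using the pseudoinverse of the graph Laplacian. It is not taken from the project: the function name `commute_time_distances` is hypothetical, the graph is assumed to be connected and undirected, and how the project turns these distances into a neighborhood graph is not specified in the abstract.

```python
import numpy as np

def commute_time_distances(W):
    """Pairwise commute time distances for a connected, undirected graph.

    W is a symmetric (n x n) matrix of non-negative edge weights.
    Uses the identity CT(i, j) = vol(G) * (L+_ii + L+_jj - 2 * L+_ij),
    where L+ is the pseudoinverse of the Laplacian L = D - W and
    vol(G) is the sum of all vertex degrees.
    """
    W = np.asarray(W, dtype=float)
    degrees = W.sum(axis=1)
    laplacian = np.diag(degrees) - W
    l_pinv = np.linalg.pinv(laplacian)
    diag = np.diag(l_pinv)
    return degrees.sum() * (diag[:, None] + diag[None, :] - 2.0 * l_pinv)
```

Because the commute time averages over all paths between two vertices rather than following only the shortest one, a single spurious edge has less influence on it than on a shortest-path distance, which is consistent with the robustness argument made in the abstract.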

Project Outputs

No outputs are listed for this project.

Data last updated: 2023-05-31

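Similar NSFC Grants

Manifold learning and retrieval of multimodal Web images based on a Riemannian space model (基于黎曼空间模型的多模态Web图像流形学习及检索研究)

Grant number: 61170093

Approval year: 2011

Principal investigator: 何儒汉

Discipline classification: F0211

Funding amount: 45.00

Project category: General Program

Data-driven active learning control based on subspace feature similarity (基于子空间特征相似性的数据驱动主动学习控制)

Grant number: 61873139

Approval year: 2018

Principal investigator: 池荣虎

Discipline classification: F0301

Funding amount: 66.00

Project category: General Program

Multi-view gait recognition based on multi-manifold metric learning (基于多流形度量学习的多视角步态识别研究)

Grant number: 61573114

Approval year: 2015

Principal investigator: 王科俊

Discipline classification: F0605

Funding amount: 64.00

Project category: General Program

Complex submanifolds of locally Hermitian symmetric spaces (局部 Hermite 对称空间的复子流形)

Grant number: 11501205

Approval year: 2015

Principal investigator: 吴瑞聪

Discipline classification: A0107

Funding amount: 18.00

Project category: Young Scientists Fund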
