Multi-instance multi-label learning is a recently proposed machine learning framework for multi-semantic (ambiguous) data. Because it makes it possible to explain why a sample carries particular class labels, the framework is attracting growing attention. The Gaussian process model is a kernel method that is easy to implement and adaptively discovers relationships among variables. This project aims to develop a multi-instance multi-label learning algorithm, based on the Gaussian process model, for large-scale, incompletely annotated multi-semantic data. The work has three parts. First, a Gaussian process model with a new structure is designed to describe the relationship between instances and labels and the relationship among labels at the same time. Second, a solution method with lower computational cost is built on stochastic variational inference so that the model can handle large-scale training data. Third, a two-step strategy based on ideas from positive and unlabeled (PU) learning is developed to make effective use of incompletely annotated data. Using the Gaussian process model in this way addresses the joint modeling of instance-label and label-label relationships, which is the key problem in constructing a multi-instance multi-label learning algorithm, and also addresses the difficulty kernel methods have with large-scale training data because of their high computational complexity. The project is expected to promote the application of multi-instance multi-label learning to big data.
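With the arrival of the big data era and the wide application of artificial intelligence, every field and industry treats data as a strategic asset to be collected, stored, and analyzed, and multiple semantics, large scale, and weak labeling have become common characteristics of data. Multi-instance multi-label learning is a recently proposed machine learning framework for multi-semantic data; because it makes it feasible to uncover the relationships that drive a sample's class labels, it is attracting growing attention. The Gaussian process model is a kernel method that is easy to implement and can adaptively mine relational information. This project used the Gaussian process model to study the construction of multi-instance multi-label learning algorithms for large-scale, incompletely annotated multi-semantic data. First, a Gaussian process model with a new structure was designed to capture both the instance-label relationships and the label-label relationships. Then a new solution method for the Gaussian process model was built on an inducing-variable strategy, the Laplace approximation of the posterior, and sparse embedding techniques, and on this basis a kernel multi-instance multi-label learning algorithm for large-scale data was established. Finally, drawing on the idea of self-paced learning, the algorithm was further extended with a weight-adjustment strategy, yielding the final multi-instance multi-label learning algorithm for large-scale, incompletely annotated multi-semantic data. The results can be used for mining multi-semantic, large-scale, weakly supervised data and have important theoretical significance and practical value.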
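To make the data setting concrete, the following is a minimal sketch of how multi-instance multi-label data with incomplete annotation might be represented, together with a simple bag-level kernel of the kind a kernel method such as a Gaussian process could use. It is an illustration only and not the model developed in this project: the names `rbf` and `bag_kernel`, the toy data, and the choice of an averaged RBF (set) kernel are all assumptions made for this example.

```python
import numpy as np


def rbf(a, b, gamma=0.5):
    """RBF kernel between every row of a and every row of b."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-gamma * np.maximum(sq, 0.0))


def bag_kernel(bags, gamma=0.5):
    """Hypothetical bag-level kernel: the mean instance-level RBF value
    over all instance pairs drawn from two bags (an averaged set kernel)."""
    n = len(bags)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(i, n):
            K[i, j] = K[j, i] = rbf(bags[i], bags[j], gamma).mean()
    return K


# Toy data (assumed): three samples, each a bag with a different number
# of 4-dimensional instances.
rng = np.random.default_rng(0)
bags = [rng.normal(size=(m, 4)) for m in (3, 5, 2)]

# Incomplete annotation: Y[i, l] = 1 means label l was annotated for bag i;
# 0 means "not annotated", which may hide a true positive. This is the
# positive-and-unlabeled reading of the label matrix.
Y = np.array([[1, 0, 1],
              [0, 1, 0],
              [1, 0, 0]])

K = bag_kernel(bags)  # 3 x 3 Gram matrix over bags
```

Here each sample is a bag of instances rather than a single feature vector, and a zero in `Y` is treated as unknown rather than negative. A Gram matrix such as `K` costs time quadratic in the number of bags to build, which is one reason the large-scale setting calls for cheaper approximations such as inducing variables.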
Data last updated: 2023-05-31