Rough set theory, a popular paradigm of granular computing, is an effective method for dealing with uncertain, imprecise, or incomplete data, whether labeled or unlabeled. However, in many practical applications such as information retrieval, spam filtering, and image classification, the problem often involves both labeled and unlabeled data, referred to as partially labeled data. Labeled data are expensive to obtain because labeling examples requires considerable human effort, whereas unlabeled data are often cheap and readily available. In such situations, traditional rough set methods may not be applicable because labeled data are scarce. It is therefore desirable to develop a novel approach for partially labeled data. Based on the theory and methods of rough sets, drawing on multi-view learning techniques, and taking fabric defect classification as a practical application for verification, this project focuses on basic models, efficient algorithms, and application verification for partially labeled data. First, the uncertainty representation of partially labeled data is investigated, and a generalized objective function for attribute reduction and optimized attribute reduction algorithms are designed. Second, by studying the theory of multi-granularity subspace learning, a fundamental model with diverse multi-granularity subspaces is developed, and a basic framework for knowledge acquisition from partially labeled data is formed. Finally, building on these results, a verification system for fabric defect classification is built to demonstrate the correctness and effectiveness of the proposed model and algorithms. In summary, this research will enrich the theories of rough sets and granular computing as well as their practical application.
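Rough set theory is one of the main models in granular computing research, aimed chiefly at knowledge acquisition from labeled or unlabeled data that are uncertain, imprecise, or incomplete. However, real-world problems such as information retrieval, spam filtering, and image classification often contain both labeled and unlabeled data (that is, partially labeled data). Based on rough set theory, using multi-view learning as the technical means, and taking fabric defect image classification as the application for verification, this project studies knowledge acquisition methods for partially labeled data at three levels: basic models, optimization algorithms, and application verification. First, uncertainty representation methods for partially labeled data are studied, generalized attribute reduction objective functions are designed from the algebraic and information-theoretic viewpoints, and efficient optimized reduction algorithms are constructed. Second, the theory of multi-granularity space learning for partially labeled data is studied, a multi-granularity collaborative learning model over diverse reduct subspaces is established, and a basic framework for knowledge acquisition from partially labeled data is formed. Finally, building on these results, a prototype system for fabric defect image classification is developed to demonstrate empirically the effectiveness of the basic models and core algorithms proposed in this project. The research will not only enrich the theoretical system of knowledge acquisition in rough sets and granular computing, but also promote the practical application of these theories and methods.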
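The rough-set notions the abstract builds on can be illustrated with a minimal Python sketch. It computes the lower and upper approximations of a decision class from the labeled part of a partially labeled table. The attribute names (`texture`, `tone`), the toy rows, and the helper functions are hypothetical and are not taken from the project; the sketch also simply sets the unlabeled row aside, which is exactly the limitation of classical rough sets that the project aims to overcome.

```python
from collections import defaultdict

def partition(rows, attrs):
    """Group row indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for i, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(i)
    return list(blocks.values())

def approximations(rows, attrs, target):
    """Return the (lower, upper) approximations of the index set target."""
    lower, upper = set(), set()
    for block in partition(rows, attrs):
        b = set(block)
        if b <= target:   # block lies entirely inside the target class
            lower |= b
        if b & target:    # block overlaps the target class
            upper |= b
    return lower, upper

# Hypothetical toy table; label None marks an unlabeled object.
rows = [
    {"texture": "coarse", "tone": "dark", "label": "defect"},
    {"texture": "coarse", "tone": "dark", "label": "ok"},
    {"texture": "fine", "tone": "dark", "label": "defect"},
    {"texture": "fine", "tone": "light", "label": "ok"},
    {"texture": "fine", "tone": "light", "label": None},
]
labeled = [r for r in rows if r["label"] is not None]
defect = {i for i, r in enumerate(labeled) if r["label"] == "defect"}
lower, upper = approximations(labeled, ["texture", "tone"], defect)
boundary = upper - lower  # objects that the attributes cannot classify with certainty
```

In this toy table the two `coarse`/`dark` rows carry conflicting labels, so they fall in the boundary region rather than in the lower approximation.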
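Granular computing is an effective method for handling vague, uncertain, or imprecise data. However, existing uncertainty representations and measures in granular computing mainly target either labeled or unlabeled data and cannot be applied directly to partially labeled data. Based on rough set theory, using multi-view learning as the technical means, and taking fabric defect classification in the textile industry as the verification platform, this project systematically studied uncertainty measurement, attribute reduction, and classification learning for partially labeled data. For multi-granularity representation and uncertainty measurement of partially labeled data, the project drew on the uncertainty theories of rough sets and fuzzy sets and proposed uncertainty measures including granular maximum decision entropy, approximation entropy, and a belief discernibility matrix. For attribute reduction and its optimization, the project proposed a granular conditional entropy that uses proxy labels and a structure-preserving attribute reduction method that fuses global and local information, and established an attribute reduction objective function for partially labeled data, achieving effective attribute reduction. For classification learning, following the idea of multi-view learning, the project analyzed the redundancy and diversity of rough subspaces of partially labeled data and constructed a three-way collaborative classification model and algorithm that can exploit unlabeled data to improve classification performance. In addition, the project built a fabric defect image database and a verification research platform, and applied the granular computing knowledge acquisition methods for partially labeled data to fabric defect classification. This both verified the effectiveness of the proposed theories, models, and algorithms and provided approaches and methods for solving the fabric defect classification problem. The research not only enriched the theoretical system of knowledge acquisition in granular computing, but also contributed to some extent to theoretical research on weakly supervised learning and to its practical application.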
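To make the idea of entropy-driven attribute reduction concrete, here is a minimal Python sketch of greedy forward selection guided by the standard Shannon conditional entropy H(label | attributes). It is not the project's granular conditional entropy with proxy labels: it runs on fully labeled rows only, and the attribute names `a`, `b`, `c`, the toy data, and both function names are hypothetical. Forward selection of this kind may also return a superset of a minimal reduct.

```python
import math
from collections import Counter, defaultdict

def conditional_entropy(rows, attrs, label="label"):
    """H(label | attrs) in bits, estimated from the given rows."""
    blocks = defaultdict(list)
    for row in rows:
        blocks[tuple(row[a] for a in attrs)].append(row[label])
    n = len(rows)
    h = 0.0
    for labels in blocks.values():
        weight = len(labels) / n  # probability of this equivalence class
        for count in Counter(labels).values():
            p = count / len(labels)
            h -= weight * p * math.log2(p)
    return h

def greedy_reduct(rows, attrs, label="label", tol=1e-12):
    """Add attributes one at a time until H(label | chosen) matches H(label | attrs)."""
    target = conditional_entropy(rows, attrs, label)
    chosen = []
    while conditional_entropy(rows, chosen, label) > target + tol:
        best = min(
            (a for a in attrs if a not in chosen),
            key=lambda a: conditional_entropy(rows, chosen + [a], label),
        )
        chosen.append(best)
    return chosen

# Hypothetical toy data in which the label is determined by attribute "a" alone.
rows = [
    {"a": 0, "b": 0, "c": 1, "label": "ok"},
    {"a": 0, "b": 1, "c": 0, "label": "ok"},
    {"a": 1, "b": 0, "c": 0, "label": "defect"},
    {"a": 1, "b": 1, "c": 1, "label": "defect"},
]
reduct = greedy_reduct(rows, ["a", "b", "c"])
```

A method for partially labeled data would replace the entropy term with a measure that also accounts for unlabeled objects, for example through proxy labels as the summary describes.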
Data last updated: 2023-05-31
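Eddy covariance technique and its application in flux research of terrestrial ecosystems
Nonlinear analysis and calculation method for the coefficient of earth pressure at rest of coarse-grained soil
The impact of environmental NIMBY facilities on housing prices in Beijing: the case of large waste treatment facilities
Analysis of the environmental effects of China's participation in global value chains
Crime prediction algorithm based on multimodal information feature fusion
Granular computing methods and models for complex-type data and their multi-attribute group decision analysis
Granulation methods and models for network-structured data and their applications
Granular computing methods based on multidimensional data models
Internet of Things perception models and computing methods under weakly usable data conditions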