Theory and Methods of Distributed Data Mining Based on Semantic Distance

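Basic information

Grant No.: 71271076
Project category: General Program
Funding amount: 55.00
Principal investigator: Liu Bin (刘滨)
Discipline classification:
Host institution: Hebei University of Science and Technology
Approval year: 2012
Completion year: 2016
Project period: 2013-01-01 to 2016-12-31
Project status: Completed
Project participants: Zhang Xiaoming (张晓明), Hu Xiaolin (胡晓林), Liu Ziyu (刘紫玉), Li Dongmei (李冬梅), Xu Yunfeng (许云峰), Liu Tao (刘涛), Guo Lijuan (郭丽娟), Zhu Xingwei (朱星玮), Ma Shaomei (马韶梅)
Keywords: semantic distance; ontology; distributed data mining; compound quantization; intelligent computing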

Project abstract

This project takes personalized recommendation and sales forecasting in e-commerce as its engineering background and target application. It addresses the construction and solution of distributed data mining (DDM) models, starting from the aim of eliminating the quality risks that arise when semantically partitioned data sources are mined independently. Multiple ontology matching strategies will be combined to build a compound quantization system that measures the semantic distance between data source ontologies from the local to the global level. This system will expose the essential semantic differences between data sources, which will then be used to group the sources and set up a hierarchical mining architecture. Next, the project will develop a quality inspection method for hierarchically filtered intermediate results, a knowledge integration model, and a load balancing mechanism that treats each layer as the resource unit. From a structural point of view, a workable hierarchical DDM model will then be proposed that emphasizes quality while also taking efficiency into account. To solve the DDM model, an intelligent computing method that integrates multiple algorithms (neural networks, genetic algorithms, etc.) will be designed, and a web service library and an agent-led service composition model will be built. A multi-agent working mechanism will be designed on the JAFMAS framework. A human-computer interaction mechanism that increases user participation will also be built to improve semantic understanding of the DDM process and its results. Finally, the validity of the model and algorithms will be verified with specific cases. The research is expected to enrich and improve DDM theory and methods, and it has broad application prospects in fields such as e-commerce personalized recommendation and sales forecasting.
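The idea of combining several matching strategies into one compound distance can be illustrated with a minimal sketch. It is not the project's actual method: the `Ontology` structure, the two strategies (lexical and structural Jaccard distance), the equal weights, and the sample data are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ontology:
    """Toy stand-in for a data-source ontology (hypothetical structure)."""
    concepts: frozenset  # concept labels
    edges: frozenset     # (parent, child) pairs between concepts


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def compound_distance(x, y, w_lexical=0.5, w_structural=0.5):
    """Weighted mean of a lexical and a structural distance.

    The weights are illustrative assumptions, not values from the project.
    """
    lexical = jaccard_distance(x.concepts, y.concepts)
    structural = jaccard_distance(x.edges, y.edges)
    total = w_lexical + w_structural
    return (w_lexical * lexical + w_structural * structural) / total


# Hypothetical ontologies for two e-commerce data sources.
shop_a = Ontology(
    concepts=frozenset({"product", "price", "customer"}),
    edges=frozenset({("product", "price")}),
)
shop_b = Ontology(
    concepts=frozenset({"product", "price", "order"}),
    edges=frozenset({("product", "price"), ("order", "product")}),
)

distance_ab = compound_distance(shop_a, shop_b)
```

A real ontology matcher would replace the two Jaccard terms with richer strategies (string similarity, instance overlap, reasoning-based alignment); the weighted combination step would keep the same shape.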

Completion summary

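Using an interdisciplinary approach, this project took the measurement of semantic distance between data sources as its point of entry and, building on data mining and ontology theory, constructed and solved a new data mining method. First, it addressed the construction of modular ontologies and their co-evolution, using ontologies to describe the semantic features of data sources. Next, based on ontology matching techniques, it built a compound quantization system for the semantic distance between data sources, grouped the data sources according to the measured distances, and then built, in turn, a hierarchical mining model, a knowledge integration model, and a load balancing mechanism. Finally, it produced a workable distributed data mining method and carried out experimental validation and simulation analysis on specific cases.

The project has completed the research content set out in its project plan. It published 20 related papers in major domestic and international journals and international conferences, including Knowledge-Based Systems, Advanced Engineering Informatics, Biotechnology: An Indian Journal, ICIC Express Letters Part B: Applications, the International Conference on Machine Learning and Cybernetics, the International Conference on Network and Information Systems for Computers, and the International Industrial Informatics and Computer Engineering Conference. Of these papers, 2 are indexed by SCI, 11 by EI, and 1 by ISTP.

With the project's support, the team began establishing the Big Data and Social Computing Research Center at Hebei University of Science and Technology in 2013, and in 2015 the center was selected for the Hebei provincial government's cloud computing innovation capability enhancement program. The team actively promoted the transfer of its research results and developed 慧瞳世纪, a public showcase platform for the project's results that uses data visualization to give the public free access to data mining results in social network mining, news mining, public opinion analysis, log mining, GIS, and other areas. These results have also been promoted and applied in the team's cooperative projects with 河北慧聪, 河北广联, the Hebei Provincial Air Pollution Prevention and Control Technology Research and Promotion Center, the Hebei Provincial Expressway Administration, Xinhuanet's Hebei branch, 中科软, and other organizations. Two teachers in the project group were selected for the Hebei Province "333 Talent" program in 2015. The project trained 10 graduate students (9 graduated, 1 still enrolled), including 1 outstanding master's graduate who received a second-class National Scholarship for Graduate Students, as well as 2 young teachers.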
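Grouping data sources by their measured distances can be sketched as follows. This is only an illustration under stated assumptions: the single-linkage rule, the threshold value, the source names, and the pairwise distances are hypothetical and are not taken from the project.

```python
def group_sources(names, distance, threshold):
    """Single-linkage grouping: sources at most `threshold` apart share a group.

    `distance(a, b)` is any symmetric distance function supplied by the caller.
    """
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the group representative (with path halving).
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distance(a, b) <= threshold:
                parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


# Hypothetical pairwise semantic distances between three data sources.
pairwise = {
    frozenset({"shop_a", "shop_b"}): 0.2,
    frozenset({"shop_a", "forum_c"}): 0.9,
    frozenset({"shop_b", "forum_c"}): 0.8,
}

groups = group_sources(
    ["shop_a", "shop_b", "forum_c"],
    lambda a, b: pairwise[frozenset({a, b})],
    threshold=0.5,
)
```

Each resulting group could then be mined as one unit at the lowest layer of a hierarchy, with the per-group results integrated at higher layers; how the project itself defines layers and integration is not specified in this record.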

Project outputs

No outputs of this type are listed.

Data last updated: 2023-05-31

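Other related literature

Preparation and electrochemical properties of the composite cathode material LaBiMn2O6-Sm0.2Ce0.8O1.9 for intermediate-temperature solid oxide fuel cells
DOI: 10.11862/CJIC.2019.081
Published: 2019

Multimodal imaging and molecular imaging assessment of immunotherapy for colorectal cancer
DOI: 10.13609/j.cnki.1000-0313.2022.04.019
Published: 2022

Roadmap and engineering practice for intelligent coal mine construction
DOI: 10.13199/j.cnki.cst.2020.07.010
Published: 2020

Technological innovation project selection based on interactive group decision-making with intuitionistic fuzzy 2-tuple linguistic information
DOI: 10.12005/orms.2019.0029
Published: 2019

Near-infrared light-responsive liquid crystal elastomers
DOI: 10.7536/pc200335
Published: 2020

Other grants listed under Liu Bin (刘滨)

Mechanism by which the transcription factor HMGA2 mediates the effect of NR2F2 on SHH-type medulloblastoma
Grant No.: 81902531
Approval year: 2019
Funding amount: 21.00
Project category: Young Scientists Fund

Protein remote homology detection and fold recognition based on semantic analysis techniques from natural language processing
Grant No.: 61672184
Approval year: 2016
Funding amount: 62.00
Project category: General Program

Degradation and instability mechanism of deep gas-rich coal and rock masses under mining-induced pressure relief, and simulation methods
Grant No.: 51474205
Approval year: 2014
Funding amount: 85.00
Project category: General Program

Implementation performance and optimization of agricultural subsidy policies: the perspective of farm households with different resource endowments
Grant No.: 71303100
Approval year: 2013
Funding amount: 22.00
Project category: Young Scientists Fund

Implementation performance and optimization paths of compensation policies for ecological public-welfare forests: the forest farmers' perspective
Grant No.: 71563016
Approval year: 2015
Funding amount: 30.00
Project category: Regional Science Fund

Mechanism by which the hypothalamic neuropeptide orexin and its receptors regulate the transition from endogenous to exogenous nutrition in turbot larvae
Grant No.: 30901111
Approval year: 2009
Funding amount: 18.00
Project category: Young Scientists Fund

Quantitative rockburst prediction method based on the principle of minimum power consumption
Grant No.: 41102198
Approval year: 2011
Funding amount: 25.00
Project category: Young Scientists Fund

Protein remote homology detection based on evolutionary information from sequence profiles
Grant No.: 61300112
Approval year: 2013
Funding amount: 23.00
Project category: Young Scientists Fund

Similar NSFC grants

Mining models for distributed XML data streams and concept drift mining methods based on ensemble learning
Grant No.: 61773415
Approval year: 2017
Principal investigator: Mao Guojun (毛国君)
Discipline classification: F0603
Funding amount: 64.00
Project category: General Program

Semantic annotation and semantic pattern mining algorithms for trajectory big data
Grant No.: 61773331
Approval year: 2017
Principal investigator: Yu Yanwei (于彦伟)
Discipline classification: F0603
Funding amount: 65.00
Project category: General Program

Convex programming theory and methods in data mining
Grant No.: 10601064
Approval year: 2006
Principal investigator: Tian Yingjie (田英杰)
Discipline classification: A0405
Funding amount: 16.00
Project category: Young Scientists Fund

Theory and methods of hybrid data mining based on the homomorphism theory of information systems
Grant No.: 61070242
Approval year: 2010
Principal investigator: Wang Changzhong (王长忠)
Discipline classification: F0607
Funding amount: 31.00
Project category: General Program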