This project takes personalized recommendation and sales forecasting in e-commerce as its engineering background and target application, and addresses the construction and solution of distributed data mining (DDM) models. It starts from the goal of eliminating the quality risks that arise when data sources are mined independently and their semantics are thereby split apart. Multiple ontology matching strategies will be combined to build a composite quantification scheme that measures the semantic distance between data source ontologies from the local level to the global level. This scheme exposes the essential semantic differences between data sources, and a hierarchical mining architecture that groups the data sources will be built on that basis. Next, the project will develop a quality inspection method for hierarchically filtered intermediate results, a knowledge integration model, and a load balancing mechanism that treats each layer as the resource unit. From a structural point of view, it will then construct a workable hierarchical DDM model that emphasizes quality while also taking efficiency into account. To solve the model, an intelligent computing method that integrates multiple algorithms (neural networks, genetic algorithms, etc.) will be designed, and a Web service library and an agent-led service composition model will be built. A multi-agent working mechanism will be designed on the JAFMAS framework, and a human-computer interaction mechanism will be established that raises user participation and strengthens semantic understanding of the mining process and its results. Finally, the validity of the model and algorithms will be verified on specific cases. The research will enrich and improve DDM theory and methods, and has broad application prospects in fields such as e-commerce personalized recommendation and sales forecasting.
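The idea of measuring semantic distance with several matching strategies and then grouping data sources by the result can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the project's actual method: the two strategies (Jaccard distance over concept labels and over parent-child edges), the equal weights, the threshold of 0.5, the single-linkage grouping, and all names such as `composite_distance`, `group_sources`, and the toy data sources.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """Return 1 - |a ∩ b| / |a ∪ b|; two empty sets count as identical."""
    a, b = set(a), set(b)
    union = a | b
    return 0.0 if not union else 1.0 - len(a & b) / len(union)


def composite_distance(src_a, src_b, weights=(0.5, 0.5)):
    """Weighted mix of two hypothetical matching strategies:
    a lexical one (concept labels) and a structural one (parent-child edges)."""
    w_label, w_edge = weights
    return (w_label * jaccard_distance(src_a["concepts"], src_b["concepts"])
            + w_edge * jaccard_distance(src_a["edges"], src_b["edges"]))


def group_sources(sources, threshold=0.5):
    """Single-linkage grouping: sources closer than `threshold` share a group."""
    parent = {name: name for name in sources}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(sources, 2):
        if composite_distance(sources[a], sources[b]) < threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in sources:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Toy ontologies for three hypothetical data sources.
sources = {
    "shop_a": {"concepts": {"product", "price", "customer"},
               "edges": {("product", "price")}},
    "shop_b": {"concepts": {"product", "price", "buyer"},
               "edges": {("product", "price")}},
    "logs":   {"concepts": {"session", "click"},
               "edges": {("session", "click")}},
}

print(group_sources(sources))  # [['logs'], ['shop_a', 'shop_b']]
```

In this toy setup the two shop sources share most concepts and all edges, so they fall into one group that could be mined together at a lower layer, while the log source stays in its own group. A real system would replace the Jaccard measures with proper ontology matchers and tune the weights and threshold.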
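This project took an interdisciplinary approach. Taking the measurement of semantic distance between data sources as its entry point and building on data mining and ontology theory, it constructed and solved a new data mining method. First, it addressed the construction and co-evolution of modular ontologies and used ontologies to describe the semantic features of data sources. Next, it used ontology matching techniques to build a composite quantification scheme for the semantic distance between data sources, grouped the data sources according to the measured distances, and then built, in turn, a hierarchical mining model, a knowledge integration model, and a load balancing mechanism. The result is a workable distributed data mining method, which was validated experimentally and analyzed by simulation on specific cases. The project has completed the research content set out in its plan. It produced 20 related papers in Chinese and international journals and international conferences, including Knowledge-Based Systems, Advanced Engineering Informatics, Biotechnology: An Indian Journal, ICIC Express Letters Part B: Applications, the International Conference on Machine Learning and Cybernetics, the International Conference on Network and Information Systems for Computers, and the International Industrial Informatics and Computer Engineering Conference; 2 of these papers are indexed by SCI, 11 by EI, and 1 by ISTP. With the support of this project, the team set up the Research Center for Big Data and Social Computing at Hebei University of Science and Technology in 2013, and in 2015 the center was selected for the Hebei provincial government's cloud computing innovation capability enhancement program. The team actively promoted the transfer of its research results and developed 慧瞳世纪 (Huitong Shiji), a public science platform that showcases the project's results and uses data visualization to give the public free access to data mining outputs such as social network mining, news mining, public opinion analysis, log mining, and GIS. These results have also been promoted and applied in the team's cooperative projects with Hebei Huicong (河北慧聪), Hebei Guanglian (河北广联), the Hebei Provincial Air Pollution Prevention and Control Technology Research and Promotion Center, the Hebei Provincial Expressway Administration, the Hebei branch of Xinhuanet, Sinosoft (中科软), and other organizations. Two faculty members of the project team were selected for Hebei Province's "333 Talent" program (河北省三三三人才) in 2015. The project trained 10 graduate students, of whom 9 have graduated and 1 is still enrolled, including 1 outstanding master's graduate who received a second-class National Scholarship for Graduate Students, and it also supported the development of 2 young faculty members.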
Data last updated: 2023-05-31
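On the Influence of the Big Data Environment on the Development of Information Science
Preparation and Properties of a Light- and Electricity-Driven Biochar/Stearic Acid Composite Phase Change Material
Theme Park Evaluation Based on Public Sentiment Orientation: A Case Study of Volga Manor in Harbin
ESO-Based Mismatched Disturbance Suppression for the DGVSCMG Double-Gimbal Servo System
Applications and Challenges of Blockchain Technology in the Internet of Things
Research on Ensemble-Learning-Based Mining Models for Distributed XML Data Streams and Concept Drift Mining Methods
Research on Semantic Annotation and Semantic Pattern Mining Algorithms for Trajectory Big Data
Convex Programming Theory and Methods in Data Mining
Research on Hybrid Data Mining Theory and Methods Based on Information System Homomorphism Theory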