The Internet era has brought e-commerce retailers large volumes of real-time customer transaction data that are rich in information and of high commercial value. However, because the feature dimension keeps growing and most samples are not manually labeled, mainstream methods struggle to identify customers quickly and effectively and to classify them accurately, which poses a significant challenge to the development of recommender systems. This project investigates customer recommendation in the context of big data. To address the high-dimensional complexity of the classification and prediction problem, a series of semi-supervised, bacteria-inspired feature selection methods are proposed, which process big data through unsupervised rules and self-adaptive learning. Specifically, to address the shortage of manually labeled samples and the severe redundancy in high-dimensional feature data, a feature selection and classification model with integrated semi-supervised learning is proposed, and a series of bacteria-inspired feature selection methods with unsupervised rules and self-adaptive learning strategies are designed. With these methods, the most statistically significant feature combinations in the high-dimensional data are extracted as the optimal decision variables. An accurate semi-supervised customer classification system is then constructed to realize customer identification, knowledge extraction, and personalized intelligent product recommendation. This study not only enriches and develops the theory and practical methods of swarm intelligence for problems in the big data industry, but also provides enterprises with a scientific approach to personalized, accurate recommendation and to improving both management efficiency and customer service quality.
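The Internet era has brought e-commerce retail enterprises a large amount of real-time customer transaction data. Although these data are information-rich and commercially valuable, the ever-growing feature dimension and the fact that most samples have not been manually labeled make it hard for mainstream methods to identify and accurately classify customers quickly and effectively, which poses a major challenge to recommender systems. This project takes customer recommendation in a big data environment as its research object. Targeting the high-dimensional complexity of the classification and prediction problem, it proposes semi-supervised, bacteria-inspired feature selection methods that process big data through unsupervised rules and self-adaptive learning. Specifically, to address the shortage of manually labeled samples and the severe redundancy of high-dimensional feature data, this study proposes a feature selection and classification model with integrated semi-supervised learning, designs a series of bacteria-inspired feature selection methods with unsupervised rules and self-adaptive learning, mines the most statistically significant feature combinations in high-dimensional datasets as the optimal decision variables, and builds an accurate semi-supervised customer classification system to realize customer identification, knowledge extraction, and personalized intelligent product recommendation. This research not only enriches and develops the theory and methods of swarm intelligence for practical application problems in the big data industry, but also provides enterprises with a scientific method for personalized, accurate recommendation and for improving operational efficiency and customer service quality.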
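This project takes customer classification in a big data environment as its research object. Targeting the high-dimensional complexity of the classification and prediction problem, it proposes semi-supervised, bacteria-inspired feature selection methods that process big data through unsupervised rules and self-adaptive learning. The research content is as follows: (1) to address the shortage of manually labeled samples and the severe redundancy of high-dimensional feature data, a feature selection and classification model with integrated semi-supervised learning is proposed; (2) to better handle local search and global search in the feature selection problem, a biological behavior simulation model and optimization strategies are built on bacteria-inspired optimization algorithms; (3) a series of bacteria-inspired feature selection methods with unsupervised rules and self-adaptive learning are designed to mine the most statistically significant feature combinations in high-dimensional datasets as the optimal decision variables; (4) a complete and accurate semi-supervised customer classification method is proposed to realize customer identification, knowledge extraction, and personalized intelligent product recommendation. This research not only enriches and develops the theory and methods of swarm intelligence for practical application problems in the big data industry, but also provides enterprises with a scientific method for personalized, accurate recommendation and for improving operational efficiency and customer service quality. During the project period, the research output included 2 monographs and 17 academic papers, of which 6 are journal papers (five English-language journal papers indexed in SCI JCR Q1 and one Chinese-language paper published in a CSSCI-indexed core journal) and 11 are EI-indexed papers. The project received 2 best paper awards from conference organizers and 1 best special-session paper award. The team organized 2 academic sessions, gave 5 themed talks at academic conferences, and supervised students who won 4 awards at the provincial level or above in innovation, entrepreneurship, and academic competitions. The project trained (including through co-supervision) 12 master's students (4 of whom have graduated) and 2 doctoral students.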
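As a rough illustration of what a bacteria-inspired feature selection search can look like, the following is a minimal sketch and not the project's actual algorithm. It assumes a binary feature mask per bacterium, a chemotaxis-style move that flips one randomly chosen feature and keeps swimming while fitness improves, and an occasional elimination-dispersal step. The synthetic data, the nearest-centroid fitness, the size penalty, and all function and parameter names are hypothetical choices made for this example; the semi-supervised and self-adaptive components described above are omitted.

```python
import random

def make_data(rng, n=60, n_features=8):
    """Synthetic labeled data: only features 0 and 1 carry class signal (assumption)."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2
        x = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        x[0] += 3.0 * y  # informative feature
        x[1] -= 3.0 * y  # informative feature
        rows.append(x)
        labels.append(y)
    return rows, labels

def fitness(mask, rows, labels, penalty=0.02):
    """Nearest-centroid training accuracy on the selected features, minus a size penalty."""
    idx = [j for j, keep in enumerate(mask) if keep]
    if not idx:
        return 0.0
    centroids = {}
    for c in (0, 1):
        members = [r for r, y in zip(rows, labels) if y == c]
        centroids[c] = [sum(r[j] for r in members) / len(members) for j in idx]
    correct = 0
    for r, y in zip(rows, labels):
        dist = {c: sum((r[j] - m) ** 2 for j, m in zip(idx, centroids[c]))
                for c in (0, 1)}
        correct += int(min(dist, key=dist.get) == y)
    return correct / len(rows) - penalty * len(idx)

def bacterial_feature_selection(rows, labels, rng, n_bacteria=6, n_steps=30,
                                swim_len=3, p_disperse=0.1):
    n_features = len(rows[0])
    # One bacterium starts with every feature selected, so the returned
    # mask is never worse than using all features.
    colony = [[1] * n_features]
    colony += [[rng.randint(0, 1) for _ in range(n_features)]
               for _ in range(n_bacteria - 1)]
    scores = [fitness(m, rows, labels) for m in colony]
    best_i = max(range(n_bacteria), key=lambda i: scores[i])
    best_mask, best_fit = colony[best_i][:], scores[best_i]
    for _step in range(n_steps):
        # Elimination-dispersal: occasionally re-seed the worst bacterium at random.
        if rng.random() < p_disperse:
            worst = min(range(n_bacteria), key=lambda i: scores[i])
            colony[worst] = [rng.randint(0, 1) for _ in range(n_features)]
            scores[worst] = fitness(colony[worst], rows, labels)
        for i in range(n_bacteria):
            # Chemotaxis: flip one random feature; keep swimming while fitness improves.
            for _swim in range(swim_len):
                candidate = colony[i][:]
                j = rng.randrange(n_features)
                candidate[j] = 1 - candidate[j]
                cand_fit = fitness(candidate, rows, labels)
                if cand_fit <= scores[i]:
                    break  # no improvement: stop swimming
                colony[i], scores[i] = candidate, cand_fit
            if scores[i] > best_fit:
                best_mask, best_fit = colony[i][:], scores[i]
    return best_mask, best_fit

rng = random.Random(0)
rows, labels = make_data(rng)
best_mask, best_fit = bacterial_feature_selection(rows, labels, rng)
print(best_mask, round(best_fit, 3))
```

In a semi-supervised setting, the fitness function would be the natural place to bring in unlabeled samples, for example by scoring a mask on pseudo-labeled data as well as on the labeled subset; that extension is left out here to keep the sketch short.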
Data last updated: 2023-05-31
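On the Impact of the Big Data Environment on the Development of Information Science
Regulatory Asymmetry, Choice of Earnings Management Mode, and the Enforcement Efficiency of the China Securities Regulatory Commission
Influence of Dominant Factors on the Semi-Penetration Depth of Special-Shaped-Nose Projectiles into Metal Targets
Theme Park Evaluation Based on Public Sentiment Orientation: A Case Study of Volga Manor in Harbin
Regulatory Effects of Water-Nitrogen Coupling and Planting Density on Photosynthesis and Dry Matter Accumulation of Maize in Oasis Irrigation Areas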
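Research on a GMDH-Based Semi-Supervised Ensemble Model for Customer Classification in the Big Data Environment
Research on Text Classification Methods Based on Semi-Supervised Learning and Ensemble Learning
Research on Underwater Target Feature Enhancement and Deep Semi-Supervised Classification Methods
Research on Semi-Supervised Text Sentiment Classification Methods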