Research on Credible Learning-to-Rank Methods for Big Data and Their Parallelization

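Basic information

Grant number: 61762052

Project category: Regional Science Fund Project

Funding amount: 37.00

Principal investigator: Li Jinzhong (李金忠)

Discipline classification:

Host institution: Jinggangshan University (井冈山大学)

Year approved: 2017

Year concluded: 2021

Project period: 2018-01-01 to 2021-12-31

Project status: Concluded

Project participants: Xia Jiewu (夏洁武), Bu Dengli (卜登立), Liu Huan (刘欢), Tan Yunlan (谭云兰), Wei Wei (魏韡), Zeng Jintao (曾劲涛), Peng Lei (彭蕾)

Keywords:

learning to rank; parallel learning; multi-objective intelligent optimization algorithms; granular computing; credibility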

Final-report abstract

Learning to rank is a central problem in information retrieval, machine learning, and data mining, and it plays an important role in search engines. Existing work focuses on relevance-based learning to rank and often ignores the credibility of web information. Learning-to-rank methods designed purely for effectiveness are inefficient on big data, so efficient parallel learning-to-rank methods suited to big-data environments are needed. To address these issues, this project will study credible learning-to-rank methods for big data and their parallelization, drawing on theories and methods from big data processing, web spam detection, granular computing, multi-objective intelligent optimization, and multiple-attribute decision making, in order to solve the credibility and efficiency problems of credible learning to rank on big data. The main contents are: 1) extracting and measuring ranking features of relevance, credibility, and incredibility, constructing a large-scale dataset for credible learning to rank, and studying a query clustering algorithm based on granular computing; 2) studying credible learning-to-rank methods for big data, based on a multi-objective optimization model of credible learning to rank and on multi-objective intelligent optimization algorithms; 3) parallelizing the methods from 2) in the Spark framework. The results can provide a new model and new methods for learning to rank, offer new ideas for research on the credibility of ranking results and the efficiency of learning to rank on big data, and can be applied in search engines.
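The multi-objective framing in item 2) can be illustrated with a minimal sketch, assuming each candidate ranking model is scored on two objectives to be maximized: a relevance metric and a credibility score. The function names, the two-element score layout, and the sample numbers are hypothetical illustrations, not the project's actual model.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a is no worse than b on every objective and
    strictly better on at least one (all objectives are maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(scores: List[Tuple[float, ...]]) -> List[int]:
    """Return the indices of the score vectors that no other vector dominates."""
    return [
        i
        for i, s in enumerate(scores)
        if not any(dominates(t, s) for j, t in enumerate(scores) if j != i)
    ]


# Hypothetical (relevance, credibility) scores for four candidate ranking models.
candidates = [(0.72, 0.40), (0.68, 0.55), (0.60, 0.50), (0.75, 0.30)]
front = pareto_front(candidates)
```

Here the third candidate is dominated by the second, so it drops out of the front. The remaining candidates are mutually incomparable trade-offs between relevance and credibility; choosing among them is a separate decision step.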

Project abstract

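Learning to rank is an active research topic at the intersection of information retrieval and machine learning, and it plays an important role in search engines and recommender systems. Using multi-objective intelligent optimization algorithms and related techniques, this project investigated credible learning-to-rank methods for big data and their parallelization, with the aim of improving the credibility of ranking models and the efficiency of learning to rank. The project surveyed in detail the research progress on learning to rank in information retrieval and machine learning, and on large-scale graph computing systems for big data. It constructed a multi-objective optimization model for learning to rank and, based on the bias-variance trade-off, proposed a robust learning-to-rank method based on multi-objective particle swarm optimization. It improved the LambdaMART learning-to-rank method using the idea of the Matthew effect and a learning-rate variation strategy. It improved a multi-objective particle swarm optimization algorithm with crowding distance and, on that basis, designed a credible learning-to-rank method for big-data environments together with its parallel version. It designed a parallelization of the archived multi-objective simulated annealing algorithm and, on that basis, a parallel credible learning-to-rank method for big data built on Spark and archived multi-objective simulated annealing. It proposed a learning-to-rank method that combines a multi-head self-attention mechanism with a conditional generative adversarial network. It also developed a learning-to-rank system based on multi-objective particle swarm optimization and a learning-to-rank system based on Hooke & Jeeves pattern search. The project provides new models and methods for learning to rank, and new ideas for research on the credibility and efficiency of learning to rank on big data.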
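The abstract mentions a multi-objective particle swarm optimization algorithm with crowding distance but does not describe the project's improved variant. As a hedged illustration only, the sketch below computes the standard NSGA-II-style crowding distance that such algorithms commonly use to keep an archive of non-dominated solutions well spread. The function name and the sample archive are assumptions.

```python
from typing import List, Sequence


def crowding_distance(points: Sequence[Sequence[float]]) -> List[float]:
    """Standard crowding distance: for each objective, sort the points,
    give the two boundary points infinite distance, and add the normalized
    gap between each interior point's two neighbors."""
    n = len(points)
    if n == 0:
        return []
    dist = [0.0] * n
    num_objectives = len(points[0])
    for k in range(num_objectives):
        order = sorted(range(n), key=lambda i: points[i][k])
        lo = points[order[0]][k]
        hi = points[order[-1]][k]
        # Boundary solutions are always kept, so they get infinite distance.
        dist[order[0]] = float("inf")
        dist[order[-1]] = float("inf")
        if hi == lo:
            continue  # All values equal on this objective: no spread to add.
        for pos in range(1, n - 1):
            i = order[pos]
            gap = points[order[pos + 1]][k] - points[order[pos - 1]][k]
            dist[i] += gap / (hi - lo)
    return dist


# Hypothetical archive of three non-dominated (objective1, objective2) vectors.
archive = [(0.0, 1.0), (0.5, 0.5), (1.0, 0.0)]
distances = crowding_distance(archive)
```

A larger distance means a solution sits in a sparser region of the front. In crowding-distance-based swarm optimizers, such solutions are typically preferred as leaders and are the last to be pruned from a full archive.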

Project outputs

No outputs listed.

Data last updated: 2023-05-31

Other related literature

DOI:

Published: 2021

DOI: 10.1051/jnwpu/20213920292

Published: 2021

DOI: 10.3778/j.issn.1002-8331.1903-0411

Published: 2020

DOI: 10.19328/j.cnki.2096-8655.2022.02.002

Published: 2022

DOI: 10.13609/j.cnki.1000-0313.2022.04.019

Published: 2022

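Similar NSFC grants

Research and Application of Group-wise Learning-to-Rank Methods

Grant number: 61402075

Year approved: 2014

Principal investigator: Lin Yuan (林原)

Discipline classification: F0211

Funding amount: 24.00

Project category: Young Scientists Fund

Data Mining of Massive High-Dimensional Celestial Spectra and Its Parallelization

Grant number: 61272263

Year approved: 2012

Principal investigator: Zhang Jifu (张继福)

Discipline classification: F0607

Funding amount: 80.00

Project category: General Program

Design and Optimization of GPU-Based Parallel Sorting Algorithms

Grant number: 61073008

Year approved: 2010

Principal investigator: Du Zhihui (都志辉)

Discipline classification: F0204

Funding amount: 36.00

Project category: General Program

Parallel Subspace Learning Methods and Their Application to Large-Scale Image Recognition

Grant number: 61272273

Year approved: 2012

Principal investigator: Jing Xiaoyuan (荆晓远)

Discipline classification: F0605

Funding amount: 82.00

Project category: General Program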

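Other related literature (titles)

A method for adjusting passenger train operation plans based on railway passenger flow assignment

An approximate high-dimensional optimization method based on a multi-level design space reduction strategy

Robot path planning with a novel tree heuristic search algorithm

Mission planning for on-orbit refueling of GEO satellites in the "many-to-many" mode

Multimodal imaging and molecular imaging assessment of immunotherapy for colorectal cancer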
