Research on Key Techniques of Preference Query Algorithms for Big Data

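Basic Information

Grant No.: 61402130

Project type: Young Scientists Fund

Funding amount: 24.00

Principal investigator: Han Xixian (韩希先)

Discipline classification:

Host institution: Harbin Institute of Technology (哈尔滨工业大学)

Approval year: 2014

Completion year: 2017

Project period: 2015-01-01 - 2017-12-31

Project status: Completed

Project participants: Liu Xianmin (刘显敏), Wang Jinbao (王金宝), Miao Dongjing (苗东菁), Zheng Xu (郑旭)

Keywords:

sorted list; preference query; pruning operation; big data; algorithm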

Final Report Abstract

Efficiently processing preference queries in big-data applications is becoming an increasingly important problem. The data explosion makes it difficult for users to find the data they really want. When run on big data, the traditional Boolean database query model often encounters one of two problems: empty answers or too many answers. Through information filtering and extraction, preference queries significantly reduce the volume of returned data and help users find valuable data, which is of considerable academic and practical value. Existing preference query algorithms are suitable only for small and medium-sized data and incur high execution costs on big data. Furthermore, there is very little related work on approximate preference queries. This project therefore studies the key techniques of preference query processing on big data, including theoretical foundations, exact algorithms, approximate algorithms, and online algorithms. The project will analyze the mathematical abstraction and complexity of preference queries; develop efficient pruning rules that, based on the characteristics of each specific preference query, discard candidates that cannot belong to the final result; and study the relationship between error and execution behavior in approximate queries so as to exploit the room for performance improvement that a specified error provides. A prototype system for preference queries on big data will be developed to evaluate the validity and efficiency of the proposed algorithms.

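In big-data applications, how to execute preference queries efficiently is becoming an increasingly important problem. The data explosion makes it hard for people to find the data they really want, and the traditional Boolean database query model, when executed on big data, often runs into one of two problems: an empty result set or too many candidate results. Preference queries use information filtering and information extraction to effectively reduce the amount of returned data and help users find truly valuable data, which is of considerable academic and practical value. We find that existing preference query algorithms are suitable only for small and medium-sized data and incur high execution costs on big data, and that there is still very little research on approximate preference queries. This project therefore studies the key techniques of preference query algorithms for big data, including the theoretical foundations, exact algorithms, approximate algorithms, and online algorithms for preference queries on big data. It plans to analyze the mathematical abstraction and complexity results of preference queries on big data, to design effective pruning rules that discard candidate tuples not belonging to the query result according to the characteristics of each specific preference query, to study the relationship between error and execution behavior in approximate preference queries so as to make good use of the room for performance improvement that a given error provides, and to implement a prototype system for preference queries on big data to verify the correctness and effectiveness of the project's results.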
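The pruning idea mentioned in the abstract can be illustrated with a minimal sketch. It assumes a top-k preference query whose score is the sum of attribute values, evaluated over one sorted list per attribute (the "sorted list" keyword above). This is the classic threshold-style early termination, shown only as an illustration; it is not the project's own algorithm, and the function name `top_k_threshold` and the sample data are hypothetical.

```python
import heapq


def top_k_threshold(rows, k):
    """Return (top-k list of (score, id), number of tuples scored).

    rows maps a tuple id to its attribute values; higher is better.
    Assumption: the score is the plain sum of the attribute values.
    """
    m = len(next(iter(rows.values())))
    # One list of ids per attribute, sorted by that attribute, descending.
    lists = [sorted(rows, key=lambda t, j=j: -rows[t][j]) for j in range(m)]
    seen = set()
    heap = []  # min-heap holding the best k (score, id) pairs so far
    for depth in range(len(rows)):
        threshold = 0  # upper bound on the score of any unseen tuple
        for j in range(m):
            tid = lists[j][depth]
            threshold += rows[tid][j]
            if tid in seen:
                continue
            seen.add(tid)
            total = sum(rows[tid])
            if len(heap) < k:
                heapq.heappush(heap, (total, tid))
            elif total > heap[0][0]:
                heapq.heapreplace(heap, (total, tid))
        # Pruning rule: no unseen tuple can beat the current k-th score.
        if len(heap) == k and heap[0][0] >= threshold:
            break
    return sorted(heap, reverse=True), len(seen)


rows = {"a": (9, 8), "b": (8, 7), "c": (7, 1),
        "d": (1, 6), "e": (2, 2), "f": (1, 1)}
top, scored = top_k_threshold(rows, 2)
print(top, scored)
```

In this toy data set the scan stops after the second position of each list, so only some of the six tuples are ever scored; the rest are discarded by the threshold test without being examined.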

Project Abstract

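The data explosion makes it hard for people to find the data they really want, and the traditional Boolean database query model, when executed on big data, often runs into one of two problems: an empty result set or too many candidate results. Preference queries use information filtering and information extraction to effectively reduce the amount of returned data and help users find truly valuable data, which is of considerable academic and practical value. Most existing preference query algorithms consider only small and medium-sized data sets, and all of them suffer from poor execution efficiency on big data. This project addresses the key techniques of preference query algorithms on big data. With the support of this project, the research group published 10 high-quality academic papers, including papers in top international database journals and CCF rank-A conferences. The results show that the big-data preference query algorithms proposed by the group have a considerable performance advantage over existing methods in execution time, memory consumption, and disk cost. In business decision-making and analysis, preference query processing is an important means of realizing personalized queries and plays a crucial role in helping enterprise decision-makers keep track of market trends and make correct business decisions in time. The results of this project are expected to show their scientific and economic value in real-world production.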

Project Outputs

No outputs listed.

Data last updated: 2023-05-31

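Other Grants Held by Han Xixian (韩希先)

Research on Key Techniques of Deterministic Approximate Query Algorithms for Big Data

Grant No.: 61872106

Approval year: 2018

Funding amount: 63.00

Project type: General Program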

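Similar NSFC Grants

Research on Key Techniques of Deterministic Approximate Query Algorithms for Big Data

Grant No.: 61872106

Approval year: 2018

Principal investigator: Han Xixian (韩希先)

Discipline classification: F0202

Funding amount: 63.00

Project type: General Program

Research on Assisted Generation Techniques for Keyword Query Algorithms over XML Data

Grant No.: 61272124

Approval year: 2012

Principal investigator: Chen Ziyang (陈子阳)

Discipline classification: F0202

Funding amount: 80.00

Project type: General Program

Research on Key Techniques of Privacy Preference Query and Hiding in Location-Based Services

Grant No.: 61370077

Approval year: 2013

Principal investigator: Ni Weiwei (倪巍伟)

Discipline classification: F0202

Funding amount: 75.00

Project type: General Program

Research on Mobile P2P Database Construction and Algorithms for Location Preference Queries

Grant No.: 61303049

Approval year: 2013

Principal investigator: Yang Jing (杨婧)

Discipline classification: F0202

Funding amount: 23.00

Project type: Young Scientists Fund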
