Research on Key Techniques of Deterministic Approximate Query Algorithms on Big Data

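Basic information

Grant number: 61872106

Project type: General Program

Funding amount: 63.00

Principal investigator: Han Xixian (韩希先)

Discipline classification:

Host institution: Harbin Institute of Technology

Year approved: 2018

Year concluded: 2022

Project period: 2019-01-01 - 2022-12-31

Project status: Concluded

Project participants: Wang Jinbao (王金宝), Zhang Kaiqi (张开旗), Li Faming (李发明), Li Xue (李雪), Song Cui (宋翠)

Keywords:

deterministic approximate query; storage structure; big data; approximation guarantee; algorithm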

Concluding abstract

With big data growing explosively, current query algorithms cannot return exact results within an acceptable response time, which severely harms query interactivity and user productivity. Approximate query processing has therefore become an important research problem in big data analytics. Our analysis shows that most existing approximate query algorithms focus on basic queries, and that they either depend on regularity assumptions about the underlying data distribution or scale poorly. As a result, they cannot efficiently process approximate queries on big data in the general case. This project proposes a new type of approximate query, the deterministic approximate query, which returns approximate results that satisfy the user's requirements with a deterministic degree of approximation. Aiming at the best trade-off between the degree of approximation and execution cost, the project will study the key techniques of deterministic approximate query algorithms on big data: the mathematical abstraction and storage structures of deterministic approximate queries, deterministic approximate algorithms for basic queries, and deterministic approximate algorithms for complex queries. It will propose a series of theories, techniques, and methods for deterministic approximate querying on big data, and will develop a prototype system to verify the correctness and efficiency of the proposed methods.

Project abstract

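The emergence of big data has made data-driven decision making the dominant approach in business, science, and even government. Existing query algorithms cannot quickly return exact results on big data, which severely harms query interactivity and user productivity, and approximate query processing is becoming an active research topic in big data computing. To address the inability of existing approximate query algorithms to handle approximate queries on big data effectively, this project studied the key techniques of deterministic approximate query algorithms on big data and designed such algorithms for specific approximate query applications. To address their performance problems, it proposed the following key methods: complementary sampling, pruning strategies, early termination strategies, composite index structures, computation reuse, multi-level data synopses, conditional generative models, representative result selection, and approximate compression. With the support of this grant, the project team focused on deterministic approximate query algorithms on big data and made substantial progress in the following areas: approximate maximum range-sum queries on trajectory big data, approximate skyline queries on incomplete big data, approximate socio-spatial keyword search on large-scale SIOT network data, approximate G-Skyline queries on big data, estimation methods for top/bottom-k fraction queries on big data, and an efficient approximate query processing framework for big data. The team has published 14 high-quality papers and has been granted 4 patents. The results show that the proposed approximate query algorithms hold a considerable performance advantage over existing methods in execution time, memory consumption, and disk cost, and can efficiently return the query results users need on big data.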

Project outputs

No outputs of this type are currently listed.

Data last updated: 2023-05-31

Other grants held by Han Xixian (韩希先)

Grant number: 61402130

Year approved: 2014

Funding amount: 24.00

Project type: Young Scientists Fund

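Similar NSFC grants

Research on Key Techniques of Preference Query Algorithms on Big Data

Grant number: 61402130

Year approved: 2014

Principal investigator: Han Xixian (韩希先)

Discipline classification: F0202

Funding amount: 24.00

Project type: Young Scientists Fund

Research on Key Techniques for Efficient Querying of Massive High-Dimensional Uncertain Data

Grant number: 61003074

Year approved: 2010

Principal investigator: Zhuang Yi (庄毅)

Discipline classification: F0202

Funding amount: 20.00

Project type: Young Scientists Fund

Research on Assisted Generation Techniques for Keyword Query Algorithms over XML Data

Grant number: 61272124

Year approved: 2012

Principal investigator: Chen Ziyang (陈子阳)

Discipline classification: F0202

Funding amount: 80.00

Project type: General Program

Approximate-Keyword-Based Querying and Processing of Large-Scale Spatial Data

Grant number: 61202025

Year approved: 2012

Principal investigator: Yao Bin (姚斌)

Discipline classification: F0202

Funding amount: 25.00

Project type: Young Scientists Fund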
