Emerging storage devices, with their high performance and sophisticated internal structure, inevitably challenge the design principles and architecture of the conventional storage stack, which was built around rotating disks. That stack severely limits the performance that key-value stores, a representative data-intensive application, can gain from the new hardware. By analyzing the characteristics of SSDs and the design principles they call for, we found that mainstream key-value stores based on multi-stage tree structures incur significant write amplification, which raises the write cost on SSDs while barely exploiting their high IOPS. This project proposes key-value direct storage (KVDS for short) and designs a novel architecture and corresponding mechanisms for it. KVDS tightly integrates the data processing of key-value stores with the characteristics of SSDs. The main contributions are: 1) a key-value store based on a multi-stage forest structure suited to SSDs, which dramatically reduces write amplification and compensates for the read degradation inherent in that structure by issuing parallel reads that exploit the high IOPS of SSDs; 2) a direct storage architecture for key-value stores, consisting of application, direct storage, and media management layers, together with the corresponding interfaces, data layout, and processing flow; 3) key implementation techniques for the new architecture, including physical storage mapping, scheduling of the internal and external channels of SSDs, compaction optimization, and near-data processing. KVDS aims to maximize the overall performance of key-value stores across a wide range of workload patterns. Its architecture and techniques can be extended naturally to the storage of other big-data applications, thereby enriching the theory and key technologies of computer systems.
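New storage devices offer high performance and a complex internal structure, which inevitably undermines the design principles and architecture of the traditional disk-based storage stack. The existing storage stack as a whole limits the performance that data-intensive applications, typified by key-value stores, can achieve on new hardware. Our analysis finds that mainstream key-value stores based on multi-stage tree structures suffer from inherent, significant write amplification, which both increases the write cost on SSDs and fails to exploit their performance advantages. This project proposes a key-value direct storage architecture and mechanisms that deeply integrate the key-value storage structure, the system storage process, and the characteristics of new storage media. Its innovations are: 1) a multi-stage forest structure for key-value pairs that is suited to SSDs and reduces write amplification, together with a parallel read mechanism that compensates for the potential read degradation of this structure; 2) a direct storage architecture for key-value stores, built as three layers (application, direct storage, and media management), with the corresponding interfaces, data layout, and processing flow; 3) key implementation techniques for the new architecture, including physical storage mapping, scheduling of channels inside and outside the media, optimization of the compaction process, and near-data processing. Direct storage can maximize and comprehensively improve key-value performance under a variety of workload patterns and can be extended to the storage of other big-data applications, thereby enriching computer system theory and technology.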
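New storage devices offer high performance and a complex internal structure, which inevitably undermines the design principles and architecture of the traditional disk-based storage stack. The existing storage stack as a whole limits the performance that data-intensive applications, typified by key-value stores, can achieve on new hardware. Our analysis finds that mainstream key-value stores suffer from inherent, significant write amplification, which both increases the write cost on SSDs and fails to exploit their performance advantages. This project proposed a key-value direct storage architecture and mechanisms that deeply integrate the key-value storage structure, the system storage process, and the characteristics of new storage media. Its innovations are: 1) a key-value direct storage architecture that lets key-value applications manage block storage space directly, avoiding the I/O cost introduced by the file system; 2) a multi-stage forest structure for key-value pairs that is suited to SSDs and reduces write amplification, together with a parallel read mechanism that compensates for the potential read degradation of this structure; 3) a direct storage architecture for key-value stores with the corresponding interfaces, data layout, and processing flow; 4) an analysis of the several kinds of heterogeneity in existing storage systems and inside SSDs, and I/O scheduling mechanisms designed for them. The project published 17 papers (13 at CCF-A/B venues) and filed 3 invention patent applications. The work on key-value direct storage has advanced the research and application of new key-value stores, and its technical ideas can also be extended to other big-data applications. The project's results enrich the design theory and implementation techniques of new storage systems.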
Data last updated: 2023-05-31
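Research on Architecture Design and Performance Optimization of Key-Value Storage Systems
Research on Replay Techniques for Message-Passing Programs Based on Distributed Key-Value Network Storage
Research on Query Optimization Techniques for Key-Value Storage Systems in Cloud Computing Environments
Construction and Performance Optimization of Erasure-Coded Heterogeneous Distributed In-Memory Key-Value Storage Systems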