Availability is a key issue for IaaS platforms. System unavailability disrupts services and can cause large economic losses, which in turn limits the healthy development and adoption of IaaS platforms. Failure recovery, which restores a system after a failure, is an effective way to improve availability and is widely used in both traditional computing environments and emerging IaaS platforms. However, today's IaaS platforms are large in scale and complex in architecture and implementation, so failures are common, and this poses new problems and challenges for traditional failure recovery techniques. This project uses the life cycle of a virtual machine snapshot as its entry point and aims to improve failure recovery at three critical stages: snapshot creation, snapshot storage, and rollback recovery. Specifically, we first formulate a failure recovery model based on the virtual machine snapshot life cycle. Building on this model, we focus on the virtual cluster computing paradigm and study three key techniques: continuous creation of distributed snapshots for virtual clusters, highly available storage for snapshot files, and rapid rollback recovery of virtual clusters. Finally, we develop the corresponding prototype systems and verify the proposed techniques on our IaaS platform. The project provides a theoretical foundation and technical support for snapshot-based failure recovery systems and thereby helps enhance the availability of IaaS platforms. We also expect the results to support the healthy development of cloud computing platforms and the services that run on them, such as big data, financial services, and e-government.
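Availability is a core issue for IaaS cloud computing platforms and constrains their healthy development and adoption. Failure recovery is an effective means of improving system availability; however, the large scale and complex structure of IaaS platforms make system failures frequent, which poses new problems and challenges for failure recovery. Taking the three stages of the snapshot life cycle (creation, storage, and rollback) as its entry point, this project studies failure recovery based on virtual machine snapshots. The work consists of building a model of the failure recovery framework around snapshot creation, storage, and rollback and, on that basis, focusing on the virtual cluster computing paradigm to study three key techniques: continuous distributed snapshot creation for virtual clusters, highly available storage for snapshot files, and rapid rollback for virtual clusters. We will develop the corresponding prototype systems, integrate them on our existing IaaS experimental platform, and evaluate and verify the key techniques against the model. The project can provide a theoretical foundation and technical support for building snapshot-based failure recovery systems, offer technical approaches for improving the availability of IaaS platforms, and has significant practical value for the healthy development of cloud computing platforms and their services.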
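Availability is a core issue for IaaS cloud computing platforms and constrains their healthy development and adoption. Failure recovery is an effective means of improving system availability; however, the large scale and complex structure of IaaS platforms make system failures frequent, which poses new problems and challenges for failure recovery. Taking the three stages of the snapshot life cycle (creation, storage, and rollback) as its entry point, this project carried out research on failure recovery based on virtual machine snapshots, and developed and evaluated prototype systems on our existing IaaS experimental platform. During the project, the team focused on the virtual cluster computing paradigm. They proposed a continuous virtual machine snapshot creation technique that combines post-copy, lazy incremental recording, and several optimizations to reduce the snapshot interval to the order of seconds, enabling high-frequency snapshots. For large numbers of snapshot files, they proposed a highly available storage technique that combines page-semantics introspection with a page-type-aware replica policy and reduces storage overhead by nearly 70% without loss of availability. For virtual clusters, they proposed a rapid recovery technique that combines multicast of redundant pages with a traffic-aware virtual cluster placement policy and reduces data transfer overhead and latency by about 40%. The results of this project can provide a theoretical foundation and technical support for building snapshot-based failure recovery systems, offer technical approaches for improving the availability of IaaS platforms, and have significant practical value for the healthy development of cloud computing platforms and their services.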
Data last updated: 2023-05-31
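Eddy covariance technique and its application to flux research in terrestrial ecosystems
Hardware Trojans: research progress on key problems and new trends
Calculation method for the shear capacity of steel plate-concrete composite coupling beams with small span-to-depth ratios
Registration of small UAV remote sensing images with inlier maximization and redundant point control
Effect of angle of attack on tip leakage flow in a compressor cascade under endwall suction control
Research on key technologies for cloud security protection based on virtual machine introspection
Research on snapshot imaging spectropolarimetry based on integral field and aperture division
Research on secure storage of snapshot data for version authorization
Research on coordinated multipoint techniques based on server virtual machines in ultra-dense networks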