Slow execution is a key problem for every architectural simulator, and researchers commonly use parallelization and statistical sampling to accelerate simulation. For full-system simulation of large-scale datacenters, however, existing work gives little insight into how to reduce the synchronization overhead introduced by large-scale parallel simulation, or into how to combine parallelization with statistical sampling effectively. As a result, state-of-the-art simulators cannot meet the practical requirements of industry and academia. Targeting these key problems in large-scale datacenter simulation, this project focuses on three research topics in parallelization and statistical sampling. First, to address synchronization overhead that grows as the number of nodes increases, it studies a wall-clock-based synchronization protocol that reduces the frequency of global synchronization by converting global synchronization into low-overhead local synchronization. Second, to avoid the large accuracy loss caused by relaxed synchronization, it studies a timing-error alignment mechanism based on an analytical model, which rolls back state in order to correct micro-architectural timing errors. Third, to address the core sampling problem, namely that samples represent parallel programs poorly, it studies a parallel-aware sampling algorithm that performs two rounds of pattern matching between the samples and the population and filters the candidate samples, aiming for high sample representativeness without notably disturbing the load balance of the parallel programs running in the backend. This research is expected to advance parallel sampled simulation technology and to contribute to more reliable and efficient support tools for the development and application of large-scale computer systems.
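Excessively slow simulation is a key problem facing architecture-level simulators, and researchers usually accelerate simulation with parallelization or statistical sampling. In full-system simulation of large-scale datacenters, however, existing work has not adequately considered how to reduce the synchronization overhead caused by large-scale parallel simulation, or how to combine parallel and statistical simulation techniques organically, so simulator performance falls short of practical requirements. To address the core problem that synchronization overhead in parallel datacenter simulation climbs rapidly with scale, this project studies a wall-clock-based synchronization protocol that converts global synchronization into low-overhead local synchronization, thereby effectively reducing the frequency of global synchronization. To avoid the sharp drop in simulation accuracy that relaxed synchronization would bring, it also studies an error compensation mechanism based on an analytical model, which rolls back state to correct the simulator's micro-architectural timing errors. To address the key problem that program samples are poorly representative in parallel statistical sampling simulation, it studies a parallel-aware sampling algorithm that uses pattern matching and sample filtering to improve sample representativeness while effectively avoiding interference with parallelization. This research will drive substantive progress in parallel sampled simulation technology and provide efficient, reliable support tools for the development and application of large-scale computer systems.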
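The trade-off between global synchronization frequency and timing accuracy can be illustrated with a toy model. The sketch below is not the project's wall-clock protocol, whose details the abstract does not give. It assumes a conventional fixed-interval barrier scheme in which simulated nodes exchange messages only at barrier boundaries. The names `barrier_count`, `delivery_time`, and `timing_error` are hypothetical.

```python
def barrier_count(sim_time_ns: int, interval_ns: int) -> int:
    """Global barriers needed to cover sim_time_ns with a fixed interval."""
    if interval_ns <= 0:
        raise ValueError("interval_ns must be positive")
    # Integer ceiling division: a partial final interval still needs a barrier.
    return (sim_time_ns + interval_ns - 1) // interval_ns


def delivery_time(send_ns: int, latency_ns: int, interval_ns: int) -> int:
    """Simulated time at which the receiver observes a message.

    Assumption: a message is handed to the receiver at the first barrier
    after it was sent, so it cannot be seen before that barrier.
    """
    ideal = send_ns + latency_ns
    next_barrier = (send_ns // interval_ns + 1) * interval_ns
    return max(ideal, next_barrier)


def timing_error(send_ns: int, latency_ns: int, interval_ns: int) -> int:
    """Extra delay caused by the barrier interval (0 when delivery is exact)."""
    return delivery_time(send_ns, latency_ns, interval_ns) - (send_ns + latency_ns)


if __name__ == "__main__":
    for interval in (1_000, 100_000):
        print(
            interval,
            barrier_count(1_000_000, interval),
            timing_error(1_500, 1_000, interval),
        )
```

In this model, a longer interval cuts the number of global barriers but lets a message arrive later than its modeled link latency allows. This is the kind of accuracy loss that the compensation mechanism described above is meant to correct.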
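Parallelization is a key technique for accelerating architecture-level simulators. In full-system simulation of large-scale datacenters, however, traditional research has not adequately considered how to reduce the synchronization overhead caused by large-scale parallel simulation, so simulator performance falls short of practical requirements. This project carried out experimental studies and theoretical analysis of the relationship between synchronization overhead and simulation scale in parallel datacenter simulation systems, and of the correlation between single-message error and both the parallelization scheme and the synchronization mechanism. It developed a full-system distributed datacenter simulation system and a multithreaded parallel simulation system for many-core processors, and refined the theory of parallelization methods for architectural simulators. Building on the study of synchronization overhead, it verified the conclusion that the synchronization step size is the main constraint on parallel simulator performance, and observed that differences in the speed of simulation nodes are an indispensable factor in producing timing errors at large step sizes. It then proposed a wall-clock synchronization protocol that converts global synchronization into low-overhead local synchronization, effectively reducing the frequency of global synchronization and enriching the synchronization theory of parallel architectural simulation. The study of single-message error found that error delay propagates among the components within a node along direct and indirect paths, and usually passes through several key hubs. On this basis, the project proposed an error compensation mechanism that sets up error interception at the key hubs and deducts the error delay accumulated along the way from long-latency events in order to correct the simulator's micro-architectural timing errors, thereby improving the modeling methods for parallel simulators.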
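The idea of intercepting error delay at a hub and deducting it from a later long-latency event can be sketched as follows. This is only an illustration under stated assumptions, not the project's implementation: the `Hub` class, its fields, and the rule of clamping the deduction at the event's own latency are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hub:
    """Hypothetical key hub that holds back error delay passing through it."""

    pending_error_ns: int = 0

    def intercept(self, error_ns: int) -> None:
        """Record error delay carried by a message that arrived late."""
        if error_ns < 0:
            raise ValueError("error_ns must be non-negative")
        self.pending_error_ns += error_ns

    def compensate(self, event_latency_ns: int) -> int:
        """Return the event latency after deducting pending error delay.

        Assumption: an event can absorb at most its own latency, and any
        remainder stays pending for the next long-latency event.
        """
        deduction = min(self.pending_error_ns, event_latency_ns)
        self.pending_error_ns -= deduction
        return event_latency_ns - deduction


if __name__ == "__main__":
    hub = Hub()
    hub.intercept(300)
    hub.intercept(200)
    print(hub.compensate(5_000), hub.pending_error_ns)
```

Clamping keeps a corrected latency from going negative. Whether the actual mechanism carries leftover error forward in this way is not stated in the source.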
Data last updated: 2023-05-31