Multi-core clusters have become a mainstay of high-performance computing and a natural platform for simulating complex systems. However, the characteristics of multi-core cluster architectures, including deep parallelism, hierarchical communication, and hybrid memory, lead to low utilization of computing and communication resources, slow time synchronization, workload imbalance, and other problems in existing parallel simulation platforms. As a result, it is difficult to take full advantage of multi-core clusters to improve simulation capability. To address these problems, this project aims to make advances in computational resource scheduling, global virtual time (GVT) computation, and load balancing by: 1) proposing a framework for scheduling computational resources that processes simulation events with high concurrency, maintains good scalability, and eliminates the contention caused by event ordering; 2) proposing a hybrid GVT algorithm that computes GVT asynchronously both within and among cluster nodes to achieve efficient time synchronization; 3) proposing a self-adaptive algorithm for load balancing and flow control that regulates the workload and processing rate of the computing cores to achieve an optimal global advance rate. The work has theoretical significance and practical value for improving the analytical capability of complex system simulation on multi-core clusters and for promoting the adoption of high-performance computers in the field of complex system simulation.
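Multi-core clusters are rapidly becoming the main force in high-performance computing and an important platform for running large-scale complex system simulations. However, the deep parallelism, hierarchical communication, and hybrid memory of multi-core clusters cause existing parallel simulation platforms to suffer from low utilization of computing and communication resources, slow time synchronization, and load imbalance, making it difficult to sustainably turn multi-core performance gains into improved simulation capability. To address these problems, this project aims for innovation and breakthroughs in three areas, namely computational resource scheduling, GVT computation, and load balancing with flow control: studying a computational resource scheduling framework that processes simulation events with high concurrency, achieves good flexibility and scalability in simulation scale, and eliminates contention caused by event ordering and similar factors; studying a hybrid asynchronous GVT algorithm that completes efficient time synchronization asynchronously within and between cluster nodes; and studying an adaptive load balancing and flow control algorithm that jointly balances the load and event processing rate of each computing unit to achieve the optimal system-wide advance rate. The results of this project have significant theoretical and practical value for using high-productivity multi-core computers to improve complex system simulation experiments and for advancing the application of new-generation high-productivity computer systems in the field of complex system simulation.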
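Multi-core clusters are, now and for some time to come, one of the mainstream platforms for running parallel simulations. However, their deep parallelism, high asynchrony, hierarchical communication, and hybrid memory make it difficult for traditional simulation platforms designed for symmetric multiprocessors and distributed clusters to fully exploit multi-core cluster performance. In response, this project addressed the structure of the parallel simulation engine, efficient communication, GVT computation, adaptive load balancing and flow control, and large-scale parallel simulation applications. It proposed a scalable and configurable multi-threaded PDES architecture; a reliable, order-preserving, zero-copy shared-memory communication algorithm for use within a node; a hybrid asynchronous GVT algorithm for large-scale multi-core clusters; a set of simulation object interfaces for the plug-in-play mode; and an adaptive load balancing and window control algorithm. Together these overcome several bottlenecks that limit parallel simulation performance on multi-core clusters. Building on these techniques, the project designed and implemented a parallel simulation engine for multi-core clusters and used it to study the evolution and control of the COVID-19 epidemic through discrete event simulation. The results have significant theoretical and practical value for improving parallel simulation performance on multi-core clusters and for promoting the application of new high-productivity computers in the simulation field.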
Data last updated: 2023-05-31
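Spatiotemporal evolution characteristics and driving factors of water resource utilization in the Yellow River Basin
Hardware Trojans: research progress on key issues and new trends
A task scheduling method for cloud workflow security
An efficient parallel algorithm for radiation diffusion based on multi-block structured grids in inertial confinement fusion implosions
An empirical analysis of the impact of industrial restructuring on water resource utilization efficiency in resource-based regions: evidence from 10 resource-based provinces in China
Research on performance optimization methods for parallel applications on multi-core virtual clusters
Research on agent-based parallel simulation support technology
Research on key technologies of parallel and distributed simulation support platforms
Research on software support technology for enhancing data-parallel performance across the memory hierarchy of multi-core systems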