This project studies the optimization of irregular applications on GPU platforms from the perspectives of load balance and data locality, targeting the low parallel efficiency of irregular applications that are common in large-scale scientific computing. An irregularity-degree model is constructed to reveal the relationship between irregularity degree and load imbalance, providing a theoretical basis for parallel optimization. Considering dynamic task generation and multi-level load imbalance, a load-sensitive multi-granularity load-balancing method is proposed: a load-sensitive task partitioning algorithm is derived from a dynamic task partitioning model, and a multi-granularity work-stealing strategy then addresses multi-level load imbalance. To address cache thrashing caused by poor data locality, a resource-driven cache bypassing method is proposed. First, a priority-based warp bypassing strategy built on adaptive warp throttling alleviates cache thrashing while maintaining efficient resource utilization. Second, a locality-sensitive cache replacement strategy is studied, and a reuse-distance-based instruction bypassing approach is proposed to reduce cache pollution. The study aims to provide technical support and core algorithms for the parallel optimization of irregular applications, and is expected to improve their parallel efficiency in large-scale scientific computing.
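This project addressed the low parallel efficiency of irregular applications, which are common in large-scale scientific computing, by studying parallel optimization techniques for irregular applications from the perspectives of load balance and data locality. An irregularity-degree model was constructed to establish the intrinsic relationship between irregularity degree and load balance, providing a theoretical basis for parallel optimization. Because multiple levels of load imbalance coexist when irregular applications run in parallel, and because such applications generate tasks dynamically, a load-sensitive multi-granularity load-balancing method was built: a load-sensitive task partitioning algorithm divides tasks evenly at run time, and a multi-granularity work-stealing technique was implemented, which effectively alleviated the coexisting multi-level load imbalance. For the cache thrashing caused by data locality, the project proposed a resource-driven cache bypassing method that implements a priority-based warp bypassing strategy through adaptive warp throttling, which both keeps resources fully utilized and effectively alleviates L1 cache thrashing. Locality-sensitive cache replacement strategies were also studied, and a reuse-distance-based instruction bypassing method was proposed to reduce cache pollution. The proposed theory and algorithms were evaluated on typical irregular benchmarks and on real applications such as sensor data analysis and image processing; the experiments showed that the algorithms effectively improved load balance and data locality and clearly improved the efficiency of irregular parallel applications. The research provides technical support and core algorithms for the parallel optimization of irregular applications, and a reference for improving their parallel efficiency in large-scale scientific computing. The project produced 9 papers in SCI-, EI-, and CSCD-indexed and core journals, 1 granted invention patent and 3 patents under substantive examination, and supported the training of 4 graduate students and the promotion of 2 junior faculty members.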
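To make the idea of reuse-distance-based instruction bypassing more concrete, the following is a minimal sketch and not the project's actual algorithm. It assumes a fully associative LRU cache, a memory trace given as hypothetical `(pc, line)` pairs, and a simple rule invented here for illustration: a load instruction is a bypass candidate if no line it touched was ever reused within a distance smaller than the cache capacity. The function name `bypass_candidates` and the parameter `cache_lines` are illustrative only.

```python
from collections import OrderedDict


def bypass_candidates(trace, cache_lines):
    """Return the set of instruction addresses (pcs) worth bypassing.

    trace: iterable of (pc, line) pairs, one per memory access.
    cache_lines: capacity of the modelled LRU cache, in lines (assumption).

    The reuse distance of an access is the number of distinct other lines
    touched since the previous access to the same line. If that distance is
    below cache_lines, the line would still be resident in an LRU cache, so
    the instruction that last touched it is credited as cache-friendly.
    """
    lru = OrderedDict()   # line -> pc of the last access; most recent at the end
    useful = {}           # pc -> True if caching one of its lines ever paid off
    for pc, line in trace:
        useful.setdefault(pc, False)
        if line in lru:
            order = list(lru)
            distance = len(order) - 1 - order.index(line)
            if distance < cache_lines:
                useful[lru[line]] = True
            lru[line] = pc
            lru.move_to_end(line)
        else:
            lru[line] = pc
    return {pc for pc, ok in useful.items() if not ok}


if __name__ == "__main__":
    # Instruction 0xA keeps reusing line 1; instruction 0xB streams new lines.
    demo = [(0xA, 1), (0xB, 100), (0xA, 1), (0xB, 101), (0xA, 1), (0xB, 102)]
    print(bypass_candidates(demo, cache_lines=2))
```

In this toy trace the streaming instruction never benefits from the cache, so letting it bypass keeps the reused line from being evicted; a real GPU implementation would have to estimate reuse distances online and per cache set, which this sketch does not attempt.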
Data last updated: 2023-05-31
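Equilibrium traffic flow assignment model for congested road networks
Channel allocation strategy for low-Earth-orbit satellite communications
An efficient parallel radiation diffusion algorithm based on multi-block structured grids for inertial confinement fusion implosion
An improved multi-objective sine cosine optimization algorithm
Design of a large-aperture primary mirror based on a hybrid optimization method
Research on analysis and optimization techniques for irregular GPU applications
Research on parallel frequent item mining and key GPU optimization techniques for high-speed network monitoring
Research on GPU performance prediction models and key communication optimization techniques for memory-bound applications
Research on optimization techniques for irregular problems on heterogeneous many-core systems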