VLIW (Very Long Instruction Word) architectures are widely used in embedded processors. A VLIW processor issues and executes multiple operations in parallel, on different functional units, in each processor cycle. A major problem with VLIW processors is that a single register file hampers the scalability of the processor. Clustering is an efficient technique for improving the scalability and reducing the energy consumption of VLIW processors. In a clustered VLIW processor, each cluster has its own functional units and a local register file with fewer registers and ports. Clusters are connected by an inter-cluster communication network. An optimising compiler plays a key role in improving the ILP (Instruction Level Parallelism) of clustered VLIW processors. Instruction scheduling and register allocation are two important parts of an optimising compiler for clustered VLIW processors. These two parts are closely related and have a significant impact on the ILP. Software pipelining is an important instruction scheduling technique that exploits the ILP of loops by overlapping the execution of successive iterations. Modulo scheduling is a class of software pipelining algorithms that has been incorporated into some production compilers. Clustered VLIW processors make instruction scheduling, register allocation and modulo scheduling more challenging: a bad cluster assignment may cause unnecessary inter-cluster communication and uneven register pressure, increasing the execution time of the program.
This research focuses on register-aware modulo scheduling, leakage-aware modulo scheduling, and leakage-aware instruction scheduling and register allocation for clustered VLIW processors. We will propose an instruction scheduling heuristic integrated with register allocation that effectively reduces the leakage energy consumption of the functional units on clustered VLIW processors. We will propose an efficient register-aware modulo scheduling heuristic that minimises the execution time of the whole loop by overlapping its iterations. We will also propose a register-aware modulo scheduling heuristic that reduces the leakage power of the functional units on clustered VLIW processors. These heuristics will be implemented in the Trimaran compiler. We will also present accurate energy and performance models to evaluate the power consumption and execution time of the program.
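As background on modulo scheduling, the sketch below computes the usual lower bound on the initiation interval (the number of cycles between the starts of successive loop iterations) as the larger of a resource bound and a recurrence bound. It is a generic textbook-style illustration, not the heuristic proposed in this project. The function names, operation counts, unit counts and recurrence are all hypothetical, and the functional units are pooled across clusters, which ignores inter-cluster copy operations.

```python
from math import ceil


def res_mii(op_counts, unit_counts):
    """Resource bound: for each unit type, ceil(operations / available units)."""
    return max(
        (ceil(op_counts[kind] / unit_counts[kind]) for kind in op_counts),
        default=1,
    )


def rec_mii(recurrences):
    """Recurrence bound: for each dependence cycle given as
    (total latency, total iteration distance), ceil(latency / distance)."""
    return max((ceil(lat / dist) for lat, dist in recurrences), default=1)


def min_initiation_interval(op_counts, unit_counts, recurrences):
    """Lower bound on the initiation interval of any modulo schedule."""
    return max(res_mii(op_counts, unit_counts), rec_mii(recurrences))


# Hypothetical loop body: 6 ALU operations, 3 memory operations, 2 multiplies.
ops = {"alu": 6, "mem": 3, "mul": 2}
# Hypothetical 2-cluster machine; unit counts are summed over both clusters.
units = {"alu": 4, "mem": 2, "mul": 2}
# One hypothetical loop-carried recurrence: latency 3 cycles, distance 1 iteration.
recurrences = [(3, 1)]

mii = min_initiation_interval(ops, units, recurrences)
print(mii)  # 3: the recurrence bound (3) exceeds the resource bound (2)
```

A scheduler would typically start at this bound and increase the interval until a legal schedule is found; on a clustered machine, the copies introduced by cluster assignment can push the achievable interval above the bound.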
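VLIW (Very Long Instruction Word) architectures are widely used in embedded processors. Clustering is an effective technique for improving the scalability and energy consumption of VLIW processors. Using compiler techniques to optimise application performance, and to minimise the power consumption of the system or processor without degrading runtime performance, is currently an active topic in compiler optimisation research. The inter-cluster instruction assignment problem introduced by clustered VLIW architectures makes instruction scheduling, register allocation and software pipelining in the compiler more challenging. This project will use compiler techniques such as instruction scheduling, register allocation and software pipelining to optimise the execution time and power consumption of programs on clustered VLIW processors. For sequential programs, we will propose instruction scheduling and register allocation algorithms that effectively reduce the power consumption of the functional units of clustered VLIW DSP processors. For loops, we will propose instruction scheduling and register allocation algorithms that optimise the performance and the power consumption, respectively, of clustered VLIW DSP processors. The algorithms will be implemented and applied in the Trimaran compiler. In addition, we will propose accurate performance and power models to evaluate the execution time and energy consumption of programs.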
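VLIW (Very Long Instruction Word) architectures are widely used in embedded processors. Clustering is an effective technique for improving the scalability and energy consumption of VLIW processors. Using compiler techniques to optimise application performance, and to minimise the power consumption of the system or processor without degrading runtime performance, is currently an active topic in compiler optimisation research. The inter-cluster instruction assignment problem introduced by clustered VLIW architectures makes instruction scheduling, register allocation and software pipelining in the compiler more challenging. This project uses compiler techniques such as instruction scheduling, register allocation and software pipelining to optimise the execution time and power consumption of programs on clustered VLIW processors. For sequential programs, it proposes instruction scheduling and register allocation algorithms that effectively reduce the power consumption of the functional units of clustered VLIW DSP processors. For loops, it proposes instruction scheduling and register allocation algorithms that optimise the performance and the power consumption, respectively, of clustered VLIW DSP processors. We are currently implementing these algorithms in the Trimaran compiler. To date, the project has produced two international conference papers and two national patent applications.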
Data last updated: 2023-05-31