This project studies hardware architectures for accelerating sequential speech recognition intelligent network algorithms. Across application scenarios, both the overall network structure and the computational details of speech recognition algorithms vary, so general-purpose reconfigurable hardware struggles to support the resulting range of network implementations and computing flows both flexibly and efficiently. To maximize energy efficiency, a dedicated parallel computing structure is a second design focus. In addition, a purely hardware architecture cannot easily support complex, changing network structures and parallel computing requirements, so programmability and an instruction-driven design are also essential. To address these three points, the project researches and designs a dual-reconfigurable, multi-level parallel computing hardware architecture that combines coarse and fine granularity, together with its supporting independent-configuration and shared-memory mechanisms. The architecture meets reconfiguration requirements for network structure at the coarse level and for computational details at the fine level, and it supports parallel computing at several levels, including the module level and the operation level, which makes parallel scheduling more flexible. The project also researches and designs a dedicated stream-based, segmented, variable-length instruction set, compatible with both ordinary and very long instruction words, to support hardware reconfiguration and the various characteristics of the computation. The architecture will ultimately be implemented as a high-performance FPGA system, with energy efficiency expected to be one order of magnitude higher than that of a GPU.
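To make the idea of a segmented, variable-length instruction stream concrete, here is a minimal Python sketch. It is not the project's instruction set: the 32-bit segment width, the header layout (8-bit opcode, 8-bit segment count), and the names `encode` and `decode_stream` are all assumptions made for illustration. It shows only how one stream can carry both short ordinary instructions and long multi-segment instruction words.

```python
from typing import Iterator, List, Tuple

SEGMENT_BITS = 32                      # assumed width of one segment
SEGMENT_MASK = (1 << SEGMENT_BITS) - 1


def encode(opcode: int, payload: List[int]) -> List[int]:
    """Pack one instruction as a header segment plus payload segments.

    Hypothetical header layout: bits 31..24 = opcode,
    bits 23..16 = number of payload segments, bits 15..0 unused.
    """
    if not 0 <= opcode <= 0xFF:
        raise ValueError("opcode must fit in 8 bits")
    if len(payload) > 0xFF:
        raise ValueError("at most 255 payload segments per instruction")
    if any(not 0 <= word <= SEGMENT_MASK for word in payload):
        raise ValueError("each payload segment must fit in 32 bits")
    header = (opcode << 24) | (len(payload) << 16)
    return [header] + list(payload)


def decode_stream(words: List[int]) -> Iterator[Tuple[int, List[int]]]:
    """Walk the segment stream and yield (opcode, payload) per instruction."""
    i = 0
    while i < len(words):
        header = words[i]
        opcode = (header >> 24) & 0xFF
        count = (header >> 16) & 0xFF
        end = i + 1 + count
        if end > len(words):
            raise ValueError("truncated instruction at segment %d" % i)
        yield opcode, words[i + 1:end]
        i = end


# An ordinary one-segment instruction followed by a longer multi-segment one.
stream = encode(0x01, []) + encode(0x02, [10, 20, 30])
decoded = list(decode_stream(stream))
```

Because the header carries the segment count, the decoder never needs a fixed instruction length; a real design would also have to define how segments map to the reconfigurable units, which this sketch leaves out.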
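Targeting the varying degrees of change in the network structure and computational details of typical speech recognition algorithms, this project researched and implemented a dual-reconfigurable, multi-level parallel computing hardware system. The main work was as follows: (1) the characteristics and basic computational structures of the common speech recognition algorithms provided by the partner organization were analyzed, and the reconfigurable computing units and modules to be implemented were determined; (2) the dual-reconfigurable, multi-level parallel hardware computing architecture and its computing units were researched and optimized; (3) optimization of the partner organization's example algorithms, high-precision parameter quantization, and custom instructions were studied, and a reconfigurable hardware implementation of the optimized algorithms was completed; (4) an FPGA-based hardware system was developed, typical speech recognition algorithms were deployed in hardware, and the system was debugged and tested. Simulation tests showed that the computational accuracy and resource usage of each hardware unit met the design requirements. The computing rate and dynamic power consumption of this system were compared with those of two common current GPUs on the partner organization's example speech recognition algorithm under the same computational workload; the results show that the system's energy efficiency is roughly one order of magnitude better than that of the GPUs. This research provides a complete reconfigurable, parallel, diversified, and energy-efficient hardware implementation scheme for typical sequential speech recognition intelligent network algorithms. The system design broadens the applicability of hardware accelerators: by combining a semi-custom implementation approach with targeted data pipelining and storage design, it adapts flexibly to varying degrees of change in algorithm structure. It also lays a solid foundation for a future chip implementation of the system.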
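The energy-efficiency comparison can be written compactly. The summary names the two measured quantities, computing rate and dynamic power, but does not give the exact metric, so the definition below (rate per unit of dynamic power) and the symbols R, P, and η are assumptions introduced here for illustration:

```latex
\eta = \frac{R}{P}, \qquad
\frac{\eta_{\mathrm{FPGA}}}{\eta_{\mathrm{GPU}}}
  = \frac{R_{\mathrm{FPGA}} / P_{\mathrm{FPGA}}}{R_{\mathrm{GPU}} / P_{\mathrm{GPU}}}
  \approx 10
```

Under this reading, "one order of magnitude" means the FPGA system delivers roughly ten times the computation per unit of dynamic power for the same workload.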
Data last updated: 2023-05-31