Deep machine learning and interactive querying of big data have developed significantly in recent years; as a result, in-memory computing technology has advanced rapidly and is attracting growing attention. Spark is currently the most popular in-memory computing platform; by exploiting fast memory access, it can greatly improve application performance. However, scientific applications are hard to run on the platform for two reasons: the Spark runtime environment does not support the C/C++ programming languages, and it does not offer a POSIX-like data access interface. This directly degrades the performance of applications running on Spark. Partial Wave Analysis (PWA) is an important data analysis method in High Energy Physics, but PWA applications can currently run only on a single PC, which cannot meet the demands of the ever higher event rates expected in the future. The project will proceed as follows: first, we will develop modules to construct a new Spark runtime environment that supports executing C/C++ programs; second, we will develop a new mechanism that allows the Alluxio in-memory file system to be accessed through POSIX interfaces; third, we will optimize the communication protocol between worker nodes to make it more efficient; and finally, we will study parallelization methods for partial wave analysis in High Energy Physics and parallelize the PWA method on the Spark platform. We expect the project to provide a better in-memory computing solution to the difficulties faced in scientific data analysis and computation.
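In recent years, deep machine learning and interactive big data querying have developed rapidly; the in-memory computing platform Spark emerged as a result and has attracted great attention from the scientific community. Its greatest advantage lies in using memory in place of disk, combined with cluster scalability, which greatly improves computing scale and execution efficiency. However, the current Spark platform does not support the programming languages commonly used in scientific computing or the standard interfaces for data access, so it is hard to adopt widely. Partial wave analysis is an important tool for high energy physics data analysis; at present its computing tasks can run only on a single machine, which cannot meet the challenging demands of larger-scale tasks in the future. This project will research and develop modules that support running compiled programming languages such as C++/C, building a cluster computing platform compatible with existing physics analysis software; redevelop the access interface of the Alluxio file system so that programs can access memory directly through standard interfaces; optimize the existing communication mechanism of Spark to achieve efficient communication between nodes; and study parallel algorithms for the partial wave analysis problem, developing parallel partial wave analysis software that runs on a Spark cluster. It is expected that, once the project is complete, the new platform will be highly general and will greatly improve application execution efficiency.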
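In recent years, deep machine learning and interactive big data querying have developed rapidly; the in-memory computing platform Spark emerged as a result and has attracted great attention from the scientific community. Its greatest advantage lies in using memory in place of disk, combined with cluster scalability, which greatly improves computing scale and execution efficiency. However, the current Spark platform does not support the programming languages commonly used in scientific computing or the standard interfaces for data access, so it is hard to adopt widely. Partial wave analysis is an important tool for high energy physics data analysis; at present its computing tasks can run only on a single machine, which cannot meet the challenging demands of larger-scale tasks in the future. This project developed modules that support running compiled programming languages such as C++/C, building a cluster computing platform compatible with existing physics analysis software; redeveloped the access interface of the Alluxio file system so that programs can access memory directly through standard interfaces; optimized the existing communication mechanism of Spark to achieve efficient communication between nodes; and designed parallel algorithms for the partial wave analysis problem, developing parallel partial wave analysis software that runs on a Spark cluster. The new platform developed by the project is highly general; execution time was reduced by 45.2% to 78.5%, greatly improving the efficiency of data storage and access as well as program execution.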
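The abstract does not describe the parallel algorithm itself. As a rough illustration of why partial wave analysis lends itself to cluster execution, the sketch below splits a likelihood sum over events into per-partition partial sums (a map step) that are then added together (a reduce step). Everything in it is an assumption made for illustration: the toy amplitudes, the function names `intensity`, `partial_nll`, `split`, and `parallel_nll`, and the omission of the normalization term that a real PWA fit would include. It is not the project's implementation and does not use Spark.

```python
import cmath
import math
from functools import reduce


def intensity(event, coeffs):
    """Toy intensity |sum_k c_k * A_k(event)|^2 (hypothetical amplitudes A_k)."""
    amplitudes = [cmath.exp(1j * (k + 1) * event) for k in range(len(coeffs))]
    total = sum(c * a for c, a in zip(coeffs, amplitudes))
    return abs(total) ** 2


def partial_nll(events, coeffs):
    """Map step: negative log-likelihood contribution of one partition."""
    return -sum(math.log(intensity(e, coeffs)) for e in events)


def split(events, n_parts):
    """Split the event list into n_parts roughly equal partitions."""
    return [events[i::n_parts] for i in range(n_parts)]


def parallel_nll(events, coeffs, n_parts=4):
    """Reduce step: add up the per-partition contributions."""
    partials = map(lambda part: partial_nll(part, coeffs), split(events, n_parts))
    return reduce(lambda a, b: a + b, partials, 0.0)


# Made-up event values and fit coefficients, for illustration only.
events = [0.1 * i for i in range(1, 101)]
coeffs = [1.0 + 0.0j, 0.5 + 0.2j]

serial = partial_nll(events, coeffs)      # all events in one pass
parallel = parallel_nll(events, coeffs)   # same sum, computed per partition
```

Because each partition's contribution depends only on its own events and the shared coefficients, the partitions can in principle be evaluated on different worker nodes, with only the partial sums exchanged between them.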
Data last updated: 2023-05-31
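Research on key technologies of cloud federation for high energy physics experiments
Research on key technologies for trusted security in high energy physics scientific computing environments
Research on system software mechanisms and key technologies for hybrid memory
Research on key technologies for network performance evaluation, prediction, and optimization in high energy physics computing environments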