Existing medical image storage techniques often cannot efficiently optimize analytical tasks over massive collections of medical images. To address this problem, this project studies the following issues. (1) Based on the physical characteristics of flash devices, we propose a cloud-based hybrid (row-column) storage technique. By adopting a column-based storage schema, which typically incurs significantly less I/O overhead, it addresses the inability of conventional medical image storage to optimize the performance of image-based analytical tasks. (2) For medical images stored in the column-based schema, we devise several precomputed distance metrics and their corresponding pruning rules, and integrate them into a newly designed R-Tree-based distributed multidimensional index. The index reduces I/O and computational overheads at the same time, so data access and analysis through it become significantly cheaper. (3) For the massive medical-image-related data that arrives quickly and continuously and is stored in the row-based schema, we propose a distributed, append-efficient multidimensional index that archives and indexes incoming data quickly. It employs a newly designed insertion algorithm whose node split strategy differs significantly from that of traditional R-Tree-like techniques: we revise some basic rules and inherent characteristics of the R-Tree and abandon optimization for updates, because medical data does not need to be updated during its lifecycle. As a result, the index rarely splits nodes and incurs only a small I/O and computational cost during indexing. It can therefore replace traditional medical image storage systems, which often suffer from expensive node splits and heavy I/O and computational overheads, and which cannot archive and index continuously arriving data in near real time. The project will produce several cloud-based medical image storage techniques that are useful not only for medical image storage management but also for many unstructured or multimedia data domains. The proposed techniques therefore have considerable potential commercial value.
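Existing medical image storage techniques offer only limited support for optimizing the performance of analytical tasks based on medical images. This project studies new storage management techniques to address this problem, including: (1) Based on the new physical characteristics of flash devices, we study hybrid row-column storage techniques suited to cloud environments. By introducing an adaptive mechanism for hybrid row-column storage and providing a column-oriented storage mode, we address the problem that medical imaging data storage management systems, because they do not exploit the characteristics of the storage mode, find it hard to efficiently support performance optimization for analytical tasks. (2) For medical imaging data stored in column mode, we study precomputable distance metrics and the corresponding pruning rules, and design a distributed indexing technique that reduces both I/O and computational overheads. When data is accessed and processed through this index, the I/O and computational overheads of analytical tasks on medical images are effectively reduced. (3) For medical-imaging-related data stored in row mode, we develop a distributed indexing technique to solve the problem that, when data enters the system continuously and at high speed, existing indexing techniques split nodes frequently, incur heavy I/O and computational overheads, and cannot build and maintain indexes for the data storage system in real time.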
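Item (2) combines distance values computed ahead of time with pruning rules. The abstract does not spell out those metrics or rules, so the following is only a minimal Python sketch of one generic idea of that kind, a single-pivot triangle-inequality bound, and not the project's index. The function names `build_pivot_table` and `knn_with_pruning`, the Euclidean metric, and the use of a single pivot are all assumptions made for illustration.

```python
import heapq
import math


def euclidean(a, b):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_pivot_table(vectors, pivot):
    """Precompute each vector's distance to a fixed pivot (done once, at index time).

    Hypothetical helper: the pivot choice and the metric are assumptions.
    """
    return [euclidean(v, pivot) for v in vectors]


def knn_with_pruning(query, vectors, pivot, pivot_dists, k):
    """Exact kNN that skips full distance computations when a lower bound rules them out.

    By the triangle inequality, |d(q, p) - d(v, p)| <= d(q, v), so the
    precomputed d(v, p) yields a cheap lower bound on the true distance.
    Returns (neighbors, computed): a sorted list of (distance, index)
    pairs and the number of full distance computations performed.
    """
    q_to_pivot = euclidean(query, pivot)
    # Visit candidates in order of increasing lower bound.
    order = sorted(range(len(vectors)),
                   key=lambda i: abs(q_to_pivot - pivot_dists[i]))
    heap = []  # max-heap of the best k so far, stored as (-distance, index)
    computed = 0
    for i in order:
        lower_bound = abs(q_to_pivot - pivot_dists[i])
        # Once the bound reaches the current k-th best distance, no
        # remaining candidate can improve the result, so stop early.
        if len(heap) == k and lower_bound >= -heap[0][0]:
            break
        d = euclidean(query, vectors[i])
        computed += 1
        if len(heap) < k:
            heapq.heappush(heap, (-d, i))
        elif d < -heap[0][0]:
            heapq.heapreplace(heap, (-d, i))
    return sorted((-neg_d, i) for neg_d, i in heap), computed
```

Calling `build_pivot_table` once and reusing its result across queries is what makes the bound cheap. The second return value of `knn_with_pruning` shows how many full distance computations were actually needed, which is a rough stand-in for the computational cost the text aims to cut.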
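In application environments built on cloud computing platforms, medical imaging data storage management systems face a new technical challenge: designing an efficient storage management system that effectively optimizes the performance of analytical tasks over massive medical images, thereby giving better foundational system support to various medical cloud applications. The research of this project centers on the key technical problems within this challenge. The research content and goals can be summarized as follows: in a cloud computing environment, study how to design and implement high-performance hybrid row-column storage and distributed multidimensional indexing techniques on new hardware such as flash memory, so as to optimize the I/O, computational, and other overheads of analytical tasks over massive medical images.
After four years of research, the project team obtained a series of results in "memory-flash-disk" hybrid storage, retrieval-optimized high-dimensional indexing, and high-performance multidimensional index construction. These include: the hybrid-storage-based data replacement policies CRSR and CRSR+; the high-dimensional index DPR-Tree, which optimizes both I/O and computational overheads during retrieval, together with its kNN pruning technique PDP; the append-efficient multidimensional index AER-Tree; and optimization techniques for the distributed parallel construction of bitmap indexes. These results substantially improved the storage and access performance of high-dimensional feature data from medical images and strengthened the performance support that the storage layer gives to the analytical application layer. The project team achieved the expected research results and met the expected assessment targets, publishing 37 papers in total, including 7 papers in Chinese core journals and 19 EI-indexed papers. The team applied for 9 invention patents (1 of them granted) and registered 3 software copyrights. Chen Mei, a principal participant in the project, was awarded one Regional Science Fund project of the National Natural Science Foundation of China.
The team's results on cache optimization for "memory-flash-disk" hybrid storage systems have been applied in FastDB, a storage management system for massive sky-survey data developed jointly by the team of project leader Li Hui and the team of researcher Zhu Ming at the Science Department of FAST (the Five-hundred-meter Aperture Spherical radio Telescope) of the National Astronomical Observatories. FastDB is the supporting software for scientific research on FAST sky-survey data. It adopts the hybrid storage management techniques developed by the project team, such as CRSR+, and has achieved considerable improvements in data storage and access performance.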
Data last updated: 2023-05-31
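On the Impact of the Big Data Environment on the Development of Information Science
Channel Allocation Strategies for Low Earth Orbit Satellite Communication
An Empirical Analysis of the Impact of Industrial Structure Adjustment on Water Use Efficiency in Resource-Based Regions: Evidence from 10 Resource-Based Provinces in China
Classified Prediction of Bus Passenger Flow with a Multi-Source Data-Driven CNN-GRU Model
Research Progress on the Wnt Signaling Pathway in Non-Small Cell Lung Cancer
Research on Computer-Aided Diagnosis Techniques Based on Parallel Data Mining of Massive Medical Images on Cloud Platforms
Research on Integrity Proofs for Massive Image Data in Remote Sensing Cloud Service Platforms
Massive Network Data Management and Search Techniques Based on Cloud Computing
Research on Core Techniques of Cloud Storage Platforms Adapted to Highly Concurrent Write Operations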