"The near-infrared drug supervision vehicles system" in China plays a very important role to guarantee drug safety. When the near infrared spectroscopy instruments are cooperatively used in the mode of networking, we termed them the near infrared (NIR) Internet of Things (IOTs), and in recently it begins to form a new trend. However, the standardization of the spectrum, the design of high-performance identification algorithm, and the capability of construction a huge amount of models in a very short time are great challenges posed by this kind of NIR IOTs. These problems will be deeply studied in this program. First, the spectrum differences among different instruments in measuring the same sample need to be corrected, and the conditional random fields (CRFs) will be chosen to cut off this kind of systematic bias and obtain the "true spectroscopy" of the sample. The second problem is that the fake drug identification model should handle the severe imbalance of number of the samples between the negative class and the positive one, and the cost sensitive which means that cost differs greatly when two kinds of identification errors occur. We will use the scalable convex hull maximum margin classifier(SCHMMC) to solve this problem, and whose strong generalization capability is assured by the theory of statistical learning theory. In the third place, the computing capability for massive models' building is poor for the commonly used PCs. So, the mainframe CPU-GPU heterogeneous platform is chosen for parallel computing, and those CPUs and GPUs are scheduled to collaboratively compute in the highest efficiency by a task-resources scheduler, which ensures the high performance parallel computing of SCHMMC modeling and accelerates the modeling with a very high speed-up ratio. 
The result of this project has great significance for aspects such as Chemometric, machine learning, has practical value for guarantee of drug safety, and helps to promote the development of China's fast inspection industry.
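The cost-sensitivity requirement described above can be illustrated with the standard minimum-expected-cost decision rule. The sketch below is not the SCHMMC; it only shows, assuming some classifier that outputs a probability `p_fake` and two hypothetical error costs, why unequal costs move the decision threshold away from 0.5. The function names and the cost values are illustrative assumptions, not part of the project.

```python
def min_cost_label(p_fake, cost_missed_fake, cost_false_alarm):
    """Pick the label with the lower expected cost.

    p_fake: estimated probability that the sample is a fake drug (assumed input).
    cost_missed_fake: hypothetical cost of passing a fake drug as genuine.
    cost_false_alarm: hypothetical cost of flagging a genuine drug as fake.
    """
    expected_cost_if_pass = p_fake * cost_missed_fake
    expected_cost_if_flag = (1.0 - p_fake) * cost_false_alarm
    return "fake" if expected_cost_if_flag < expected_cost_if_pass else "genuine"


def decision_threshold(cost_missed_fake, cost_false_alarm):
    """Probability above which flagging the sample is the cheaper decision."""
    return cost_false_alarm / (cost_false_alarm + cost_missed_fake)
```

With equal costs the threshold is 0.5. If a missed fake is assumed to be nine times as costly as a false alarm, the threshold drops to 0.1, so a sample with `p_fake = 0.2` is flagged even though it is more likely genuine than fake.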
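The nationwide "vehicle-mounted near-infrared rapid drug detection system" plays an important role in guaranteeing drug safety, and networking its analyses into a near-infrared (NIR) Internet of Things is becoming a trend. However, NIR spectrum standardization, the design of high-performance identification algorithms, and the rapid construction of large numbers of models are difficult problems that networked analysis must solve. This project will study these problems in depth. It will apply conditional random fields to correct the effect of instrument error on the spectra and obtain the "true spectrum" of a sample, making the sample spectrum as independent of the instrument and the measurement environment as possible. It will use statistical learning theory to design a scaled convex hull maximum margin classifier (SCHMMC) that addresses the class imbalance and cost sensitivity found in genuine/fake drug identification while retaining strong generalization ability. It will use a mainstream CPU-GPU heterogeneous hardware platform, with optimized task-resource scheduling so that the CPUs and GPUs compute cooperatively at the highest efficiency, to achieve high-speed parallel computation of the SCHMMC and greatly increase spectral modeling speed. The results of this project have considerable theoretical significance for NIR spectroscopy, chemometrics, and machine learning, have practical value for guaranteeing drug safety, and will help advance China's rapid inspection industry and the analytical-instrument Internet of Things.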
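To make the idea of task-resource scheduling concrete, here is a minimal sketch of a greedy earliest-finish-time heuristic that assigns independent modeling tasks to devices of different speeds. It is a generic baseline under stated assumptions, not the scheduler proposed in this project: the task costs, the device names, and the speeds are all hypothetical, and tasks are assumed to be independent with no data-transfer overhead.

```python
def schedule_tasks(task_costs, device_speeds):
    """Greedily assign independent tasks to heterogeneous devices.

    task_costs: work units per task (hypothetical values).
    device_speeds: dict mapping device name to work units per second (hypothetical).
    Returns (assignment, makespan), where assignment maps each device
    name to the list of task indices it runs.
    """
    ready_at = {name: 0.0 for name in device_speeds}
    assignment = {name: [] for name in device_speeds}
    # Place the longest tasks first so short ones can fill the gaps.
    order = sorted(range(len(task_costs)), key=lambda i: task_costs[i], reverse=True)
    for i in order:
        # Choose the device on which this task would finish earliest.
        best = min(
            device_speeds,
            key=lambda d: ready_at[d] + task_costs[i] / device_speeds[d],
        )
        ready_at[best] += task_costs[i] / device_speeds[best]
        assignment[best].append(i)
    return assignment, max(ready_at.values())
```

For example, `schedule_tasks([8, 4, 4, 2], {"cpu": 1.0, "gpu": 4.0})` sends most of the work to the faster device but still gives the slower one a task when that shortens the overall finish time, which is the intuition behind letting CPUs and GPUs compute cooperatively.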
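Addressing the new demands of large-scale networked and distributed non-destructive testing, and taking as its application background a core technology of the national drug inspection vehicles, namely rapid, on-site, non-destructive drug quality supervision by near-infrared (NIR) spectroscopy, this project studied several key scientific problems in NIR spectral big-data analysis and high-performance modeling. To address the inter-instrument differences that arise when multiple NIR spectrometers take measurements, we studied and proposed a univariate linear regression direct standardization algorithm that effectively eliminates the systematic bias. We studied several new spectral modeling algorithms in depth. We proposed a variable selection method that combines LAR with genetic partial least squares; it effectively selects a small number of characteristic wavelength points and improves prediction accuracy and speed. For the class imbalance between genuine and fake drugs in drug spectral data, we combined balance cascade with sparse classification and proposed a cascaded sparse classification method for drug identification. Based on a probability estimation model, we proposed a cost-sensitive sparse representation classification algorithm. We proposed a classification method based on a waveform-superposition extreme learning machine; the resulting classifier learns quickly and is insensitive to the training samples. We also proposed a classification algorithm based on the scaled convex hull maximum margin. We studied in depth how to build deep learning models for NIR spectral analysis. Combining the strong learning ability of autoencoder networks with the low sensitivity of sparse representation to data imbalance, we proposed NIR-based methods for identifying genuine and fake drugs built on sparse denoising autoencoders and stacked contractive autoencoders. To overcome the overfitting of deep learning on small-sample datasets, we proposed an NIR drug identification method based on a dropout deep belief network; it is accurate and stable and suits NIR spectral classification modeling with small samples. Because chemometric algorithms, such as genuine/fake drug identification methods, can all be formulated as optimization problems, computing them efficiently is a key technical problem for big-data applications. To make NIR big-data modeling more efficient, we studied parallel cooperative computing and optimized task-resource scheduling in a CPU+GPU heterogeneous environment. We proposed a CUDA-based parallel cuckoo search algorithm that fully exploits the parallel computing power of the GPU and achieves a speed-up of up to 110 times, and we developed CUDA-based implementations of PLS and several deep learning algorithms. To meet the demand for efficient modeling of NIR big data, we proposed modeling on multi-GPU servers, together with a cuckoo-search-based multiprocessor task scheduling algorithm that makes effective use of multi-GPU parallelism and speeds up modeling. The results of this project can provide a theoretical basis and methodological support for big-data analysis and high-performance computing in fields such as optical spectroscopy, chromatography, magnetic resonance spectroscopy, and mass spectrometry, and for cyber-physical systems that integrate distributed sensing, computing, and control.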
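The summary above does not spell out the univariate linear regression direct standardization algorithm, so the following is only a sketch of one plausible reading: for each wavelength, fit a simple least-squares line that maps the readings of a secondary ("slave") instrument onto those of a reference ("master") instrument, using transfer samples measured on both. The function names, the master/slave terminology, and the per-wavelength formulation are assumptions made for illustration.

```python
def fit_univariate_ds(slave, master):
    """Fit one (slope, intercept) pair per wavelength by ordinary least squares.

    slave, master: lists of spectra for the same transfer samples, measured on
    the two instruments over the same wavelength grid (assumed setup).
    """
    n_samples = len(slave)
    n_wavelengths = len(slave[0])
    coeffs = []
    for j in range(n_wavelengths):
        x = [spectrum[j] for spectrum in slave]
        y = [spectrum[j] for spectrum in master]
        mean_x = sum(x) / n_samples
        mean_y = sum(y) / n_samples
        sxx = sum((xi - mean_x) ** 2 for xi in x)
        sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
        # With no variation at this wavelength, fall back to a pure offset.
        slope = sxy / sxx if sxx > 0 else 1.0
        intercept = mean_y - slope * mean_x
        coeffs.append((slope, intercept))
    return coeffs


def apply_univariate_ds(spectrum, coeffs):
    """Map one slave-instrument spectrum onto the master instrument's scale."""
    return [slope * value + intercept for value, (slope, intercept) in zip(spectrum, coeffs)]
```

In use, `fit_univariate_ds` would be called once per secondary instrument on the transfer set, and `apply_univariate_ds` on every new spectrum from that instrument before it reaches a model trained on the reference instrument. A per-wavelength line can only remove gain and offset differences; wavelength shifts between instruments would need a different correction.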
Data last updated: 2023-05-31