RNA single-molecule nanopore detection is regarded as one of the newest and fastest-growing detection technologies in genomics. However, because the sensitivity and accuracy of existing equipment and laboratory techniques are limited, the differences in current between RNA bases remain small and the noise remains relatively high, which leads to a high error rate in identifying RNA bases and base modifications. It is therefore necessary to establish a machine-learning-based identification method to improve the detection accuracy of nanopore detection. In this project, we first address the low signal-to-noise ratio of RNA single-molecule nanopore detection by developing a novel signal-noise separation method based on time-frequency domain transformation and wavelet analysis. Second, to handle the mapping problem caused by the large scale, high dimensionality, and nonlinearity of nanopore genetic data, we plan to develop an efficient nonlinear dimensionality reduction method based on t-distributed stochastic neighbor embedding (t-SNE). Third, to recognize and classify bases and base modifications when the features of gene bases are class-imbalanced, we plan to propose a feature recognition and classification method based on class-imbalance learning and a long short-term memory (LSTM) network. Finally, all the proposed methods will be integrated into a multiplex cancer microRNA detection framework to improve the performance of RNA single-molecule nanopore detection in identifying base modifications.
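RNA single-molecule nanopore detection is an emerging detection technology in genomics. However, because the measurement sensitivity of existing instruments and experimental techniques is limited, the current changes between bases are weak and the noise is high, so the error rate of existing base identification is relatively high, and machine learning methods are urgently needed to improve the accuracy of nanopore detection. This project therefore addresses the low signal-to-noise ratio of RNA single-molecule detection by studying a signal-noise separation method based on time-frequency domain transformation and wavelet decomposition; addresses the large-scale, high-dimensional, nonlinear nature of nanopore detection signals by studying a nonlinear mapping dimensionality reduction method based on stochastic neighbor embedding; and addresses the imbalance in the amount of feature data in genetic data, and the inability to identify base-modification features effectively, by studying a feature recognition method based on long short-term memory networks and imbalanced learning. The end goal is to integrate a comprehensive machine learning strategy that can be used to identify base features of cancer microRNA molecules and improve the accuracy of nanopore detection.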
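High-throughput screening (HTS) technology, represented by nanopore detection and X-ray diffraction lenses, is a new technical system formed by organically combining advanced technologies such as molecular biology, cell biology, computing, and automatic control. It makes it possible to run a large number of parallel experiments at the same time. However, because of the mechanism by which high-throughput screening electrical signals are generated and the characteristics of the resulting data, analyzing the data manually poses great difficulties and challenges. At the same time, artificial intelligence methods such as machine learning make efficient analysis of large batches of high-throughput data possible. Focusing on the low signal-to-noise ratio of electrical signals, this project carried out systematic research on the low signal-to-noise ratio, data dimensionality reduction and feature extraction, and time-frequency domain analysis and feature recognition in high-dimensional signals, and obtained related research results.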
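To illustrate the kind of wavelet-based signal-noise separation the abstracts refer to, here is a minimal sketch in Python. It assumes an orthonormal Haar wavelet, soft thresholding, and the universal threshold with a median-based noise estimate. These choices, the function names, and the synthetic two-level current trace are assumptions for illustration only; the text does not specify the project's actual method.

```python
import math
import random
from statistics import median


def haar_forward(x):
    """One level of the orthonormal Haar transform (len(x) must be even)."""
    s = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Invert one level of the orthonormal Haar transform."""
    s = math.sqrt(2.0)
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / s)
        out.append((a - d) / s)
    return out


def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold`."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def denoise(signal, levels=3):
    """Haar wavelet denoising with a universal soft threshold (hypothetical helper)."""
    n = len(signal)
    if n == 0 or n % (2 ** levels) != 0:
        raise ValueError("signal length must be a positive multiple of 2**levels")

    # Decompose: details[0] holds the finest-scale coefficients.
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_forward(approx)
        details.append(detail)

    # Estimate the noise level from the finest details (MAD / 0.6745),
    # then apply the universal threshold sigma * sqrt(2 ln n).
    sigma = median(abs(v) for v in details[0]) / 0.6745
    threshold = sigma * math.sqrt(2.0 * math.log(n))
    details = [[soft_threshold(v, threshold) for v in d] for d in details]

    # Reconstruct from the coarsest level back to the finest.
    for detail in reversed(details):
        approx = haar_inverse(approx, detail)
    return approx


if __name__ == "__main__":
    # Synthetic trace (assumed values): two current levels plus Gaussian noise.
    random.seed(0)
    clean = [100.0] * 128 + [80.0] * 128
    noisy = [c + random.gauss(0.0, 2.0) for c in clean]
    smoothed = denoise(noisy, levels=3)
    print(smoothed[:4], smoothed[-4:])
```

The Haar wavelet is used here only because it fits step-like current levels and needs no external library. A real pipeline would more likely use a dedicated wavelet library and a wavelet family chosen for the measured signal.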
Data last updated: 2023-05-31
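Genome-wide association analysis of leaf orientation value in maize
Preparation and properties of a light- and electricity-driven biochar/stearic acid composite phase change material
Study on CFRP strengthening of rib-to-deck fatigue cracks in orthotropic steel bridge decks
Hardware Trojans: research progress on key issues and new trends
Study on direct brain control of robot direction and speed based on SSVEP
Nanopore single-molecule study of chiral DNA bases
Research on DNA single-base mutation detection technology based on nanoparticle encoding
Single-molecule localization detection of weak vibration signals at room temperature
Study of a single-base extension tag reaction gene chip for genetic testing of inbred mice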