Acta Chimica Sinica ›› 2023, Vol. 81 ›› Issue (8): 912-919. DOI: 10.6023/A23040113

Special Issue: Collection Celebrating the 90th Anniversary of Acta Chimica Sinica

Article

A Time-Series Signal Classification Algorithm and Its Application to Nanopore Ionic Current Signal Identification

Xue Ni a, Kaili Xin b, Zhengli Hu b,*, Cuiling Jiang a,*, Yongjing Wan a, Yi-Lun Ying b, Yi-Tao Long b

  a. School of Information Science and Engineering, East China University of Science and Technology, Shanghai 200237
  b. School of Chemistry and Chemical Engineering, Molecular Sensing and Imaging Center (MSIC), Nanjing University, Nanjing 210023
  • Received: 2023-04-03 Published: 2023-09-14
  • Contact: *E-mail: zhenglihu@nju.edu.cn; cuilingjiang@ecust.edu.cn
  • About author:
    Dedicated to the 90th anniversary of Acta Chimica Sinica.
    † These authors contributed equally to this work.
  • Supported by:
    Ministry of Science and Technology Key R&D Program of China (2022YFA1304604); National Natural Science Foundation of China (22106066); National Natural Science Foundation of China (22027806)

Nanopore-based single-molecule analysis usually relies on time-domain features, such as time-current scatter plots of blockade currents, for event recognition. However, because these time-domain features overlap, substances with extremely similar molecular structures are difficult to discriminate accurately with traditional nanopore recognition methods. Differences in deep feature representations need to be fully explored to obtain credible recognition results and thereby improve the recognition accuracy of nanopore ionic current signals. Here, a time-series signal classification algorithm is proposed. First, the original signal is framed with overlapping sliding windows to generate sub-signals, from which shallow feature information is extracted. Second, a time-series signal classification network based on Emphasized Channel Attention, Propagation and Aggregation in Time Delay Neural Network (ECAPA-TDNN) is used to build a multi-branch inter-layer feature fusion model for deep feature extraction. In this model, the multi-branch multi-level attention module (RepVGG-SE-Res2Block, RSR-Block) obtains multi-scale features by constructing a feature pyramid structure within each residual block and uses structural reparameterization to speed up inference while preserving model performance; Adaptively Spatial Feature Fusion (ASFF) is introduced to fuse the features of different layers in the network. Finally, a credible statistical prediction strategy yields reliable classification results by compiling statistics over the classification probabilities of the sub-signals.
The experimental results show that for the peptide sequences N'-DDFFIFFDD-C' (DF_I) and N'-DDFFLFFDD-C' (DF_L), which differ in only one amino acid, I (isoleucine) versus L (leucine), two residues that are isomers of each other, the algorithm achieves a recognition accuracy of 99.00%, markedly improving the sensing capability of nanopores for single molecules with similar or even identical molecular weights.
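
The first and last steps of the pipeline described in the abstract (framing with overlapping sliding windows, then a statistical decision over sub-signal probabilities) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `frame_signal` and `aggregate_predictions`, the `window`, `hop`, and `min_confidence` parameters, and the choice of averaging the per-frame probabilities are all assumptions made for illustration, and the per-frame probabilities below are toy values standing in for the output of the deep network.

```python
import numpy as np


def frame_signal(signal, window, hop):
    """Split a 1-D signal into frames of length `window`, one every `hop` samples.

    Choosing hop < window makes consecutive frames overlap.
    (Hypothetical helper; parameter names are illustrative.)
    """
    signal = np.asarray(signal, dtype=float)
    if window <= 0 or hop <= 0:
        raise ValueError("window and hop must be positive")
    if signal.size < window:
        raise ValueError("signal is shorter than one window")
    n_frames = 1 + (signal.size - window) // hop
    # Row i holds the sample indices of frame i: i*hop ... i*hop + window - 1.
    idx = np.arange(window)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def aggregate_predictions(frame_probs, min_confidence=0.0):
    """Combine per-frame class probabilities into one event-level decision.

    Assumption: the probabilities are simply averaged over frames, and the
    event is rejected (label None) if the winning mean probability is below
    `min_confidence`. The paper's actual strategy may differ.
    """
    frame_probs = np.asarray(frame_probs, dtype=float)
    mean_probs = frame_probs.mean(axis=0)
    label = int(mean_probs.argmax())
    confidence = float(mean_probs[label])
    if confidence < min_confidence:
        return None, confidence
    return label, confidence


# Toy usage: a 10-sample "event" framed with 50% overlap.
event = np.arange(10.0)
frames = frame_signal(event, window=4, hop=2)  # frames start at samples 0, 2, 4, 6

# Stand-in for classifier output: one row of class probabilities per frame.
toy_probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.3, 0.7], [0.8, 0.2]])
label, confidence = aggregate_predictions(toy_probs, min_confidence=0.6)
print(frames.shape, label, round(confidence, 2))
```

Pooling the evidence from all frames, rather than trusting a single sub-signal, means that one ambiguous frame (the third row of the toy probabilities) does not flip the event-level label.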

Key words: nanopore analysis, time-series signal, deep feature extraction, inter-layer feature fusion, credible statistical predictions