Acta Chimica Sinica (化学学报), 2010, Vol. 68, Issue 24: 2595-2599.

Research Article

A New Method for Identifying Protein Active Sites Based on the Complex Wavelet Energy Scalogram

田元新1,段文军1,邹小勇*,2   

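  1. (1 School of Pharmaceutical Sciences, Southern Medical University, Guangzhou 510515)
    (2 School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou 510275)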
  • Received: 2009-10-29 Revised: 2010-07-31 Published: 2010-08-27
  • Corresponding author: 邹小勇 (Zou Xiaoyong) E-mail: ceszxy@mail.sysu.edu.cn
  • Funding:

    National Natural Science Foundation of China 20975117; 30772602

A Unique Method to Identify Protein Active Sites Based on Energy Scalogram of Complex Wavelet Transform

Tian Yuanxin1, Duan Wenjun1, Zou Xiaoyong*,2

  1. (1 School of Pharmaceutical Sciences, Southern Medical University, Guangzhou 510515)
    (2 School of Chemistry and Chemical Engineering, Sun Yat-Sen University, Guangzhou 510275)
  • Received:2009-10-29 Revised:2010-07-31 Published:2010-08-27
  • Contact: Zou Xiaoyong E-mail: ceszxy@mail.sysu.edu.cn

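Identifying protein active sites is important for understanding protein function and for computer-aided drug design. A new method for identifying protein active sites is established based on the complex wavelet energy scalogram: the digitized protein sequence is subjected to a one-dimensional continuous wavelet transform using the complex Morlet wavelet. The results show that, in the time-frequency analysis, regions of concentrated energy are often closely associated with the active sites of the protein, and that the complex wavelet energy maxima of homologous protein sequences usually lie at the same frequency. This indicates that the wavelet power spectrum has broad prospects for predicting protein active sites.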

Key words: wavelet energy scalogram, active site, protein

Identifying protein active sites is important for understanding protein function and is crucial to computer-aided drug design. In this paper, a unique method is presented to identify protein active sites based on time-frequency analysis by the continuous wavelet transform (CWT). Numeric protein sequences are transformed with the complex Morlet wavelet, and active sites are identified from the energy scalogram. The results for hemoglobin and calcium-binding proteins indicate that the energy maximum of homologous proteins always appears at the same frequency, termed the characteristic frequency. In the energy scalogram, the time-domain positions corresponding to the characteristic frequency are the active sites of the protein. The domains where energy is concentrated are the critical domains of the protein, and these are usually conserved among homologous proteins. The energy center can therefore be used to identify protein active sites. The proposed method has the potential to help explore protein function.

Key words: wavelet energy scalogram, protein sequence, active site
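
The pipeline summarized in the abstract (numeric sequence, complex Morlet CWT, energy scalogram, location of the energy maximum) can be illustrated with a minimal sketch. It rests on several assumptions that do not come from the paper: NumPy is used, the wavelet is a standard complex Morlet with w0 = 6, the scale range is arbitrary, and the input is a synthetic signal standing in for an encoded protein sequence, since the abstract does not state the residue-to-number mapping. The names `morlet` and `cwt_energy` are hypothetical helpers, not the authors' code.

```python
import numpy as np


def morlet(t, w0=6.0):
    """Complex Morlet mother wavelet (small admissibility correction omitted)."""
    return np.pi ** -0.25 * np.exp(1j * w0 * t - 0.5 * t ** 2)


def cwt_energy(signal, scales, w0=6.0):
    """Return the energy scalogram |CWT|^2, shape (len(scales), len(signal))."""
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove the constant offset before transforming
    n = x.size
    energy = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        half = int(np.ceil(4 * s))  # truncate the wavelet at +/- 4 scale widths
        t = np.arange(-half, half + 1) / s
        psi = morlet(t, w0) / np.sqrt(s)
        # Correlating with conj(psi) equals convolving with its time reverse.
        full = np.convolve(x, np.conj(psi)[::-1], mode="full")
        energy[i] = np.abs(full[half:half + n]) ** 2
    return energy


# Synthetic stand-in for a numerically encoded sequence (assumption, NOT real
# protein data): a periodic pattern localized around position 120.
n, center, freq = 200, 120, 0.125
pos = np.arange(n)
signal = np.exp(-0.5 * ((pos - center) / 10.0) ** 2) * np.cos(2 * np.pi * freq * pos)

w0 = 6.0
scales = np.arange(2, 21)  # arbitrary illustrative scale range
energy = cwt_energy(signal, scales, w0)

# The energy maximum gives a candidate position and an approximate frequency.
row, peak_pos = np.unravel_index(np.argmax(energy), energy.shape)
peak_freq = w0 / (2 * np.pi * scales[row])  # approximate cycles per position
print(f"energy maximum at position {peak_pos}, frequency ~{peak_freq:.3f}")
```

In this sketch the row of the scalogram holding the global maximum plays the role of the characteristic frequency, and the positions with high energy in that row are the candidate sites. For real sequences, the choice of numeric encoding and scale range would strongly affect the result and should follow the paper's own settings.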