Acta Chimica Sinica ›› 2011, Vol. 69 ›› Issue (16): 1845-1850.

Research Article

A New Fingerprint of Chemical Compounds and Its Application to Drugs Virtual Screening

SHENG Zhen, HUANG Qi, KANG Hong, LIU Qi, CAO Zhi-Wei, ZHU Rui-Xin

  1. (School of Life Science and Technology, Tongji University, Shanghai 200092)
  • Received: 2011-05-04 Revised: 2011-06-20 Published: 2011-07-14
  • Contact: Ruixin Zhu E-mail: rxzhu@tongji.edu.cn

Similarity searching is one of the most widely used techniques for virtual screening in large-scale drug discovery programmes, and molecular descriptors, as one of its principal components, play a crucial role in it. However, no single set of descriptors has yet been found that describes compounds comprehensively. Recently, several research groups have carried out similarity searching by fusing different types of structural fingerprints. Because these descriptors are all derived from the structure of compounds, fusing them does not guarantee a more comprehensive description and introduces serious redundancy. Here, following the philosophical principle that a thing should be described by both its essence and its extension, the Gene Ontology (GO) fingerprint is presented as a novel activity descriptor built on GO terms, and compounds are described by the structure fingerprint (essence) together with the GO fingerprint (extension). Unlike activity fingerprints built directly from the gene expression data of compounds, the GO fingerprint (1) reduces the dimensionality of microarray data, avoiding its problems of high dimensionality, strong correlation and heavy noise, and (2) shortens the distance between the descriptor and the biological activity of compounds. Compared with similarity searching that uses the structure fingerprint or the GO fingerprint alone, the integrated method using both types shows better descriptive ability in two respects. First, it gives a higher score to compounds that are similar to the query in both structure and bioactivity. Second, it effectively excludes compounds that are similar to the query in only one of the two aspects. In conclusion, this study offers a new approach to fast, effective, large-scale drug screening. It is expected to improve the results of virtual screening and to accelerate both the discovery of new drugs and the repurposing of existing drugs.

Key words: structure fingerprint, function fingerprint, gene ontology, drug screening, molecular descriptors
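
The fused similarity search described in the abstract can be illustrated with a minimal sketch. The abstract does not state which similarity coefficient or fusion rule the authors used, so the Tanimoto coefficient and the product of the two scores below are assumptions chosen only for illustration, and the compound names and bit sets are hypothetical toy data. Each compound is represented by two sets of on-bit indices, one for the structure fingerprint and one for the GO fingerprint.

```python
from typing import Dict, List, Set, Tuple

Fingerprint = Set[int]                      # indices of the bits that are set
Compound = Tuple[Fingerprint, Fingerprint]  # (structure bits, GO bits)


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient of two binary fingerprints (assumed metric)."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def fused_similarity(query: Compound, candidate: Compound) -> float:
    """Combine structural and GO similarity.

    The product is an assumed fusion rule: it is high only when both
    similarities are high, so one-sided matches are pushed down the ranking.
    """
    s_struct = tanimoto(query[0], candidate[0])
    s_go = tanimoto(query[1], candidate[1])
    return s_struct * s_go


def rank(query: Compound, library: Dict[str, Compound]) -> List[Tuple[str, float]]:
    """Return library compounds sorted by decreasing fused similarity."""
    scored = [(name, fused_similarity(query, c)) for name, c in library.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical toy data, not taken from the paper.
query: Compound = ({1, 2, 3, 4}, {10, 11, 12})
library: Dict[str, Compound] = {
    "both_similar": ({1, 2, 3, 5}, {10, 11, 13}),
    "structure_only": ({1, 2, 3, 4}, {20, 21, 22}),
    "function_only": ({7, 8, 9}, {10, 11, 12}),
}

ranking = rank(query, library)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In this toy library, the compound that resembles the query in both fingerprints ranks first, while the compounds that match in structure only or in GO terms only receive a fused score of zero. Other fusion rules (for example a weighted mean) would behave differently and may be closer to what the paper actually does.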
