Review

Artificial Intelligence for Contemporary Chemistry Research

  • Zhu Boyang,
  • Wu Ruilong,
  • Yu Xi
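  • a Department of Chemistry, Tianjin University, Tianjin 300072, China;
    b Tianjin Key Laboratory of Molecular Optoelectronic Sciences, Tianjin 300072, China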
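Zhu Boyang. Main research interests: artificial intelligence and back-end software development.
Wu Ruilong. Entered Tianjin University in 2017 and is currently a third-year undergraduate in the Department of Chemistry, School of Science, working under the supervision of Yu Xi on an undergraduate innovation project that applies machine learning to chemical systems.
Yu Xi. Research fellow and professor in the Department of Chemistry and the Department of Physics, School of Science, Tianjin University. Main research interests: quantum transport of charge in microscopic systems, molecular electronics, micro/nano optoelectronic devices, and AI-assisted development of organic optoelectronic materials.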

Received date: 2020-07-12

  Online published: 2020-08-21

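Funding

Project supported by the National Natural Science Foundation of China (Nos. 21973069, 21773169, 21872103), the National Key R&D Program (Nos. 2017YFA0204503, 2016YFB0401100), the PEIYANG Young Scholars Program of Tianjin University (No. 2018XRX-0007) and the College Student Innovation and Entrepreneurship Training Program of Tianjin University (No. 201910056451).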

Artificial Intelligence for Contemporary Chemistry Research

  • Zhu Boyang,
  • Wu Ruilong,
  • Yu Xi
  • a Department of Chemistry, Tianjin University, Tianjin 300072, China;
    b Tianjin Key Laboratory of Molecular Optoelectronic Sciences, Tianjin 300072, China

Received date: 2020-07-12

  Online published: 2020-08-21

Supported by

Project supported by the National Natural Science Foundation of China (Nos. 21973069, 21773169, 21872103), National Key R&D Program (Nos. 2017YFA0204503, 2016YFB0401100), the PEIYANG Young Scholars Program of Tianjin University (No. 2018XRX-0007) and the College Student Innovation and Entrepreneurship Training Program of Tianjin University (No. 201910056451).

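Abstract

Artificial intelligence, with machine learning as its leading example, is playing an increasingly important role in contemporary scientific research. Unlike traditional computer programs, machine learning repeatedly analyzes large amounts of data and optimizes its own model, a process known as "learning", so that it can uncover the relationships among objective phenomena hidden in the data, form new models with better predictive and decision-making ability, and make sound judgments. The characteristics of chemical research play precisely to the strengths of machine learning. Chemical research often deals with highly complex material systems and experimental processes, which makes accurate analysis and judgment from the principles of chemical physics alone very difficult. Artificial intelligence can mine the correlations in the massive experimental data generated by chemical experiments, help chemists make reasonable analyses and predictions, and greatly accelerate chemical research and development. This article introduces contemporary artificial intelligence methods and the basic principles of applying them to chemical problems, and uses specific case studies to show how AI helps solve different chemical R&D problems and which machine learning algorithms are involved. Attempts to apply AI to chemical science are on a vigorous upswing, and AI has already begun to show how powerfully it can assist chemical research. We hope this article helps more chemists in China understand and use this powerful tool.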

Cite this article

Zhu Boyang, Wu Ruilong, Yu Xi. Artificial Intelligence for Contemporary Chemistry Research[J]. Acta Chimica Sinica, 2020, 78(12): 1366-1382. DOI: 10.6023/A20070306

Abstract

Artificial intelligence (AI), especially machine learning, is playing an increasingly important role in contemporary scientific research. Unlike a traditional computer program, machine learning can analyze large amounts of data repeatedly and optimize its own model, a process called "learning". In this way, AI can find the relationships underlying experiments in large amounts of data, form a new model with better prediction and decision-making ability, and arrive at an optimized strategy. The characteristics of chemical research match the strengths of machine learning. Chemical research often faces very complex material systems and experimental processes, so it is difficult to analyze them accurately and make judgments from the principles of physical chemistry alone. Artificial intelligence can mine the correlations in the massive experimental data generated in chemical experiments, help chemists make reasonable analyses and predictions, and thereby greatly accelerate chemical research. This review presents modern artificial intelligence methods and the basic principles of using them to solve chemical problems, illustrated by representative examples and the specific machine learning algorithms involved. The application of artificial intelligence in chemical science is rising rapidly, and AI has already begun to show its power as an aid to chemical research. We hope this review can help more chemists in China understand and use this powerful tool.
