Acta Chimica Sinica, 2024, Vol. 82, Issue 4: 387-395. DOI: 10.6023/A23100473
Research Article
Received: 2023-10-27
Published: 2024-01-05
Junqing Li, Qianxi Song, Ziyi Liu, Dongqi Wang*
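In recent years, boron-containing materials have attracted growing attention in fields such as new energy and catalysis. However, the development of high-value-added boron-containing materials still faces high technical barriers. An in-depth study of the correlations among the microscopic properties of these materials is therefore urgently needed to advance the development of high-end boron-containing materials. Responding to the shift in materials research from the traditional trial-and-error approach to a data-driven paradigm, this work explores the application of several important machine learning algorithms to band gap prediction for boron-containing materials, using feature selection, grid search optimization, and feature importance analysis. The results show that the band gap prediction model based on the random forest algorithm reaches a coefficient of determination (R2) of 0.84. They also show that the total magnetization feature of boron-containing materials has a significant negative correlation with the band gap: the smaller the total magnetization of a material, the larger its band gap. This work indicates that machine learning methods can be used for the targeted design of boron-containing materials with a specific band gap. The results further indicate that random forest, as an ensemble learning model, has good learning ability and stable predictive performance. It can be applied to predicting band gaps and other properties in other material systems, which accelerates the design and optimization of material properties and is of scientific significance for the rapid screening and high-performance prediction of new functional materials.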
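The workflow named in the abstract (grid search optimization, a random forest regressor, and feature importance analysis) can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy are available. The data are synthetic and the hyperparameter grid is invented for illustration; neither is taken from the paper, whose actual dataset and settings are not shown here.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in data (NOT the paper's dataset): 300 samples, 5 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))
# Hypothetical target: falls as feature 0 rises, plus a nonlinear term and noise.
y = -2.0 * X[:, 0] + np.sin(X[:, 1]) + 0.1 * rng.normal(size=300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Illustrative hyperparameter grid; these values are assumptions.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

# Exhaustive search over the grid with 3-fold cross-validation, scored by R2.
search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    scoring="r2",
    cv=3,
)
search.fit(X_train, y_train)

# Evaluate the refitted best model on the held-out test split.
best_model = search.best_estimator_
y_pred = best_model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

# Impurity-based importances, one per feature, summing to 1.
importances = best_model.feature_importances_

print("best parameters:", search.best_params_)
print(f"MAE={mae:.2f}  MSE={mse:.2f}  R2={r2:.2f}")
print("feature importances:", np.round(importances, 3))
```

In this synthetic setup feature 0 dominates the target, so it should receive the largest importance. Impurity-based importances give only a magnitude, so the sign of a relationship (such as the negative correlation reported for total magnetization) has to come from a separate analysis.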
Junqing Li, Qianxi Song, Ziyi Liu, Dongqi Wang. Machine Learning for Predicting Band Gap in Boron-containing Materials[J]. Acta Chimica Sinica, 2024, 82(4): 387-395.
Features | Descriptions |
---|---|
total magnetization | total magnetization of the material |
dens | density of the material |
energy | energy of the material |
energy per atom | energy of the material per atom |
formation energy per atom | formation energy per atom |
e above hull | energy above the convex hull, a measure of phase stability (energy difference between the phase and the equilibrium phase) |
minimum Number | minimum atomic number of the elements in the compound |
minimum Mendeleev Number | minimum Mendeleev number of the elements in the compound |
maximum Melting T | maximum melting temperature among the elements in the compound |
maximum Nd Valence | maximum number of filled d valence orbitals |
mean Np Unfilled | mean number of unfilled p valence orbitals |
maximum Nd Unfilled | maximum number of unfilled d valence orbitals |
maximum Nf Unfilled | maximum number of unfilled f valence orbitals |
maximum Space Group Number | maximum space group number of the ground-state structure at 0 K |
minimum oxidation state | minimum oxidation state of the elements in the compound |
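Several feature names in the table ("minimum Number", "maximum Nd Unfilled", "mean Np Unfilled") read like statistics taken over the properties of a compound's constituent elements. The sketch below illustrates that idea under stated assumptions: the convention itself is assumed rather than confirmed by the text here, the helper names `parse_formula` and `composition_stats` are hypothetical, the mean is assumed to be weighted by atom count, and "np_unfilled" is assumed to mean 6 minus the number of p valence electrons. The tiny lookup table covers only four elements and is not a substitute for a real descriptor library.

```python
import re

# Small illustrative lookup. Atomic numbers are standard values;
# "np_unfilled" = 6 - (number of p valence electrons) is an assumed convention.
ELEMENTS = {
    "B": {"number": 5, "np_unfilled": 5},
    "C": {"number": 6, "np_unfilled": 4},
    "N": {"number": 7, "np_unfilled": 3},
    "O": {"number": 8, "np_unfilled": 2},
}


def parse_formula(formula):
    """Return {element: atom count} for a simple formula without parentheses."""
    counts = {}
    for symbol, digits in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[symbol] = counts.get(symbol, 0) + (int(digits) if digits else 1)
    return counts


def composition_stats(formula, prop):
    """Minimum, maximum, and atom-count-weighted mean of an elemental property."""
    counts = parse_formula(formula)
    values = [ELEMENTS[el][prop] for el in counts]
    total_atoms = sum(counts.values())
    weighted_sum = sum(ELEMENTS[el][prop] * n for el, n in counts.items())
    return {
        "minimum": min(values),
        "maximum": max(values),
        "mean": weighted_sum / total_atoms,
    }


print(composition_stats("BN", "number"))         # minimum atomic number is 5 (boron)
print(composition_stats("B2O3", "np_unfilled"))  # mean = (2*5 + 3*2) / 5
```

For BN the minimum atomic number is 5, and for B2O3 the weighted mean of unfilled p slots is 16/5 = 3.2 under the assumed convention.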
Models | MAE (145 features) | MSE (145 features) | R2 (145 features) | MAE (15 features) | MSE (15 features) | R2 (15 features) |
---|---|---|---|---|---|---|
Lasso | 1.24 | 2.38 | 0.31 | 1.38 | 2.65 | 0.20 |
Bayesian Ridge | 0.87 | 1.48 | 0.57 | 1.10 | 1.96 | 0.41 |
Elastic Net | 1.17 | 2.18 | 0.37 | 1.34 | 2.57 | 0.22 |
Support Vector Regression | 1.38 | 2.92 | 0.15 | 1.50 | 3.25 | 0.02 |
Decision Tree Regression | 0.67 | 1.22 | 0.65 | 0.65 | 1.02 | 0.69 |
Gradient Boosting Regression | 0.58 | 0.66 | 0.81 | 0.61 | 0.70 | 0.79 |
AdaBoost Regression | 0.91 | 1.21 | 0.65 | 0.89 | 1.22 | 0.63 |
Random Forest Regression | 0.51 | 0.58 | 0.83 | 0.51 | 0.57 | 0.84 |
Extra Tree Regression | 0.67 | 1.20 | 0.65 | 0.63 | 1.07 | 0.68 |
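The three metrics in the table above have standard definitions: mean absolute error (MAE), mean squared error (MSE), and the coefficient of determination (R2). The following dependency-free sketch writes them out so the columns can be read precisely. The input values are made up for illustration and are not taken from the paper.

```python
def mae(y_true, y_pred):
    """Mean absolute error: average of |true - predicted|."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error: average of (true - predicted)^2."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up values for illustration only.
y_true = [0.0, 1.0, 2.0, 3.0]
y_pred = [0.5, 1.0, 1.5, 3.0]

print(mae(y_true, y_pred))  # 0.25
print(mse(y_true, y_pred))  # 0.125
print(r2(y_true, y_pred))   # 0.9
```

Lower MAE and MSE and higher R2 indicate a better fit, which is the sense in which the random forest row stands out in the table.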
Models / Number of elemental species | 2 | 3 | 4 | 5 | 6 | 7 | 8 |
---|---|---|---|---|---|---|---|
Linear Regression | 0.42 | 0.44 | 0.46 | 0.26 | 0.33 | -0.01 | -2.15 |
Lasso | -0.07 | 0.16 | 0.23 | -0.06 | 0.05 | -1.14 | -1.2 |
Bayesian Ridge | 0.42 | 0.45 | 0.46 | 0.25 | 0.33 | 0 | -2.09 |
Elastic Net | -0.11 | 0.2 | 0.28 | -0.1 | 0.1 | -1.52 | -0.83 |
Support Vector Regression | -0.14 | 0.02 | 0.01 | -0.18 | -0.25 | -0.72 | -1.43 |
Decision Tree Regression | 0.55 | 0.76 | 0.78 | 0.47 | 0.5 | 0.6 | -4.75 |
Gradient Boosting Regression | 0.64 | 0.81 | 0.81 | 0.73 | 0.57 | 0.07 | -2.48 |
AdaBoost Regression | 0.45 | 0.62 | 0.66 | 0.53 | 0.43 | -0.65 | -2.65 |
Random Forest Regression | 0.72 | 0.84 | 0.86 | 0.73 | 0.68 | 0.37 | -1.9 |
Extra Tree Regression | 0.25 | 0.77 | 0.63 | 0.42 | -0.29 | 0.84 | -0.46 |
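The extracted table does not name its metric. Assuming the entries are R2 scores computed separately for compounds grouped by number of elemental species, the negative entries have a simple reading: within a group, the model's squared error exceeds that of always predicting the group mean. The sketch below shows how such a per-group score could be computed. The records are invented, the helper `count_species` is hypothetical, and counting species by matching element symbols in the formula string is an assumption about how the grouping might be done.

```python
import re
from collections import defaultdict


def count_species(formula):
    """Count distinct element symbols in a formula string (assumed approach)."""
    return len(set(re.findall(r"[A-Z][a-z]?", formula)))


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Invented records: (formula, true value, predicted value). Not from the paper.
records = [
    ("BN", 4.0, 4.5),
    ("BP", 1.0, 1.5),
    ("B2O3", 6.0, 5.0),
    ("LiBH4", 3.0, 1.0),
    ("NaBH4", 4.0, 6.0),
]

# Group (true, predicted) pairs by the number of elemental species.
groups = defaultdict(list)
for formula, y_true, y_pred in records:
    groups[count_species(formula)].append((y_true, y_pred))

# Score each group on its own.
scores = {}
for n_species, pairs in sorted(groups.items()):
    truths = [t for t, _ in pairs]
    preds = [p for _, p in pairs]
    scores[n_species] = r2(truths, preds)
    print(n_species, round(scores[n_species], 2))
```

In this toy data the three-species group has only two samples with nearly equal true values, so modest prediction errors push its score far below zero. Small or narrow groups may behave similarly in practice, which is one possible reading of the large negative values in the rightmost columns.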