Acta Chimica Sinica, 2022, Vol. 80, Issue 5: 614-624. DOI: 10.6023/A22010031

Research Article

High-Throughput Computational Screening of Metal-Organic Frameworks for CH4/H2 Separation by Synergizing Machine Learning and Molecular Simulation

Shihui Wang, Xiaoyu Xue, Min Cheng, Shaochen Chen, Chong Liu, Li Zhou, Kexin Bi, Xu Ji*

  1. School of Chemical Engineering, Sichuan University, Chengdu 610065
  • Received: 2022-01-16 Published: 2022-05-31
  • Contact: Xu Ji
  • Supported by:
    Young Scientists Fund of the National Natural Science Foundation of China (22108178)

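Against the backdrop of reducing CO2 emissions and achieving carbon neutrality, metal-organic frameworks (MOFs) show broad application prospects in the clean energy field. A hierarchical screening strategy that synergizes machine learning and molecular simulation is proposed to identify, quickly and accurately, the adsorbents with the best CH4/H2 separation performance among 134185 hypothetical MOFs. First, based on the structural properties of the MOFs, adsorbents with unsuitable pore size or volumetric surface area were screened out, which reduced the number of MOFs to 62278. Next, 10% of the MOFs were sampled and, with hybrid structural and chemical descriptors as features, random forest was used to build models that predict the adsorbent performance score (APS) for CH4 in the pressure swing adsorption and vacuum swing adsorption processes, respectively. Compared with other model-building strategies, the models built with this strategy have the best predictive accuracy and can be used to predict the performance of the remaining MOFs. The MOFs were then ranked by predicted APS, the top 1000 were selected, and molecular simulation was used to correct the predictions, which identified the 10 best MOFs. Finally, the importance of the descriptors was interpreted, revealing that the model has the potential to transfer between different operating scenarios and providing an efficient route for developing high-performance MOF screening methods applicable to multiple operating scenarios.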

In this work, a hierarchical screening strategy that synergizes machine learning (ML) and molecular simulation is proposed to identify the optimal adsorbents for CH4/H2 separation from 134185 hypothetical metal-organic frameworks (MOFs). In the initial screening, MOFs with inappropriate pore size and/or volumetric surface area were removed from the database, leaving 62278 MOFs. Of these, 10% were randomly chosen, and grand canonical Monte Carlo (GCMC) simulations were performed to calculate the adsorption behavior of the CH4/H2 mixture in them under vacuum swing adsorption (VSA) and pressure swing adsorption (PSA) conditions. The structural and chemical descriptors and the corresponding adsorbent performance scores (APS) of the selected MOFs were then used to develop random forest (RF) models for the VSA and PSA processes. Compared with other ML algorithms, including support vector machine, k-nearest neighbor, decision tree, and artificial neural network, the RF models exhibit the best predictive accuracy. Both the combination of structural and chemical descriptors and the preliminary screening step improve the accuracy of the RF models. The RF models were therefore used to predict the APS values of the remaining 90% of the MOFs in the next screening stage, and the top 1000 candidates were selected according to the predictions. GCMC simulations were subsequently carried out on the top candidates to refine the predictions, yielding the ten MOFs with the best CH4/H2 separation performance under VSA and PSA conditions, respectively. The high performance of the optimal MOFs was verified by comparison with well-studied MOFs from the literature. Finally, the feature importance of the descriptors was interpreted using Shapley Additive Explanations. The results reveal that the developed model has the potential to transfer between the two operating conditions because the dominant descriptors are consistent, which also provides an efficient pathway for rapid screening of promising MOF adsorbents for CH4/H2 separation under different operating scenarios.
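The screening funnel described in the abstract can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions, not the authors' implementation: the record layout (`MOF`), the function names, and the `keep`, `simulate`, and `fit` callables are all hypothetical placeholders. In the paper, `simulate` would correspond to a GCMC calculation of the APS and `fit` to training a random forest; the prescreening thresholds are not given in the abstract, so they are left to the caller. The sketch also assumes that only the MOFs outside the training sample are ranked.

```python
import random
from typing import Callable, Dict, List, Sequence

MOF = Dict[str, float]            # hypothetical record: descriptor name -> value
Scorer = Callable[[MOF], float]   # maps one MOF to a performance score (e.g. APS)


def prescreen(mofs: Sequence[MOF], keep: Callable[[MOF], bool]) -> List[MOF]:
    """Stage 1: drop MOFs whose structural properties fail the caller's test."""
    return [m for m in mofs if keep(m)]


def hierarchical_screen(
    mofs: Sequence[MOF],
    simulate: Scorer,                                       # expensive reference score
    fit: Callable[[List[MOF], List[float]], Scorer],        # trains a cheap surrogate
    train_fraction: float = 0.10,
    top_n: int = 1000,
    final_n: int = 10,
    seed: int = 0,
) -> List[MOF]:
    """Stages 2-4: sample, train a surrogate, rank the rest, refine the top."""
    rng = random.Random(seed)
    shuffled = list(mofs)
    rng.shuffle(shuffled)

    # Stage 2: simulate a random sample and train the surrogate model on it.
    n_train = max(1, round(train_fraction * len(shuffled)))
    train, rest = shuffled[:n_train], shuffled[n_train:]
    model = fit(train, [simulate(m) for m in train])

    # Stage 3: rank the remaining MOFs by predicted score and keep the top_n.
    candidates = sorted(rest, key=model, reverse=True)[:top_n]

    # Stage 4: re-score the shortlist with the expensive method and keep the best.
    return sorted(candidates, key=simulate, reverse=True)[:final_n]
```

The point of the design is that the expensive scorer runs only on the training sample and the shortlist, while the surrogate covers everything in between.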

Key words: metal-organic frameworks, CH4/H2 separation, molecular simulation, machine learning, interpretability
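Shapley Additive Explanations, mentioned in the abstract, attribute a model's prediction to its input descriptors using Shapley values. The toy sketch below computes exact Shapley values by enumerating all descriptor subsets, which is feasible only for a handful of features. It is meant to show what the attribution measures and makes several assumptions: the function name and the descriptor names in the usage example are hypothetical, "absent" descriptors are replaced by a single baseline value rather than averaged over a background dataset, and it is not the tree-specific algorithm typically applied to random forests.

```python
from itertools import combinations
from math import factorial
from typing import Callable, Dict


def shapley_values(
    model: Callable[[Dict[str, float]], float],
    x: Dict[str, float],
    baseline: Dict[str, float],
) -> Dict[str, float]:
    """Exact Shapley value of each feature for the prediction model(x)."""
    names = list(x)
    n = len(names)

    def value(subset: frozenset) -> float:
        # Features in `subset` take their value from x, the others from baseline.
        point = {k: (x[k] if k in subset else baseline[k]) for k in names}
        return model(point)

    phi: Dict[str, float] = {}
    for name in names:
        others = [k for k in names if k != name]
        total = 0.0
        for r in range(n):
            # Weight of a coalition of size r that does not contain `name`.
            weight = factorial(r) * factorial(n - r - 1) / factorial(n)
            for combo in combinations(others, r):
                s = frozenset(combo)
                total += weight * (value(s | {name}) - value(s))
        phi[name] = total
    return phi


# Usage with a made-up scoring function and made-up descriptor names.
toy_model = lambda d: 2 * d["pore_size"] + 3 * d["surface_area"] + d["pore_size"] * d["metal_type"]
x = {"pore_size": 1.0, "surface_area": 2.0, "metal_type": 3.0}
baseline = {k: 0.0 for k in x}
phi = shapley_values(toy_model, x, baseline)
```

By construction the attributions sum to the difference between the prediction at `x` and the prediction at the baseline, which is what makes descriptor importances comparable across models trained for different operating conditions.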