Acta Chimica Sinica ›› 2020, Vol. 78 ›› Issue (5): 427-436. DOI: 10.6023/A20030065

Research Article

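Screening the Methane/Ethane/Propane Separation Performance of Metal-Organic Frameworks Based on Machine Learning and High-Throughput Computation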

蔡铖智, 李丽凤, 邓小梅, 李树华, 梁红, 乔智威   

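  1. Institute of Energy and Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou 510006
  • Submitted: 2020-03-13 Published: 2020-04-16
  • Corresponding author: Qiao Zhiwei E-mail: zqiao@gzhu.edu.cn
  • Funding:
    Project supported by the National Natural Science Foundation of China (Nos. 21978058, 21676094, 21576058) and the Natural Science Foundation of Guangdong Province (No. 2020A1515010800).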

Machine Learning and High-throughput Computational Screening of Metal-organic Framework for Separation of Methane/ethane/propane

Cai Chengzhi, Li Lifeng, Deng Xiaomei, Li Shuhua, Liang Hong, Qiao Zhiwei   

  1. Guangzhou Key Laboratory for New Energy and Green Catalysis, School of Chemistry and Chemical Engineering, Guangzhou University, Guangzhou 510006
  • Received: 2020-03-13 Published: 2020-04-16
  • Supported by:
    Project supported by the National Natural Science Foundation of China (Nos. 21978058, 21676094, 21576058) and the Natural Science Foundation of Guangdong Province (No. 2020A1515010800).

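To address the difficulty of separating methane, ethane and propane (C1, C2, C3) in natural gas, this work used high-throughput computation to evaluate the adsorption separation performance of 137953 hypothetical metal-organic frameworks (MOFs) for a mixture of the three gases. To avoid competitive adsorption of water vapor, 31399 hydrophobic MOFs were first screened out. Univariate analysis then related six MOF structural/energetic descriptors, namely the largest cavity diameter (LCD), void fraction (φ), volumetric surface area (VSA), Henry coefficient (K), heat of adsorption (Qst) and density (ρ), to the selectivity, the adsorption capacity and the trade-off between the two (trade-off between Si/j and Ni, TSN) for C1, C2 and C3. This analysis revealed "second peaks" in adsorption capacity and selectivity; for the separation of C1 and C2 in particular, all the optimal MOFs lie within the second-peak range. Four machine learning algorithms, namely decision tree, random forest (RF), support vector machine and back-propagation neural network, were then trained to predict the relationships between the six MOF descriptors and the performance indicators, and RF gave the best predictions. RF was next used to quantify descriptor importance: K, LCD and ρ have the highest relative importance for the TSN of C1 and of C2, whereas K, Qst and ρ are the most important for the TSN of C3. Based on these descriptors, decision-tree paths were designed for the optimal MOFs adsorbing C1, C2 and C3, respectively. Finally, 18 optimal MOFs were identified for the different separation applications of C1, C2 and C3. The research approach combining machine learning with high-throughput computation, the discovery of the second-peak pattern and the proposed optimal design routes offer guidance for the development of MOFs in adsorption separation.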

Keywords: metal-organic framework, gas separation, molecular simulation, machine learning

In this work, the separation performance of 137953 hypothetical metal-organic frameworks (MOFs) for methane/ethane/propane (C1, C2 and C3) mixtures was evaluated by high-throughput computational screening and multiple machine learning (ML) algorithms. First, to avoid the competitive adsorption of water vapor, 31399 hydrophobic MOFs (hMOFs) were screened out. Then, grand canonical Monte Carlo (GCMC) simulations were employed to calculate the adsorption behavior of a mixture with a mole ratio of C1:C2:C3 = 7:2:1 in these hMOFs. Second, the relationships between six MOF structural/energetic descriptors (largest cavity diameter (LCD), void fraction (φ), volumetric surface area (VSA), Henry coefficient (K), heat of adsorption (Qst) and MOF density (ρ)) and three performance indicators (selectivity (S) and adsorption capacity (N) for C1, C2 and C3, and their trade-off (TSN)) were established. The LCDs were calculated with the Zeo++ software, the VSAs with the RASPA software using He and N2 as probes, and Qst and K for each gas molecule at infinite dilution using the NVT-MC method in RASPA. "Second peaks" of N and S were found in some of the structure-property relationships, and all the optimal MOFs were located within the range of the second peaks, especially for the separation of C1 or C2. Third, four ML algorithms, namely decision tree, random forest (RF), support vector machine and back-propagation neural network, were trained and tested to predict the three performance indicators from the six MOF descriptors. Although the prediction accuracy for selectivity was low, introducing TSN significantly improved the accuracy of ML prediction, especially for the RF algorithm (R = 0.99).
Therefore, RF was used to quantitatively analyze the relative importance of each MOF descriptor. Three descriptors (K, LCD and ρ) were the most important for the separation of C1 and C2, and three others (K, Qst and ρ) for the separation of C3. Moreover, three simple and clear paths to optimal MOFs for C1, C2 and C3 adsorption were designed from these descriptors using the decision tree model. Following these paths gives a 96%, 85% and 95% probability, respectively, of finding high-performance MOFs. Finally, the 18 best MOFs were identified for different separation applications of C1, C2 and C3. This study reveals the second peaks and the key MOF descriptors governing the adsorption of light alkanes, develops quantitative structure-property relationships by ML, and identifies the best adsorbents from a large collection of MOFs for the separation of C1, C2 and C3 from natural gas.
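
The abstract names selectivity (S), adsorption capacity (N) and their trade-off (TSN) as performance indicators but does not give their formulas. The sketch below illustrates how such indicators could be computed for the stated C1:C2:C3 = 7:2:1 mixture. It rests on two assumptions that are not taken from the abstract: S is taken as the usual adsorption selectivity of one component over the other two combined, and TSN is taken as N × ln(S). The uptake values and all function names are hypothetical.

```python
import math

# Bulk-phase mole fractions for the C1:C2:C3 = 7:2:1 mixture stated in the abstract.
BULK = {"C1": 0.7, "C2": 0.2, "C3": 0.1}


def selectivity(uptake, target, bulk=BULK):
    """Adsorption selectivity of `target` over the remaining components.

    ASSUMED definition (not given in the abstract):
        S = (N_target / sum(N_others)) / (y_target / sum(y_others))
    where N is the adsorbed amount and y the bulk mole fraction.
    """
    n_rest = sum(n for gas, n in uptake.items() if gas != target)
    y_rest = sum(y for gas, y in bulk.items() if gas != target)
    return (uptake[target] / n_rest) / (bulk[target] / y_rest)


def trade_off(uptake, target, bulk=BULK):
    """Trade-off between selectivity and capacity.

    ASSUMED form (not given in the abstract): TSN = N_target * ln(S).
    """
    return uptake[target] * math.log(selectivity(uptake, target, bulk))


# Hypothetical uptakes in mol/kg for one MOF; illustrative numbers only.
example_uptake = {"C1": 0.5, "C2": 1.0, "C3": 2.5}

if __name__ == "__main__":
    for gas in ("C1", "C2", "C3"):
        s = selectivity(example_uptake, gas)
        tsn = trade_off(example_uptake, gas)
        print(f"{gas}: S = {s:.3f}, TSN = {tsn:.3f}")
```

Under these assumptions, a MOF that adsorbs the target gas less readily than the bulk composition would suggest has S < 1 and therefore a negative TSN, so ranking by TSN penalizes both low capacity and low selectivity.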

Key words: metal-organic framework, gas separation, molecular simulation, machine learning