Acta Chimica Sinica    

Article

Transfer Learning Predicted the Self-Diffusion Coefficients of Light-Gas in Metal/Covalent Organic Frameworks

Tianzi Peng(a,#), Jiake Shen(a,#), Shuya Guo(c), Xiaoxiao Xia(*,d), Wei Li(*,a,b)

  1. (a) School of International Energy, Jinan University, Zhuhai 519070, China;
    (b) Energy and Electric Power Research Center, Jinan University, Zhuhai 519070, China;
    (c) Guangdong Provincial Key Laboratory of Supramolecular Coordination Chemistry, College of Chemistry and Materials Science, Jinan University, Guangzhou 510632, China;
    (d) Naval University of Engineering, Wuhan 430033, China
  • Received: 2025-11-27
  • Contact: * E-mail: 2320232062@nue.edu.cn; weili@jnu.edu.cn
  • About author: # These authors contributed equally to this work.
  • Supported by:
    National Natural Science Foundation of China (Grant No. 52306012); Guangdong Basic and Applied Basic Research Foundation (No. 2025A1515012565).

The self-diffusion coefficient of gas molecules within metal/covalent organic frameworks (MOFs/COFs) is a critical physicochemical property that strongly affects their performance in gas storage, separation, and catalysis. Molecular dynamics (MD) simulation is a primary approach for assessing the self-diffusion of light gases in nanoporous materials. With the rapidly growing number of nanoporous materials, machine learning-assisted computational screening, which accelerates the investigation of self-diffusion and the exploration of structure-property relationships, has attracted much attention. However, databases for MOFs are far better developed than those for other nanoporous materials (such as COFs), and the resulting data imbalance hinders the development of machine learning models for those materials, especially for computation-ready experimental (CoRE) databases. Transfer learning (TL) can mitigate this challenge and enhance generalization by importing related information extracted from a well-established database (such as CoRE MOFs). This study employs MD simulations to calculate the self-diffusion coefficients of eight light gases (H2, CH4, H2S, CO2, N2, C2H6, C3H8, C4H10) in the CoRE MOF database and five light gases (H2, CH4, H2S, CO2, N2) in the CoRE COF database. Using descriptors obtained from the nanoporous structures and the gas molecules, three ensemble-based and network-based transfer learning algorithms were trained. Specifically, seven geometric descriptors are obtained from the structure: the largest cavity diameter (LCD), pore limiting diameter (PLD), largest free path diameter (LFPD), density (ρ), unit cell, void fraction (VF), and pore volume (PV). Four chemical descriptors are obtained from the gas molecule: kinetic diameter (Dia), quadrupole moment (Qua), polarizability (Pol), and dipole moment (Dip).
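The abstract does not state how the self-diffusion coefficients were extracted from the MD trajectories. A common route, assumed here purely for illustration, is the Einstein relation, in which D is the long-time slope of the mean squared displacement (MSD) divided by 2d for d dimensions. The function names and the trajectory layout below are hypothetical, and the sketch omits practical steps such as averaging over multiple time origins or discarding the short-time ballistic regime.

```python
def fit_slope(times, values):
    """Ordinary least-squares slope of values against times."""
    n = len(times)
    if n < 2 or n != len(values):
        raise ValueError("need two or more matching (time, value) pairs")
    t_mean = sum(times) / n
    v_mean = sum(values) / n
    cov = sum((t - t_mean) * (v - v_mean) for t, v in zip(times, values))
    var = sum((t - t_mean) ** 2 for t in times)
    if var == 0:
        raise ValueError("time points must not all be identical")
    return cov / var


def mean_squared_displacement(trajectory):
    """MSD relative to the first frame, averaged over molecules.

    Assumed layout: trajectory[f][m] is the unwrapped (x, y, z)
    position of molecule m in frame f.
    """
    origin = trajectory[0]
    msd = []
    for frame in trajectory:
        total = 0.0
        for (x, y, z), (x0, y0, z0) in zip(frame, origin):
            total += (x - x0) ** 2 + (y - y0) ** 2 + (z - z0) ** 2
        msd.append(total / len(origin))
    return msd


def self_diffusion_coefficient(times, msd, dimensions=3):
    """Einstein relation: D = slope(MSD vs. t) / (2 * dimensions)."""
    return fit_slope(times, msd) / (2 * dimensions)
```

The result carries the units of the inputs (for example, Å²/ps if positions are in Å and times in ps), so any conversion to m²/s is left to the caller.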
The Two-Stage TrAdaBoost.R2 algorithm is adopted to adjust the parameters of the ensemble models for transfer learning, whereas a fine-tuning strategy is applied to the neural network. Among them, the light gradient boosting machine (LGBM) was identified as a promising transfer learning model, predicting the self-diffusion coefficient with high accuracy (R² = 0.802). The kinetic diameter and polarizability of the gas molecule and the pore limiting diameter of the nanoporous structure emerge as the dominant descriptors, with relative importances of 14%, 14%, and 12%, respectively; small Dia, small Pol, and large PLD favor diffusion. Furthermore, the transfer learning LGBM model can predict the self-diffusion of three gases (C2H6, C3H8, C4H10) with a Spearman's rank correlation coefficient (SRCC) of 0.821. This work validates the feasibility of transfer learning-assisted high-throughput screening, offering a practical approach for deep learning and cross-material studies of nanoporous materials under data scarcity.
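The two figures of merit quoted above, R² and SRCC, measure different things: R² penalizes errors in the predicted values, while SRCC only checks whether the predicted ranking of materials matches the reference ranking, which is what matters for screening. A minimal standard-library sketch of both metrics follows. The function names are illustrative and this is not the authors' evaluation code.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the group while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

A model can therefore rank materials well (high SRCC) even when its absolute predictions are offset or scaled, a case in which R² would be low.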

Key words: Covalent-Organic Frameworks, Metal-Organic Frameworks, Self-Diffusion Coefficient, Transfer Learning, Molecular Dynamics