Acta Chimica Sinica ›› 2024, Vol. 82 ›› Issue (2): 138-145. DOI: 10.6023/A23110496

Research Article

Distributed Quantum Chemical Calculation Combined with Fault-Tolerant Coding

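Ning Li (a,b), Lina Xu (b), Guoyong Fang (b,*), Yingjin Ma (a,*)

  1. a Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190
    b College of Chemistry and Materials Engineering, Wenzhou University, Wenzhou 325035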
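  • Received: 2023-11-13 Published: 2024-01-23
  • Supported by:
    National Natural Science Foundation of China (22173114); National Natural Science Foundation of China (22333003); Strategic Priority Research Program of Chinese Academy of Sciences (XDB0500001); Youth Innovation Promotion Association of Chinese Academy of Sciences (2022168); Network and Information Foundation of Chinese Academy of Sciences (CAS-WX2021SF-0103-02); Project of Computer Network Information Center, Chinese Academy of Sciences (CNIC20230201)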

Fault-tolerant Coded Quantum Chemical Distributed Calculation

Ning Li (a,b), Lina Xu (b), Guoyong Fang (b,*), Yingjin Ma (a,*)

  1. a Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190
    b College of Chemistry and Materials Engineering, Wenzhou University, Wenzhou 325035
  • Received: 2023-11-13 Published: 2024-01-23
  • Contact: E-mail: fanggy@wzu.edu.cn (13868600593); yingjin.ma@sccas.cn (13261370353)
  • Supported by:
    National Natural Science Foundation of China (22173114); National Natural Science Foundation of China (22333003); Strategic Priority Research Program of Chinese Academy of Sciences (XDB0500001); Youth Innovation Promotion Association of Chinese Academy of Sciences (2022168); Network and Information Foundation of Chinese Academy of Sciences (CAS-WX2021SF-0103-02); Project of Computer Network Information Center, Chinese Academy of Sciences (CNIC20230201)

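With the rise of frontier applications such as large-scale simulation and machine learning, distributed computing has become an increasingly important tool for computational research. However, because of the software and hardware limitations that come with using many nodes, its application in scientific computing, machine learning, and related fields still faces problems. In this work, coded distributed computing is applied to quantum chemistry. Drawing on the gradient coding scheme, it addresses the straggler-node problem in distributed quantum chemical calculations and also adds an automatic error-correction capability, reducing the manpower and resources consumed during calculation, with the aim of achieving automated fault-tolerant quantum chemical calculation. In addition, the idea of coded multiplexing is proposed, which allows more computing resources to be used simply and effectively for distributed calculation at a set fault tolerance. Finally, the scheme is applied to calculating the binding energy of the P38 protein with a ligand, and the results from coded calculation are compared with the true results to verify the accuracy of the scheme and its potential for automated fault-tolerant quantum chemical calculation.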

Key words: quantum chemical calculation, coded computing, gradient coding, fault-tolerant, automation, binding energy

With the rise of cutting-edge applications such as large-scale simulation and machine learning, distributed computing has become an increasingly important tool for computational research. However, the hardware and software limitations that come with using many nodes still cause problems when it is applied to scientific computing, machine learning, and other fields. In this paper, we apply coded distributed computing to quantum chemistry. Drawing on the gradient coding scheme, we address the straggler-node problem in distributed quantum chemical calculations and give such calculations an automatic error-correction capability, which reduces the manpower and resources consumed during computation, with a view to realizing automated fault-tolerant quantum chemical calculations. We also propose coded multiplexing, a simple and effective way to use more computational resources for distributed calculation at a set fault-tolerance level. We applied the scheme to the binding energy of the P38 protein and a ligand, manually assigning four computational nodes to each fragment and allowing one straggler node. Compared with the true results, the error of the coded calculations was negligible, and correct results were obtained even when one node dropped out or returned a miscalculated result. To test whether the scheme suits larger-scale distributed quantum chemical calculations, we further selected 10 fragments at random and, using coded multiplexing, computed them on 40 nodes simultaneously; the results were again very accurate.
Finally, we calculated the binding energy of the P38 protein to the ligand; the result is consistent with previous literature, demonstrating the accuracy of the scheme and its potential for automated fault-tolerant quantum chemical calculations.

Key words: quantum chemical calculation, coded computing, gradient coding, fault-tolerant, automation, binding energy
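
The setting described in the abstract (four nodes per fragment, one tolerated straggler) can be illustrated with a minimal sketch of gradient coding. The sketch assumes the fractional-repetition construction, which is one known gradient code and not necessarily the encoding matrix used in the paper. The matrix `B`, the functions `encode`, `decode`, and `demo`, and the random "partial results" are all hypothetical stand-ins for a fragment's sub-task energies. The loop in `demo` reuses the same code for 10 fragments, which is one possible reading of coded multiplexing (10 fragments on 4 nodes each, 40 nodes in total).

```python
import numpy as np

# Hypothetical encoding matrix (rows = worker nodes, columns = sub-tasks).
# Fractional-repetition gradient code for n = 4 workers and s = 1 straggler:
# workers 0 and 2 hold sub-tasks {0, 1}; workers 1 and 3 hold sub-tasks {2, 3}.
B = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 1.0],
])


def encode(partials, B):
    """Each worker returns the B-weighted sum of the partial results it holds."""
    return B @ np.asarray(partials, dtype=float)


def decode(worker_outputs, finished, B):
    """Recover sum(partials) using only the workers listed in `finished`."""
    B_s = B[finished]
    ones = np.ones(B.shape[1])
    # Find coefficients a with a @ B_s == (1, ..., 1); lstsq returns the
    # minimum-norm solution when the system is rank-deficient.
    a, *_ = np.linalg.lstsq(B_s.T, ones, rcond=None)
    if not np.allclose(a @ B_s, ones):
        raise ValueError("too many stragglers: the sum cannot be decoded")
    return float(a @ np.asarray(worker_outputs, dtype=float)[finished])


def demo(n_fragments=10, seed=0):
    """Reuse the same code for every fragment; drop one random node per fragment."""
    rng = np.random.default_rng(seed)
    n_workers, n_parts = B.shape
    max_err = 0.0
    for _ in range(n_fragments):
        partials = rng.normal(size=n_parts)   # stand-in sub-task results
        outputs = encode(partials, B)
        straggler = int(rng.integers(n_workers))
        finished = [w for w in range(n_workers) if w != straggler]
        err = abs(decode(outputs, finished, B) - partials.sum())
        max_err = max(max_err, err)
    return max_err
```

In this sketch a node that fails or returns an unusable result is simply treated as a straggler and left out of `finished`; identifying a silently wrong result would need a check that is not shown here.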