Interatomic potentials are a fundamental ingredient for studying material properties at the atomic scale, and their accuracy directly determines the reliability of computational simulations. Machine learning (ML) techniques have driven the development of interatomic potentials that combine the accuracy of first-principles methods with the low cost and parallel efficiency of empirical potentials. However, machine-learning interatomic potentials (MLIAPs) struggle to achieve broad transferability: they cannot deliver consistently accurate performance on configurations that differ from those used during training.
To balance the accuracy and transferability of interatomic potentials, David Montes de Oca Zapiain and colleagues at Sandia National Laboratories developed a scalable, user-agnostic, data-driven entropy-maximization (EM) protocol. Building on the idea of optimizing the entropy of descriptor distributions, they reworked the training-set generation process to produce, in a fully automated manner, a very large (>2×10⁵ configurations, >7×10⁶ atomic environments) and diverse tungsten dataset. Multiple polynomial and neural-network interatomic potentials were then trained on the entropy-optimized dataset, and a corresponding set of potentials was trained on a domain-expertise (DE) curated tungsten dataset. Compared with the DE-trained models, the EM-trained models consistently and accurately capture the large, diverse training set: at the cost of a relatively small loss in accuracy, they gain far stronger transferability, effectively balancing accuracy and transferability and avoiding the pitfalls of extrapolation. Because the EM approach is inherently scalable, fully automated, and independent of human input, accurate and transferable MLIAPs can be obtained by generating very large, diverse, purpose-designed training sets. By contrast, potentials trained on the conventional DE dataset are slightly more accurate on configurations similar to those in the training set, but their accuracy drops significantly when evaluated on out-of-sample configurations. The automated protocol for generating diverse training sets proposed in this work can also provide guidance for data-sparse machine-learning applications in characterizing the accuracy of the resulting models.
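To make the entropy-maximization idea more concrete, the following is a minimal, illustrative sketch (not the authors' code) of how descriptor-distribution entropy can be used to rank and select candidate configurations. It assumes per-configuration descriptor matrices (for example, bispectrum-type descriptors) have already been computed, and it uses a Gaussian differential-entropy surrogate and a greedy selection loop; the candidate pool, descriptor dimension, and selection strategy are assumptions for illustration only.

```python
# Illustrative sketch of entropy-maximized training-set selection (assumed workflow,
# not the paper's implementation). Each candidate configuration is represented by an
# (n_atoms, n_descriptors) array of per-atom descriptors.
import numpy as np

def gaussian_entropy(descriptors):
    """Differential entropy of a Gaussian fitted to the pooled per-atom descriptors."""
    d = descriptors.shape[1]
    cov = np.cov(descriptors, rowvar=False) + 1e-8 * np.eye(d)  # regularized covariance
    _, logdet = np.linalg.slogdet(2.0 * np.pi * np.e * cov)
    return 0.5 * logdet

def select_entropy_maximizing(candidates, n_select):
    """Greedily pick configurations whose atomic environments most increase the
    entropy of the descriptor distribution of the selected set."""
    selected, pool = [], list(range(len(candidates)))
    stacked = np.empty((0, candidates[0].shape[1]))
    for _ in range(n_select):
        best_gain, best_idx = -np.inf, None
        for i in pool:
            trial = np.vstack([stacked, candidates[i]])
            if trial.shape[0] <= trial.shape[1]:
                # too few rows for a stable covariance: fall back to descriptor spread
                gain = np.linalg.norm(np.std(trial, axis=0))
            else:
                gain = gaussian_entropy(trial)
            if gain > best_gain:
                best_gain, best_idx = gain, i
        selected.append(best_idx)
        stacked = np.vstack([stacked, candidates[best_idx]])
        pool.remove(best_idx)
    return selected

# Usage with synthetic descriptors: 50 candidate configurations of 32 atoms,
# 6 descriptor components each.
rng = np.random.default_rng(0)
candidates = [rng.normal(scale=s, size=(32, 6)) for s in rng.uniform(0.5, 3.0, size=50)]
chosen = select_entropy_maximizing(candidates, n_select=10)
print("selected configurations:", chosen)
```

In the paper's protocol the entropy objective drives the automated generation of configurations themselves at a much larger scale; the sketch above only shows how an entropy measure over descriptor distributions can favor diverse atomic environments over redundant ones.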
The paper was recently published in npj Computational Materials 8, 189 (2022). The English title and abstract are reproduced below; the PDF of the paper is freely available from the journal.
Training data selection for accuracy and transferability of interatomic potentials
David Montes de Oca Zapiain, Mitchell A. Wood, Nicholas Lubbers, Carlos Z. Pereyra, Aidan P. Thompson & Danny Perez
Advances in machine learning (ML) have enabled the development of interatomic potentials that promise the accuracy of first principles methods and the low-cost, parallel efficiency of empirical potentials. However, ML-based potentials struggle to achieve transferability, i.e., provide consistent accuracy across configurations that differ from those used during training. In order to realize the promise of ML-based potentials, systematic and scalable approaches to generate diverse training sets need to be developed. This work creates a diverse training set for tungsten in an automated manner using an entropy optimization approach. Subsequently, multiple polynomial and neural network potentials are trained on the entropy-optimized dataset. A corresponding set of potentials are trained on an expert-curated dataset for tungsten for comparison. The models trained to the entropy-optimized data exhibited superior transferability compared to the expert-curated models. Furthermore, the models trained to the expert-curated set exhibited a significant decrease in performance when evaluated on out-of-sample configurations.