Sparsifying MLFF structures
Posted: Wed Jan 24, 2024 1:03 pm
Hello,
I have two questions:
1. In the literature, some authors use on-the-fly training to equilibrate the structure, e.g. at 500 K, and then feed the CONTCAR from that run in as the POSCAR for the actual training at 500 K. I assume this keeps the force field light by showing it only the most probable structures for the given training conditions. How does omitting this step affect the quality of the resulting FF?
2. I assume that including equilibration in the training increases the number of structures in the training set and expands the number of local configurations beyond those most relevant to the system. Sparsification of local configurations with the ML_EPS_LOW or ML_MB tag is a partial fix for this issue. Is there something similar for the structures? To make this clearer: I am training a force field to study a lightly Al-doped system. I trained on the undoped structure at 3 temperatures for 45 ps in total and accumulated ~1200 structures in my training set. After continuing the training on the doped system at the highest training temperature (my POSCAR for this training was not equilibrated), I have 1800 structures. My energy errors are on the order of 10 meV/atom, and force errors are between 0.5 and 1 eV/Å. I want to continue the training, but I am afraid that having many structures in the training set will later reduce the speed of the force field. Is there a way to remove the structures that contribute little to the accuracy, to lighten the FF before training it further?
Thank you very much!
Sona