
Procedure to optimize MLFF training

Posted: Fri Feb 14, 2025 9:32 am
by paulfons

I have generated an MLFF for Ge-Te alloys with the intention of using it to make melt-quenched amorphous samples. This involves heating the sample above the melting point and then rapidly cooling it (at a rate of, for example, 15 K/ps).

For my first try, I used the following temperature ramp with an NPT ensemble; a typical INCAR file is included below. A refit run was completed at the end of the training. I have attached the LCONF lines from the ML_LOGFILE corresponding to each run. I noticed that upon quenching GeTe6 from the melt, I would encounter isolated Te atoms in the melt-quenched material, which are unlikely to exist. What suggestions would you give to improve the training? Should I use longer runs? The training was done on a small cell of 54 atoms. The densities (NPT) seemed in reasonable agreement with experiment. The experimental melting point is roughly 1000 K, and I am interested in Te-rich compositions, so 1500 K is well above the melting point. I intended to randomize the initial structure using a temperature slightly higher than the melting point. (Is it too high for training?)

For reference, I have also attached plots of the err_beef_ctifor values vs. MD step.

1. 400-600K 2,000 steps LCONF 2000 Ge 423 429 Te 411 421
2. 600-800K 5,000 steps LCONF 5000 Ge 657 704 Te 624 668
3. 1000-1200K 5,000 steps LCONF 5000 Ge 798 805 Te 733 741
4. 1200-1400K 5,000 steps LCONF 5000 Ge 1106 1113 Te 1003 1013
5. 1400-1600K 5,000 steps LCONF 5000 Ge 1692 1724 Te 1480 1506
6. 1500-1500K 10,000 steps LCONF 10000 Ge 4701 4753 Te 4440 4474
7. 1500-500K 10,000 steps LCONF 3687 Ge 8000 8027 Te 5672 5696
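
As a rough cross-check of the ramp above, the nominal heating and cooling rates follow from the step counts and the MD time step. The sketch below assumes POTIM = 2.0 fs (as in the INCAR in this post), a linear ramp from TEBEG to TEEND, and that every run completed all of its steps; the function names are illustrative only.

```python
# Ramp stages from the post: (T_start in K, T_end in K, number of MD steps).
STAGES = [
    (400, 600, 2000),
    (600, 800, 5000),
    (1000, 1200, 5000),
    (1200, 1400, 5000),
    (1400, 1600, 5000),
    (1500, 1500, 10000),
    (1500, 500, 10000),
]

POTIM_FS = 2.0  # MD time step in fs, taken from the INCAR in the post


def ramp_rate_k_per_ps(t_start, t_end, n_steps, potim_fs=POTIM_FS):
    """Average temperature ramp rate in K/ps (negative means cooling)."""
    duration_ps = n_steps * potim_fs / 1000.0
    return (t_end - t_start) / duration_ps


def steps_for_rate(t_start, t_end, rate_k_per_ps, potim_fs=POTIM_FS):
    """Number of MD steps needed to cover the ramp at the given rate magnitude."""
    duration_ps = abs(t_end - t_start) / abs(rate_k_per_ps)
    return round(duration_ps * 1000.0 / potim_fs)


if __name__ == "__main__":
    for i, (t0, t1, n) in enumerate(STAGES, start=1):
        rate = ramp_rate_k_per_ps(t0, t1, n)
        print(f"stage {i}: {t0}-{t1} K, {n} steps -> {rate:+.1f} K/ps")
    print("steps for 1500 -> 500 K at 15 K/ps:", steps_for_rate(1500, 500, 15.0))
```

Under these assumptions the 10,000-step quench from 1500 K to 500 K corresponds to a nominal 50 K/ps, faster than the 15 K/ps mentioned as an example target; reaching 15 K/ps with the same time step would take roughly 33,000 steps.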

ALGO = Fast
BMIX = 1
ENMAX = 400
EDIFF = 1E-6
IBRION = 0
ISIF = 3
ISMEAR = 0
ISPIN = 1
ISYM = 0
KBLOCK = 100
LASPH = True
LCHARG = False
LMAXMIX = 4
LORBIT = 11
LPLANE = False
LREAL = False
LSCALU = False
LWAVE = True
NBLOCK = 1
NELM = 500
NELMIN = 4
NSW = 2000
POTIM = 2.0
PREC = Normal
SIGMA = 0.02

IVDW = 12

MDALGO = 3 ! Langevin thermostat
LANGEVIN_GAMMA = 10 10 ! friction
LANGEVIN_GAMMA_L = 10 ! lattice friction
PMASS = 10 ! lattice mass
TEBEG = 400 ! temperature
TEEND = 600

KPAR = 8
NCORE = 4

! machine learning
ML_MODE = train
ML_LMLFF = T
ML_ISTART = 0
ML_WTSIF = 2


Re: Procedure to optimize MLFF training

Posted: Fri Feb 14, 2025 10:39 am
by ferenc_karsai

I've moved your post here to "From users to users".

I've looked at your settings, and they look quite reasonable. I can't judge the training errors because you didn't post the ML_LOGFILE of your last training step. I also didn't see how many k-points you use, but the KPAR tag suggests you use more than one.

Possible sources of error and what you could try:
-) Did you constrain the lattice angles (see https://www.vasp.at/wiki/index.php/ICONST) when training in the melt phase? Rod-like deformations of the cell are very common for liquids. One could have crept into one of the runs and affected your calculations.
-) PBE-D3 (IVDW=11, 12) can sometimes lead to inconsistencies in the energy vs. volume, because the coordination number can jump suddenly between volumes. So it's worth trying a different van der Waals correction such as D2 (IVDW=10).
-) As you mentioned, you can try longer runs (which means slower heating).
-) Try ALGO=Normal if you see any hints of electronic convergence problems (best to check the OSZICAR for steps where the maximum number of electronic steps, NELM, is reached, which means the step is possibly not converged).
-) Finally, you could try increasing computational parameters such as the number of k-points, ENCUT, etc.
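
The OSZICAR check suggested above can be automated. The following is a minimal sketch, not a VASP tool. It assumes the usual OSZICAR layout, in which electronic iterations appear as lines such as "DAV:   3 ..." or "RMM:  12 ..." and each ionic/MD step ends with a summary line starting with the step number followed by "T=" or "F=". The function name and the sample text are made up for illustration.

```python
import re

# Assumed layout: "DAV:   3  -0.1E+02 ..." for electronic iterations,
# "     1 T=   400. E= ..." (MD) or "   1 F= ..." for the ionic-step summary.
ELEC_RE = re.compile(r"^\s*[A-Za-z]{2,3}\s*:\s+(\d+)\s")
IONIC_RE = re.compile(r"^\s*(\d+)\s+[TF]=")


def steps_hitting_nelm(lines, nelm):
    """Return ionic step numbers whose SCF loop used at least `nelm` iterations."""
    flagged = []
    last_iter = 0  # highest electronic iteration seen since the last ionic step
    for line in lines:
        m = ELEC_RE.match(line)
        if m:
            last_iter = int(m.group(1))
            continue
        m = IONIC_RE.match(line)
        if m:
            if last_iter >= nelm:
                flagged.append(int(m.group(1)))
            last_iter = 0  # steps without SCF lines are never flagged
    return flagged


# Invented sample text, only to demonstrate the parser.
SAMPLE = """\
DAV:   1    -0.100E+03   -0.100E+03   -0.5E+03   100   0.5E+02
DAV:   2    -0.110E+03   -0.100E+02   -0.9E+01   120   0.4E+01    0.1E+01
      1 T=   400. E= -0.110E+03 F= -0.111E+03 E0= -0.111E+03
DAV:   1    -0.111E+03   -0.100E+01   -0.9E+00   100   0.3E+01
DAV:   2    -0.111E+03   -0.100E+00   -0.9E-01   120   0.4E+00    0.1E+00
DAV:   3    -0.111E+03   -0.100E-01   -0.9E-02   120   0.4E-01    0.1E-01
      2 T=   402. E= -0.111E+03 F= -0.112E+03 E0= -0.112E+03
"""

if __name__ == "__main__":
    # With a deliberately tiny limit of 3, only the second step is flagged.
    print(steps_hitting_nelm(SAMPLE.splitlines(), nelm=3))
```

For a real run one would pass the lines of the OSZICAR file and the NELM value from the INCAR (500 in the file posted above). A flagged step only indicates that the iteration limit was reached, so it is a hint to inspect that step rather than proof of non-convergence.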

I noticed two things:
-) You use ML_ISTART=0. That tag is deprecated; please use ML_MODE=train instead.
-) You use ML_WTSIF=2. If you have a special reason to need a more accurate stress tensor and are not interested in the energies, you can use that. Otherwise, I think forces and energies are more important, so I would not use that tag: weighting any one of energy, forces, or stress more strongly makes the others less accurate.


Re: Procedure to optimize MLFF training

Posted: Thu Feb 27, 2025 4:17 am
by paulfons

Dear Ferenc,
Thank you for your reply. To answer a few of your questions: I did use 8 k-points for the training. ML_WTSIF=2 was specified in one of the tutorials on the VASP wiki, so I used it, believing it to be a hint. I will drop this tag. I was also using IVDW=12, but I appreciate your comment that a sudden change in coordination could cause an energy jump due to the DFT-D3 term. I will try training again using the DFT-D2 correction as you suggest. I can also try ramping the temperature more slowly, but of course this will take more time. I also used ICONST to keep the cell from deforming into a strange shape, constraining alpha, beta, gamma, and two of the three axis lengths. I relaxed this constraint when the system temperature was below the melting point. The cell shape was reasonable throughout.
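
For readers following along, a constraint file of the kind described above could be generated as follows. This is only a sketch: it assumes the ICONST syntax documented on the VASP wiki ("LA i j status" for the angle between lattice vectors i and j, "LR i status" for the length of lattice vector i, with status 0 meaning constrained). Which two lengths were fixed is not stated in this thread, so the default (1, 2) is purely hypothetical, as is the helper name.

```python
def iconst_lines(fixed_lengths=(1, 2)):
    """Build ICONST lines fixing all three lattice angles and the chosen lengths.

    fixed_lengths: indices (1-3) of the lattice vectors whose lengths are held
    fixed; the default is a hypothetical choice, not taken from the thread.
    """
    # One LA line per pair of lattice vectors, status 0 = constrained (assumed syntax).
    lines = [f"LA {i} {j} 0" for i, j in ((1, 2), (1, 3), (2, 3))]
    # One LR line per lattice vector whose length is held fixed.
    lines += [f"LR {k} 0" for k in fixed_lengths]
    return lines


if __name__ == "__main__":
    # Print the file contents; redirect to a file named ICONST to use them.
    print("\n".join(iconst_lines()))
```

Passing an empty tuple for fixed_lengths leaves only the three angle constraints.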

I was curious whether you had any advice regarding the "ideal" or a reasonable number for LCONF. Also, is there any advice as to what the relationship between BEEF, ERR, and CTIFOR should look like for a successfully trained system? I have attached the last plot of ERR, BEEF, and CTIFOR and the ML_LOGFILE for the last training.

One additional question: as I am aiming for a melt-quenched system, should I also train while reducing the temperature from the melt?

Thanks for any advice you can offer.


Re: Procedure to optimize MLFF training

Posted: Fri Mar 14, 2025 12:29 pm
by ferenc_karsai

Yes, you can also train while reducing the temperature, but only after you already have a reasonably decent force field.