MLFF for bulk liquids
Posted: Fri May 20, 2022 6:44 am
by suojiang_zhang1
I want to use MLFF to simulate a liquid. My system is an ionic compound composed entirely of cations and anions, a liquid like water.
I initialized a cubic box (a=b=c) containing 240 atoms of the elements B, C, F, H, and N. The initial box is too large, so the density is lower than the experimental density. I therefore want to run an NPT simulation to bring the density up to the experimental value.
The following is my INCAR file:
#Basic parameters
ISYM=0
PREC=Normal
ISMEAR = 0
SIGMA = 0.1
LREAL = Auto
ISYM = -1
NELM = 100
EDIFF = 1E-5
LWAVE = .FALSE.
LCHARG = .FALSE.
ECUT=600
#Parallelization of ab initio calculations
NCORE = 2
IVDW=11
#MD
IBRION = 0
MDALGO = 3
LANGEVIN_GAMMA = 10 10 10 10 10
LANGEVIN_GAMMA_L = 10
PMASS=10
ISIF = 3
PSTRESS=0.001
SMASS = 1.0
TEBEG = 500
#TEEND=350
NSW = 2000
POTIM = 1.0
RANDOM_SEED = 88951986 0 0
#Machine learning paramters
ML_LMLFF = .T.
ML_ISTART = 0
The target pressure is 1 bar and the temperature is 300 K.
I saw the temperature rise sharply to more than 4000 K, which makes the run unstable and even breaks bonds.
I also often get the message "MLFF: not enough storage for local configurations, please increase ML_MB".
I do not know how to solve this problem or how to tune the parameters. Please give me some advice.
In addition, I ran an NVT ensemble simulation and obtained an MLFF. I want to apply it to a larger system with the box extended in the x, y, and z directions. Is that feasible? The same question applies to the NPT simulation.
Re: MLFF for bulk liquids
Posted: Fri May 20, 2022 7:46 am
by andreas.singraber
Hello!
Welcome to the VASP forum! Please have a look at the forum posting guidelines and post the in- and output of your run. Most importantly, we are interested in the ML_LOGFILE and the OUTCAR file. If it is very large, please try to submit it in a zipped archive.
Since you are running a liquid in the NpT ensemble, did you use an ICONST file to prevent box tilting?
All the best,
Andreas Singraber
Re: MLFF for bulk liquids
Posted: Fri May 20, 2022 8:56 am
by suojiang_zhang1
Thank you
I can post my OUTCAR, ML_LOGFILE, POSCAR, and INCAR as a zipped archive.
I constrained the lattice angles with this ICONST file:
LA 1 2 0
LA 1 3 0
LA 2 3 0
Re: MLFF for bulk liquids
Posted: Fri May 20, 2022 2:06 pm
by suojiang_zhang1
Hi Andreas Singraber,
Could you take a look at the output files you asked for?
not enough ML_MB in ML_FF
Posted: Sun May 22, 2022 4:19 am
by suojiang_zhang1
Hi
While training my force field, I often get the message "the ML_MB is not enough, increase the ML_MB", especially when I use ML_ISTART=1 to continue training. I can set ML_MB to a larger number, such as 3000 or higher, but then the "Total memory consumption" reported in ML_LOGFILE grows a lot and may exceed the memory of my computer, so the calculation has to stop.
How can I run the training with a larger ML_MB and still have enough memory for it? I do not mind if the training takes longer.
Re: MLFF for bulk liquids
Posted: Mon May 30, 2022 12:59 pm
by ferenc_karsai
Liquids are generally very tough to learn, because they can have too many different configurations. This is reflected in a huge number of required local reference configurations (ML_MB) for a good accuracy.
Also the algorithm scales quadratically with the number of element species, which in your case with 5 species will be quite demanding.
I can't give you a general recipe what to do because for us it is also an ongoing investigation how to deal with complicated liquids, but here are some things you can try if you are satisfied with lower accuracy:
Reduce the cutoffs: ML_RCUT1=4.0, ML_RCUT2=4.0
Reduce the angular cutoff for the three-body descriptor: ML_LMAX2=3 or even 2.
Reduce the number of radial basis functions: ML_MRB1=6, ML_MRB2=6
Increase the maximum allowed size for ML_MB as much as possible (you can see the estimate of the required memory per core at the beginning of the ML_LOGFILE; this information is written out before allocations).
If you see strong imbalances in the number of local reference configurations among the species (grep LCONF ML_LOGFILE), you can set ML_LBASIS_DISCARD=.TRUE. With this setting, once a species reaches ML_MB the code no longer adds to its total number of reference configurations but instead throws away old ones. In your case you have hydrogen as a species, and I'm pretty sure hydrogen will need more reference configurations than the others, possibly too many.
Very important, all these things need to be monitored: grep ERR ML_LOGFILE
Depending on what you want to calculate, the errors on your training data should be in the desired range (usually a few meV/atom for the energies and below 100 meV/Angstrom for the forces).
Once your force field has the desired accuracy on the training data, you need to verify its accuracy on test data. The test data should, if possible, be picked from trajectories other than the training data, but for the same structure types and conditions.
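The monitoring step above can be sketched in Python as follows. This is only an illustration under stated assumptions: it assumes that each data line from `grep ERR ML_LOGFILE` consists of the token ERR followed by the step index and the RMSEs of energy (eV/atom), force (eV/Angstrom), and stress (kB), and that header lines begin with `#`. Please check the header of your own ML_LOGFILE before relying on it. The function names and the default thresholds (5 meV/atom and 100 meV/Angstrom) are hypothetical choices for this example.

```python
def parse_err_lines(lines):
    """Collect (step, rmse_energy, rmse_force, rmse_stress) tuples.

    Assumed layout of a data line (verify against your ML_LOGFILE header):
        ERR <step> <rmse_energy> <rmse_force> <rmse_stress>
    Header lines starting with '#' and malformed lines are skipped.
    """
    records = []
    for line in lines:
        fields = line.split()
        if len(fields) != 5 or fields[0] != "ERR":
            continue
        try:
            step = int(fields[1])
            rmse_energy, rmse_force, rmse_stress = (float(x) for x in fields[2:])
        except ValueError:
            continue
        records.append((step, rmse_energy, rmse_force, rmse_stress))
    return records


def within_target(record, max_energy=5e-3, max_force=0.1):
    """True if energy and force RMSEs are below the (hypothetical) targets."""
    _, rmse_energy, rmse_force, _ = record
    return rmse_energy <= max_energy and rmse_force <= max_force


# Example with made-up lines; for a real run, pass the lines of ML_LOGFILE.
sample = [
    "# ERR nstep rmse_energy rmse_force rmse_stress",
    "ERR 10 4.77E-03 7.28E-02 1.50E+00",
]
for rec in parse_err_lines(sample):
    print(rec, "ok" if within_target(rec) else "too large")
```

For a real log, something like `parse_err_lines(open("ML_LOGFILE"))` would feed the file line by line. Which thresholds are appropriate depends on the property you want to compute.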
Re: MLFF for bulk liquids
Posted: Tue May 31, 2022 1:19 am
by suojiang_zhang1
Hi, Ferenc,
thank you for your kind reply.
At present I can train the ML_FF smoothly when I run on a different cluster, so I think the learning depends strongly on the computer and compiler.
For the liquid, the ensemble I train in is NpT, not NVT. But NpT needs more parameters, as shown in my previous post.
One question: is the external pressure controlled by PSTRESS? Does a value of 0.001 correspond to 1 bar?
When I run grep "press" OUTCAR, I see several different pressures: external pressure, total pressure, Pulay stress, and kinetic pressure.
In addition, how do LANGEVIN_GAMMA, LANGEVIN_GAMMA_L, and PMASS influence the pressure?
Re: MLFF for bulk liquids
Posted: Wed Jun 01, 2022 6:49 am
by suojiang_zhang1
Following up on my post above:
In practice, I tested an MD run with ML_ISTART=2 using an ML_FF. The ML_FF has good accuracy, with rmse_force=0.08.
I used the initial POSCAR as the test system to check the accuracy of the ML_FF, but the result is not good. As the box changes, some molecules also deform, which is not what I want.
Re: MLFF for bulk liquids
Posted: Thu Jun 09, 2022 1:47 pm
by ferenc_karsai
The description of PSTRESS can be found here:
https://www.vasp.at/wiki/index.php/PSTRESS
The unit is kB (kilobar).
The external pressure is the trace of the stress tensor minus a correction from the volume deformation.
This is what you desire as pressure of the system. In a well equilibrated calculation the values of the external pressure should oscillate around the values of the target pressure PSTRESS.
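Since the question above was whether PSTRESS=0.001 means 1 bar, here is a small Python sketch of the unit conversion. It relies only on the fact that 1 kB = 1000 bar = 0.1 GPa; the function names are made up for this example.

```python
KB_TO_BAR = 1000.0  # 1 kB (kilobar) = 1000 bar
KB_TO_GPA = 0.1     # 1 kB (kilobar) = 0.1 GPa


def pstress_to_bar(pstress_kb):
    """Convert a PSTRESS value (given in kB) to bar."""
    return pstress_kb * KB_TO_BAR


def bar_to_pstress(pressure_bar):
    """Convert a target pressure in bar to the PSTRESS value in kB."""
    return pressure_bar / KB_TO_BAR


print(pstress_to_bar(0.001))  # PSTRESS from the first INCAR, in bar
print(pstress_to_bar(0.002))  # PSTRESS from the later INCAR, in bar
print(bar_to_pstress(1.0))    # PSTRESS needed for a 1 bar target, in kB
```

So PSTRESS=0.001 corresponds to 1 bar, while PSTRESS=0.002 corresponds to 2 bar.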
Your ensemble results should be independent of LANGEVIN_GAMMA, LANGEVIN_GAMMA_L, and PMASS. This is the case as long as you don't choose values that are too small or too large (the values in your calculations should be fine).
Concerning the deformation in your test run:
Did you also use the ICONST file for the test run? Without that file, the box would of course change just as in the training runs.
Re: MLFF for bulk liquids
Posted: Fri Jun 10, 2022 1:41 am
by suojiang_zhang1
Hi, Ferenc
Thank you so much.
I understand your explanation of the pressure.
I used the ICONST file to constrain the lattice angles and the ratios of the lattice lengths. I still saw molecular deformation, involving some bond angles and bond lengths.
The run is in the NPT ensemble at a temperature of 300 K and a pressure of 1 bar (PSTRESS=0.001). Initially, the cubic box in the POSCAR is larger than the experimental one.
I set POMASS=8 for hydrogen and POTIM=1.0 to keep the hydrogen atoms from becoming unstable.
In addition, after NSW=50000 steps the lattice length had changed from 14.0 to 11.72, not to the desired experimental value of 10.0.
According to grep "ERR" ML_LOGFILE, the energy error is 4.77e-3 and the force error is 7.28e-2, so the ML_FF should be good.
ML_FF for ionic liquid with lower density at NPT ensemble
Posted: Fri Jul 01, 2022 3:19 am
by suojiang_zhang1
Hi,
I have trained the ML_FF of a liquid (actually an ionic liquid) many times with VASP 6.3.0 in the NPT ensemble, but the density always comes out lower than the experimental one; that is, the resulting box (11.6 angstrom) is larger than the desired box size (10 angstrom). I checked the ML_LOGFILE with grep ERR, and the RMSE of the forces is around 9.0E-02, so I think the accuracy is OK.
My INCAR is:
#Basic parameters
ISYM=0
ISTART=0
ICHARG=2
ALGO=Fast
ECUT=500
EDIFF=1E-5
PREC=Accurate
ISMEAR = 0
SIGMA = 0.1
LREAL = Auto
NELM = 100
#VDW parameters
IVDW=11
#control output
LWAVE=.F.
LCHARG=.F.
#MD
IBRION = 0
MDALGO = 3
ISIF=3
LANGEVIN_GAMMA = 10 10 10 10 10
LANGEVIN_GAMMA_L = 5
PMASS=100
PSTRESS=0.002
SMASS = 1.0
TEBEG = 300
TEEND=400
NSW = 10000
POTIM = 1.0
RANDOM_SEED = 88951986 0 0
#Machine learning paramters
ML_LMLFF = .T.
ML_ISTART = 0
The ICONST is:
LA 1 2 0
LA 1 3 0
LA 2 3 0
LR 1 0
LR 2 0
LR 3 0
S 1 0 0 0 0 0 0
S 0 1 0 0 0 0 0
S 0 0 1 0 0 0 0
S 0 0 0 1 -1 0 0
S 0 0 0 1 0 -1 0
S 0 0 0 0 1 -1 0
I have posted similar questions previously.
Please give me some advice.
Re: MLFF for bulk liquids
Posted: Mon Jul 04, 2022 10:18 am
by ferenc_karsai
I merged the two topics.
I think the density of your system depends first and foremost on the treatment of exchange-correlation you have chosen.
First, you should check the literature to see which functional is recommended for your system.
If the literature gives no hint as to which exchange-correlation functionals reproduce the structural properties of your system well, then you are left with trial and error over different exchange-correlation functionals.
Re: MLFF for bulk liquids
Posted: Mon Jul 04, 2022 12:16 pm
by suojiang_zhang1
Dear Ferenc,
I can confirm that the PBE exchange-correlation functional is used for the opt. in many published papers, and I used it in my calculation as well.
Afterward, I used the RP functional, but the density is still lower than the experimental one.
So I think the density does not depend strongly on the exchange-correlation functional.
Re: MLFF for bulk liquids
Posted: Thu Jul 07, 2022 12:18 pm
by ferenc_karsai
Another thing is how you obtain the density.
For a molecular dynamics (MD) calculation, you need to average the density over a sufficiently large number of MD steps.
You can check the convergence of the density with respect to number of MD steps via block averages.
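A block-average check could look like the following Python sketch. It is a generic illustration, not a VASP tool: the function name is made up, and it assumes you have already extracted one density (or volume) value per MD step into a list. The trajectory is split into equal contiguous blocks, and the spread of the block means gives an error estimate for the overall mean.

```python
import math
import statistics


def block_average(values, n_blocks):
    """Return (mean, standard error) from n_blocks equal contiguous blocks.

    Leftover samples that do not fill a whole block are dropped from the
    start of the series, i.e. from the equilibration side.
    """
    if n_blocks < 2:
        raise ValueError("need at least two blocks")
    block_len = len(values) // n_blocks
    if block_len < 1:
        raise ValueError("not enough samples for the requested blocks")
    trimmed = values[len(values) - block_len * n_blocks:]
    block_means = [
        statistics.fmean(trimmed[i * block_len:(i + 1) * block_len])
        for i in range(n_blocks)
    ]
    mean = statistics.fmean(block_means)
    sem = statistics.stdev(block_means) / math.sqrt(n_blocks)
    return mean, sem


# Made-up density series for illustration only.
series = [1.0, 3.0] * 50
for n in (2, 5, 10):
    print(n, block_average(series, n))
```

If the mean keeps drifting as more steps are added, or the error estimate still changes noticeably when the number of blocks is varied, the run is probably too short to quote a converged density.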
Re: MLFF for bulk liquids
Posted: Thu Jul 07, 2022 12:20 pm
by ferenc_karsai
You should also check the Bayesian error estimates (BEEF) in your production calculations.
If the Bayesian error goes up frequently, it means that, although the error on the training data is low, the training structures do not cover the structures in your production calculations.