the molecular bond broken and deformation after replicating the cell via ML_FF run

suojiang_zhang1
Newbie
Posts: 43
Joined: Tue Nov 19, 2019 4:15 am

#1 Post by suojiang_zhang1 » Mon Jul 17, 2023 3:09 am

Hi,
the final production run keeps causing me trouble.
After nearly 200 ps of training, I thought the ML_FF described my system very accurately: the energy and force errors are low enough to satisfy the suggested values. After refitting the ML_FF, I expected to be able to use it for MD in a larger system and for longer simulation times.
I replicated my box by a factor of 2 in the x, y and z directions, but the result is bad: in some molecules bonds break and the molecules deform.
I followed the trajectory and found that this possibly happens at the interface between two cells, but I have not completely confirmed this.
I also tried building a larger cell directly and running it with the ML_FF, and I found that the same thing happens.
I hope the developers can give me suggestions on how to solve this problem.
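
For readers who want to rule out the replication step itself, the following is a minimal sketch of a 2x2x2 replication in fractional coordinates. It is not the script used in this thread; the function name `replicate_cell` and the example cell are hypothetical. With periodic boundary conditions, a correctly replicated cell contains exactly the same periodic images as the original one, so it should not create a new interface between the copies.

```python
def replicate_cell(lattice, frac_coords, reps=(2, 2, 2)):
    """Return (new_lattice, new_frac_coords) for a replicated cell.

    lattice: three lattice vectors as rows, e.g. [[ax, ay, az], [bx, by, bz], [cx, cy, cz]]
    frac_coords: list of [u, v, w] fractional positions in the original cell
    reps: number of copies along the a, b and c lattice vectors
    """
    na, nb, nc = reps
    # Scale each lattice vector by its repetition count.
    new_lattice = [[n * x for x in vec] for n, vec in zip(reps, lattice)]
    new_coords = []
    # Atoms are the outer loop, so all copies of one atom stay together and
    # a POSCAR that is grouped by species stays grouped by species.
    for u, v, w in frac_coords:
        for i in range(na):
            for j in range(nb):
                for k in range(nc):
                    # Shift by whole cells, then rescale to the larger cell.
                    new_coords.append([(u + i) / na, (v + j) / nb, (w + k) / nc])
    return new_lattice, new_coords


# Hypothetical example: one atom in the centre of a 10 Å cubic cell.
lat, coords = replicate_cell(
    [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]],
    [[0.5, 0.5, 0.5]],
)
print(len(coords), lat[0])
```

The number of atoms per species in the POSCAR header has to be multiplied by the total number of copies (8 for a 2x2x2 replication).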

jonathan_lahnsteiner2
Global Moderator
Posts: 215
Joined: Fri Jul 01, 2022 2:17 pm

#2 Post by jonathan_lahnsteiner2 » Mon Jul 17, 2023 5:20 am

Dear suojiang_zhang1,

Could you please send the input files as stated in the VASP forum guidelines:
forum/viewtopic.php?f=4&t=17928
Could you please also send the ML_FF file you are using?
It would be very helpful if you could also send your original POSCAR and the POSCAR you obtained by replication in the x, y and z directions. As already discussed in your last post about broken bonds in ML_FF simulations, I am not able to help without input files.

All the best Jonathan

#3 Post by suojiang_zhang1 » Mon Jul 17, 2023 5:29 am

The ML_FF file is larger than 800 MB, so I am not able to upload it.

#4 Post by suojiang_zhang1 » Mon Jul 17, 2023 1:46 pm

How should I do this? How can I upload an ML_FF file that is larger than 800 MB?

Please give me suggestions.

#5 Post by jonathan_lahnsteiner2 » Mon Jul 17, 2023 2:28 pm

Dear Suojiang Zhang,

If the ML_FF file is too large, could you try uploading the input files and the OUTCAR file?
Maybe this is already sufficient to analyze your problem. You could also put the ML_FF file into some online storage, such as Dropbox or something similar, and give me short-term access so that I can download it. Without files I am not really able to help.

Was the ML_FF trained at the same conditions (temperature, pressure, ...) as those at which you run the calculation?
Machine-learned force fields tend to show bond breaking when used for extrapolation.

All the best Jonathan

#6 Post by suojiang_zhang1 » Thu Jul 20, 2023 1:30 am

Dear Jonathan,
thank you for your advice.
you can find and download the ML_FF files from our server by sftp mark.meng@159.226.63.134 with password meng@1234.
Please note that the files will only be available for two days, so please download them as soon as possible.

Yours.

#7 Post by jonathan_lahnsteiner2 » Thu Jul 20, 2023 9:01 am

Dear suojiang_zhang1,

We do not use this kind of service. Please upload your files according to the VASP forum guidelines:
forum/viewtopic.php?f=4&t=17928
Please upload your ML_LOGFILE and the OUTCAR file of the production run where the bonds break.

All the best Jonathan

#8 Post by suojiang_zhang1 » Thu Aug 10, 2023 8:37 am

Hi,
My question has not been resolved yet.
My ML_FF, ML_LOGFILE and OUTCAR files each exceed the allowed upload size, so I cannot post the information needed.
Please suggest a way around this difficulty.

Yours.

#9 Post by jonathan_lahnsteiner2 » Thu Aug 10, 2023 9:50 am

Dear suojiang_zhang1,

Without any information about your calculation I am not able to help you.
If the OUTCAR file and the ML_LOGFILE are also too large to upload to the VASP forum, you could try to upload only the last, let's say, 100 MD steps of the OUTCAR and ML_LOGFILE.
Your input files, such as INCAR and POSCAR, could also give some insight into the problem.

Another possible reason for bond breaking is a time step that is too large. We could verify this from your POSCAR and INCAR files.
As already asked, it would be helpful if you sent the POSCAR before and after replication with your script.

All the best Jonathan

#10 Post by suojiang_zhang1 » Fri Aug 11, 2023 8:01 am

Thank you for your reply.
I have uploaded the files.

#11 Post by jonathan_lahnsteiner2 » Fri Aug 11, 2023 8:55 am

Dear suojiang_zhang1,

I have now checked your input files. I saw that your system contains hydrogen atoms. Hydrogen atoms are very light and therefore tend to fly off during molecular dynamics simulations. The frequency of breaking bonds to hydrogen increases with the number of such bonds in your system, which is why you only noticed it during the production run. In addition, during the production run your atoms will enter regions of phase space that are not yet known to the force field.

Therefore, I would recommend decreasing your time step to about POTIM=0.5.
Another thing you could try is to increase the hydrogen mass in the POTCAR file to about 8.0 atomic mass units.
You could do this by changing the line

    POMASS =    1.000; ZVAL   =    1.000    mass and valenz

in your POTCAR file to

    POMASS =    8.000; ZVAL   =    1.000    mass and valenz

Then you should be able to work with your current time step.
But since you are dealing with rather light atoms in general, I would recommend decreasing your time step to

    POTIM=0.5

If the bond breaking still occurs, you could decrease the time step further, since your temperatures are rather high.
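
The link between atomic mass and usable time step can be illustrated with a rough harmonic estimate. The sketch below is only an assumption-laden illustration: it treats a C-H stretch as an isolated diatomic oscillator with an unchanged force constant, uses 3000 cm^-1 as an illustrative wavenumber, and the helper names are hypothetical. It shows how the vibrational period grows when the hydrogen mass is raised from 1 to 8, and how many MD steps of 0.5 fs fit into one period.

```python
import math

C_CM_PER_S = 2.99792458e10  # speed of light in cm/s


def period_fs(wavenumber_cm1):
    """Vibrational period in fs for a mode given in cm^-1."""
    return 1.0e15 / (C_CM_PER_S * wavenumber_cm1)


def scaled_wavenumber(wavenumber_cm1, m_heavy, m_light, m_light_new):
    """Harmonic estimate of a stretch wavenumber after changing one mass.

    Assumes an isolated diatomic oscillator with unchanged force constant,
    so the frequency scales as 1/sqrt(reduced mass).
    """
    mu_old = m_heavy * m_light / (m_heavy + m_light)
    mu_new = m_heavy * m_light_new / (m_heavy + m_light_new)
    return wavenumber_cm1 * math.sqrt(mu_old / mu_new)


ch_stretch = 3000.0  # cm^-1, illustrative value for a C-H stretch
heavy_h = scaled_wavenumber(ch_stretch, 12.011, 1.0, 8.0)

for label, nu in (("POMASS=1", ch_stretch), ("POMASS=8", heavy_h)):
    t = period_fs(nu)
    print(f"{label}: period {t:.1f} fs, steps per period at POTIM=0.5: {t / 0.5:.0f}")
```

The fastest vibration in the system should be resolved by a sufficient number of steps per period; how many are needed depends on the integrator and the system, so the numbers printed here are only a starting point for choosing POTIM.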

Another important thing I noticed in the INCAR file of your training run is that you are setting ML_MB=1600.
This value is definitely too low for the system you are considering.
I would recommend increasing it to

    ML_MB=5000

in the INCAR file of your training. You don't have to start training from scratch; you can simply continue training by copying ML_ABN to ML_AB and setting ML_MB=5000 in the INCAR.
The last thing I would recommend is to train up to a higher temperature, maybe TEEND=600. This will sample more out-of-equilibrium structures and help you avoid broken bonds.
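
The INCAR changes for continuing the training can also be applied with a small helper. This is a minimal sketch, not a VASP tool: `set_incar_tag` is a hypothetical function, the example INCAR content is made up for illustration, and the sketch assumes one tag per line (VASP also allows several tags on one line separated by semicolons, which is not handled here).

```python
import re


def set_incar_tag(incar_text, tag, value):
    """Return INCAR text with `tag = value`, replacing an existing line or appending one.

    Simplification: assumes each tag is on its own line.
    """
    pattern = re.compile(rf"^[ \t]*{re.escape(tag)}[ \t]*=.*$", re.MULTILINE)
    line = f"{tag} = {value}"
    if pattern.search(incar_text):
        return pattern.sub(line, incar_text, count=1)
    sep = "" if incar_text.endswith("\n") or not incar_text else "\n"
    return incar_text + sep + line + "\n"


# Made-up INCAR fragment for illustration only.
incar = "ML_LMLFF = .TRUE.\nML_MB = 1600\nTEBEG = 200\n"
incar = set_incar_tag(incar, "ML_MB", 5000)
incar = set_incar_tag(incar, "TEEND", 600)
print(incar)

# Before restarting, the previous training data would be copied as described
# above, for example with: shutil.copy("ML_ABN", "ML_AB")
```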

I hope this is of help to you.

All the best Jonathan

#12 Post by suojiang_zhang1 » Fri Aug 11, 2023 1:40 pm

Hi,
thank you so much for your kind recommendation.
In my whole workflow, including training and the production run, I already set the mass of hydrogen to 8.0 in my POTCAR file.
So the bonds to hydrogen should not be the problem.
I checked the error evaluation in my ML_LOGFILE and found that the energy and force errors are well converged and low, so I think the ML_FF is good for my system. In addition, how do I know whether the system enters regions of phase space that are not yet known to the force field?

I can try reducing my time step to below 1.0 fs, or to 0.5 fs.
If bond breaking does not occur then, I wonder whether I will have to keep this time step in the following production runs, as that would be very slow.

I can also retrain my system using a larger ML_MB of 5000 over a wider temperature range of 200-600 K.
Yours.

#13 Post by jonathan_lahnsteiner2 » Mon Aug 14, 2023 6:15 am

Dear suojiang_zhang1,

You cannot know a priori whether you will enter regions of phase space that were not yet explored by the force field.
But bonds breaking during your production runs can be an indicator.
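
Such a check can be automated by monitoring selected bond lengths along the trajectory. The following is a minimal sketch under stated assumptions: the function names, the bond list and the 2.0 Å threshold are hypothetical, and the minimum-image wrapping in fractional coordinates is exact for orthorhombic cells but only an approximation for strongly skewed ones.

```python
import math


def min_image_distance(lattice, frac_a, frac_b):
    """Distance in Å between two atoms using the nearest periodic image.

    lattice: three lattice vectors as rows (Å); frac_a, frac_b: fractional coordinates.
    """
    d = [fb - fa for fa, fb in zip(frac_a, frac_b)]
    # Wrap each fractional difference into the interval [-0.5, 0.5].
    d = [x - round(x) for x in d]
    cart = [sum(d[i] * lattice[i][k] for i in range(3)) for k in range(3)]
    return math.sqrt(sum(c * c for c in cart))


def broken_bonds(lattice, frac_coords, bonds, max_length):
    """Return the (i, j) pairs from `bonds` that are longer than max_length (Å)."""
    return [
        (i, j)
        for i, j in bonds
        if min_image_distance(lattice, frac_coords[i], frac_coords[j]) > max_length
    ]


# Made-up snapshot: one B atom and two F atoms in a 10 Å cubic cell.
lattice = [[10.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 10.0]]
coords = [
    [0.98, 0.5, 0.5],  # B
    [0.12, 0.5, 0.5],  # F, bonded to B across the cell boundary (1.4 Å)
    [0.98, 0.8, 0.5],  # F, 3.0 Å away from B
]
print(broken_bonds(lattice, coords, bonds=[(0, 1), (0, 2)], max_length=2.0))
```

Applying this to every stored MD step shows when and where the first bond exceeds the threshold, which helps to decide whether the problem really starts near a cell boundary.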

Another thing: you should not rely on the errors written to the ML_LOGFILE to estimate the accuracy of your force field.
Those are the errors obtained during fitting. To evaluate the accuracy of the force field you have to generate an independent test set.
This is a general procedure when using machine learning approaches.
The test set may contain 500 structures in your desired temperature range. For those structures you have to compute DFT energies, forces and stress tensors. The DFT settings have to be exactly the same as during training. Then compute the energies, forces and stresses of these structures with your machine-learned force field. In this way you obtain two data sets, which you can store in vaspout.h5 files: one for the DFT calculations and one for the MLFF calculations.
You can analyze the errors by computing the root mean square errors between the DFT and MLFF data sets for energy, force and stress separately.
py4vasp also gives you the option to do an error analysis. If you have py4vasp installed, you can run the following script:

    error-analysis -dft [ list of all DFT vaspout.h5 files ] -ml [ list of all MLFF vaspout.h5 files in same order]

This will give you the errors between DFT and MLFF results.
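
If py4vasp is not available, the root mean square errors can also be computed by hand once the DFT and MLFF values have been extracted. The sketch below is not the py4vasp implementation; the helper names are hypothetical, the numbers are made up for illustration, and reading the data from the vaspout.h5 files is left out.

```python
import math


def rmse(reference, predicted):
    """Root mean square error between two equally long sequences of numbers."""
    if len(reference) != len(predicted):
        raise ValueError("data sets must have the same length")
    squared = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    return math.sqrt(squared / len(reference))


def flatten(forces):
    """Flatten a [structure][atom][xyz] nested list into one list of components."""
    return [c for structure in forces for atom in structure for c in atom]


# Made-up test set with two structures (energies in eV per atom, forces in eV/Å).
e_dft = [-5.00, -5.10]
e_ml = [-5.01, -5.08]
f_dft = [[[0.10, 0.00, -0.20]], [[0.30, 0.10, 0.00]]]
f_ml = [[[0.12, 0.01, -0.18]], [[0.27, 0.10, 0.02]]]

print("energy RMSE:", rmse(e_dft, e_ml))
print("force RMSE:", rmse(flatten(f_dft), flatten(f_ml)))
```

Energies, force components and stress components are compared separately, and the structures must appear in the same order in both data sets.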

But as already mentioned in my last post, you should first set ML_MB to a higher value. In my experience it is really low for a system containing hydrogen. Then pick up more local reference structures and continue with a test-set error analysis of your force field.
An example of a test-set analysis can be found here: https://www.nature.com/articles/s41524-021-00630-5
After doing so, I don't think it is necessary to go to a smaller time step anymore if you set POMASS of hydrogen to 8.

All the best Jonathan

#14 Post by suojiang_zhang1 » Thu Aug 24, 2023 1:53 am

Dear Jonathan,
I did the training following your suggestions, increasing ML_MB to 5000 and using POTIM=0.5, but I found that bond breaking still occurs in a long production run.
Moreover, I found that the bond breaking does not involve the H atoms; in my system the serious problem occurs in the B-F bonds.
Please see the attached CONTCAR and the snapshot of the ruined system.

#15 Post by jonathan_lahnsteiner2 » Fri Aug 25, 2023 12:10 pm

Dear suojiang_zhang1,

How many reference configurations do you have for H, B, C, N and F? This is written in the ML_AB file under the line

    The numbers of basis sets per atom type

And do you run your production run at the same conditions as the training run, i.e. at the same temperature and pressure?
Extrapolating to other temperatures and pressures can cause trouble, especially when going to higher temperatures.
Please supply me with the needed information.
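
The requested numbers can be read from the ML_AB header with a few lines of Python. This is a minimal sketch that assumes the header layout of a title line, a dashed separator line and then the values, with blocks separated by lines of asterisks, and that the atom types are listed under a block titled "The atom types in the data file". The function name and the sample header with its counts are made up for illustration, so the layout should be checked against the actual file.

```python
def basis_sets_per_type(ml_ab_text):
    """Map each atom type to its number of basis sets (local reference configurations)."""
    lines = ml_ab_text.splitlines()

    def block_values(title):
        # Collect all whitespace-separated entries between the title line
        # and the next line of asterisks, skipping dashed separator lines.
        for idx, line in enumerate(lines):
            if line.strip() == title:
                values = []
                for follow in lines[idx + 1:]:
                    stripped = follow.strip()
                    if stripped.startswith("*"):
                        break
                    if stripped and set(stripped) != {"-"}:
                        values.extend(stripped.split())
                return values
        raise ValueError(f"header block not found: {title}")

    types = block_values("The atom types in the data file")
    counts = [int(v) for v in block_values("The numbers of basis sets per atom type")]
    return dict(zip(types, counts))


# Made-up header fragment; the counts are invented for illustration.
sample = """\
**************************************************
     The atom types in the data file
--------------------------------------------------
     H      B      C
     N      F
**************************************************
     The numbers of basis sets per atom type
--------------------------------------------------
    1200    300    900
     400    800
**************************************************
"""
print(basis_sets_per_type(sample))
```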

All the best Jonathan
