Different total energy when running parallel version

Posted: Tue Dec 05, 2006 9:17 am
by slaubach
I am not sure whether I should have posted my question in the Installation forum, but since the program runs and only gives a strange result, I am trying it here.

I am finally able to run the parallel version of VASP, yippee,
but in my first tests with a simple MgO calculation I found different total energies for the parallel version and the normal (serial) version.

My INCAR:

PREC = Accurate
ENCUT = 400.000
IALGO = 48
NELM = 60
NELMIN = 2
EDIFF = 1.0e-04
EDIFFG = -0.02
VOSKOWN = 1
NBLOCK = 1
ISPIN = 1
INIWAV = 1
ISTART = 0
ICHARG = 2
LWAVE = .FALSE.
LCHARG = .TRUE.
ADDGRID = .FALSE.
ISMEAR = -4
SIGMA = 0.2
LREAL = .FALSE.
RWIGS = 1.36 0.73
NEDOS = 1500
NPAR = 4

I also ran the parallel calculation without the NPAR parameter, with the same result.

The normal version gives TOTEN: -11.100358 eV
The parallel version with one node: -9.857838 eV
With any other number of nodes: -9.722229 eV

What went wrong?
I would be really grateful for any help provided.

Posted: Tue Dec 05, 2006 2:27 pm
by admin
Most probably, NBANDS differed between the two runs:
Please note that the number of bands (including empty bands!) used in the electronic SCF procedure is always an integer multiple of the number of nodes that you use in your parallel run.
A few empty bands, however, are needed to provide the necessary variational freedom in the electronic SCF procedure (please have a look at the chapter Theoretical Background: Algorithms used in VASP to calculate the electronic groundstate for further information).
---> Set NBANDS in the serial run to the same value that was used in the parallel run; you can find it by grepping for NBANDS in OUTCAR. The results should then be equal. (This should also always be done when parallel runs are done on different numbers of CPUs.)
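
As a rough illustration of the two steps above (reading NBANDS from OUTCAR and rounding a band count up to a multiple of the node count), here is a minimal Python sketch. The helper names `read_nbands` and `round_up_to_multiple` are made up for this example, the sample OUTCAR line is only an assumed layout, and the numbers (11 bands, 4 nodes) are illustrative rather than taken from a real run.

```python
import re
from typing import Optional


def read_nbands(outcar_text: str) -> Optional[int]:
    """Return the first NBANDS value found in the given OUTCAR text, or None.

    Assumption: OUTCAR reports the value as 'NBANDS=' followed by an integer,
    possibly with spaces around the equals sign.
    """
    match = re.search(r"NBANDS\s*=\s*(\d+)", outcar_text)
    return int(match.group(1)) if match else None


def round_up_to_multiple(nbands: int, nodes: int) -> int:
    """Smallest multiple of `nodes` that is greater than or equal to `nbands`."""
    if nbands < 1 or nodes < 1:
        raise ValueError("nbands and nodes must be positive")
    return ((nbands + nodes - 1) // nodes) * nodes


if __name__ == "__main__":
    # Hypothetical OUTCAR excerpt; the real file contains much more text.
    sample = "   number of bands    NBANDS=     12"
    print("NBANDS in OUTCAR:", read_nbands(sample))

    # Illustrative values: 11 requested bands on 4 nodes versus on 1 node.
    print("4 nodes:", round_up_to_multiple(11, 4))
    print("1 node: ", round_up_to_multiple(11, 1))
```

For a real run, the text would come from the OUTCAR file itself, e.g. `read_nbands(open("OUTCAR").read())`, and the value found there would then be set explicitly as NBANDS in the INCAR of the other run.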

Posted: Tue Dec 05, 2006 3:05 pm
by slaubach
First of all I have to tell you how much I appreciate the fast help one can get in this forum.
Thank you.

Sadly, NBANDS does not seem to be the answer to my question.
In my serial run I had NBANDS = 11; in the parallel run I had NBANDS = 12 with more than 1 node, but also NBANDS = 11 when running the parallel version on 1 node.
So this cannot be the reason for the different total energies of the serial calculation and the parallel version running on 1 node.

But to be sure I restarted the serial run with NBANDS=12.
Here I get TOTEN = -10.537870 eV, which still differs considerably from the -9.722229 eV of the parallel run.
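
To put numbers on the discrepancies, here is a small Python sketch that tabulates the four TOTEN values quoted in this thread relative to the first serial run and compares each difference with EDIFF from the INCAR. The labels and the function name `differences` are just for this example; the energies are copied from the posts above.

```python
# TOTEN values (eV) as quoted in this thread.
ENERGIES = {
    "serial, NBANDS=11": -11.100358,
    "serial, NBANDS=12": -10.537870,
    "parallel, 1 node": -9.857838,
    "parallel, >1 node": -9.722229,
}

EDIFF = 1.0e-4  # electronic convergence criterion from the INCAR (eV)


def differences(energies: dict, reference: str) -> dict:
    """Energy of every run minus the energy of the reference run (eV)."""
    ref = energies[reference]
    return {label: e - ref for label, e in energies.items() if label != reference}


if __name__ == "__main__":
    for label, delta in differences(ENERGIES, "serial, NBANDS=11").items():
        print(f"{label:20s} {delta:+.6f} eV  ({abs(delta) / EDIFF:.0f} x EDIFF)")
```

Every difference comes out between roughly 0.5 and 1.4 eV, several thousand times larger than EDIFF.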

So I am still lost.

Posted: Thu Dec 07, 2006 9:56 am
by tjf
In case my email got killed by a spam filter: you should post a POSCAR (or at least your source for the structure) and I'll see what my builds produce.

Posted: Tue Dec 12, 2006 2:01 pm
by admin
Please check whether everything (apart from the number of processors and the serial/parallel executables) is consistent. This includes:
--) all input parameters (of course)
--) the code version (of course)
--) the compilers and libraries (this includes not only the compiler itself, but also the degree of optimization and other compilation parameters you use)
--) possibly, the processor types (very rarely, processors themselves may be defective and produce slightly deviating results)
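
The first item in the list above (identical input parameters) can be checked mechanically. The following is a minimal Python sketch of one way to compare two INCAR files. It uses a deliberately simplified parser that assumes one `TAG = value` statement per line and treats `#` and `!` as comment markers. The function names `parse_incar` and `diff_params` and the two example inputs are hypothetical, not part of VASP.

```python
import re


def parse_incar(text: str) -> dict:
    """Parse simple 'TAG = value' lines into a dict (simplified sketch).

    Assumptions: one statement per line; anything after '#' or '!' is a comment.
    Tags are upper-cased and whitespace inside values is normalised.
    """
    params = {}
    for raw in text.splitlines():
        line = re.split(r"[#!]", raw, maxsplit=1)[0]
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        params[key.strip().upper()] = " ".join(value.split())
    return params


def diff_params(a: dict, b: dict) -> dict:
    """Return {tag: (value_in_a, value_in_b)} for every tag that differs."""
    tags = sorted(set(a) | set(b))
    return {t: (a.get(t), b.get(t)) for t in tags if a.get(t) != b.get(t)}


if __name__ == "__main__":
    # Hypothetical inputs for a serial and a parallel run.
    serial = "ENCUT = 400.000\nISMEAR = -4\nNBANDS = 11\n"
    parallel = "ENCUT = 400.000\nISMEAR = -4\nNBANDS = 12\nNPAR = 4\n"
    for tag, (left, right) in diff_params(parse_incar(serial), parse_incar(parallel)).items():
        print(f"{tag}: serial={left} parallel={right}")
```

This only covers the INCAR; the other inputs (POSCAR, KPOINTS, POTCAR), the code version, and the build settings would still need to be compared separately.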