
Problem running VASP with USPEX (a crystal structure prediction code)

Posted: Mon Sep 16, 2013 9:43 am
by dx0620
Here is the error output:

-----------------------------------------------------------------------------
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libmkl_avx.so 00002B7258D44524 Unknown Unknown Unknown
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
vasp 000000000066A3DF Unknown Unknown Unknown
vasp 0000000000670B88 Unknown Unknown Unknown
vasp 0000000000D407B4 Unknown Unknown Unknown
vasp 00000000004682E4 Unknown Unknown Unknown
vasp 000000000043F07C Unknown Unknown Unknown
libc.so.6 0000003469A1ECDD Unknown Unknown Unknown
vasp 000000000043EF79 Unknown Unknown


--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 30349 on node cluster.ysu.org exited on signal 9 (Killed).
--------------------------------------------------------------------------
mpirun -np 8 vasp >log: Killed
Structure5 step4 at CalcFold1
fingerprint CalcTime = 0.069814 sec

[cluster.ysu.org][[43129,1],4][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] [cluster.ysu.org][[43129,1],7][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43129,1],2][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43129,1],5][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43129,1],1][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43129,1],0][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43129,1],3][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libc.so.6 0000003469AE7C53 Unknown Unknown Unknown
libmpi.so.1 00002B74FCA4BE0B Unknown Unknown Unknown
libmpi.so.1 00002B74FCA4DF7A Unknown Unknown Unknown
libmpi.so.1 00002B74FCA72E39 Unknown Unknown Unknown
libmpi.so.1 00002B74FC8CD3B5 Unknown Unknown Unknown
libmpi.so.1 00002B74FC8F1E17 Unknown Unknown Unknown
libmpi_f77.so.1 00002B74FC636D7E Unknown Unknown Unknown
vasp 0000000000494712 Unknown Unknown Unknown
vasp 00000000005EAE5F Unknown Unknown


--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 6 with PID 30477 on node cluster.ysu.org exited on signal 9 (Killed).
--------------------------------------------------------------------------
mpirun -np 8 vasp >log: Killed
Structure6 step1 at CalcFold1
[cluster.ysu.org][[43088,1],0][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43088,1],6][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
[cluster.ysu.org][[43088,1],5][btl_tcp_frag.c:215:mca_btl_tcp_frag_recv] mca_btl_tcp_frag_recv: readv failed: Connection reset by peer (104)
forrtl: error (78): process killed (SIGTERM)
Image PC Routine Line Source
libc.so.6 0000003469AE7C53 Unknown Unknown Unknown
libmpi.so.1 00002AFCBE7EAE0B Unknown Unknown


--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun noticed that process rank 2 with PID 30514 on node cluster.ysu.org exited on signal 9 (Killed).
--------------------------------------------------------------------------
mpirun -np 8 vasp >log: Killed
Structure6 step2 at CalcFold1
Structure6 step3 at CalcFold1
Structure6 step4 at CalcFold1
fingerprint CalcTime = 0.16146 sec

--------------------------------------------------------------------------
MPI_ABORT was invoked on rank 4 in communicator MPI_COMM_WORLD
with errorcode 1.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes.
You may or may not see output from other processes, depending on
exactly when Open MPI kills them.
--------------------------------------------------------------------------
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 32724 on
node cluster.ysu.org exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).
--------------------------------------------------------------------------
[cluster.ysu.org:32723] 7 more processes have sent help message help-mpi-api.txt / mpi-abort
[cluster.ysu.org:32723] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages
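As the last log line suggests, the suppressed per-rank help messages can be shown by disabling Open MPI's help-message aggregation. A minimal sketch of the re-run, keeping the process count and redirect from the original command (adjust paths for your cluster):

```shell
# Disable help-message aggregation so every rank's error text is printed,
# then re-run the same VASP job and capture the output in "log".
mpirun --mca orte_base_help_aggregate 0 -np 8 vasp > log
```

This does not fix the underlying crash, but the full per-rank messages can help show which rank aborts first.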


--------------------------------------------------------------------------
4-680.502264267-715.879302780 498.211054496-603.031961424-637.750335756-180.342665855-588.338007790************** 221.938377144
4-680.501167377-715.878148866 498.210251438-603.030989407-637.749307777-180.342375164-588.337059457************** 221.938019405
internal error in RAD_INT: RHOPS /= RHOAE
4-299.171882141-307.412222583 263.684996702-236.887230167-314.827284945 -74.117914498-295.281287502-493.829727976 85.192169357
4-299.171399911-307.411727070 263.684571673-236.886848333-314.826777480 -74.117795028-295.280811543-493.828931980 85.192032037
internal error in RAD_INT: RHOPS /= RHOAE


THANKS

Re: Problem running VASP with USPEX (a crystal structure prediction code)

Posted: Wed Sep 11, 2024 2:22 pm
by support_vasp

Hi,

We're sorry that we didn't answer your question; this does not live up to the quality of support that we aim to provide. The team has since expanded, so if we can still help with your problem, please ask again in a new post, linking back to this one, and we will answer as quickly as possible.

Best wishes,

VASP