VASP 4.6.31 Parallel : Calculation hangs at 'parallel 3dFFT wavefunction'

Posted: Wed Jul 02, 2008 8:01 pm
by sgowtham
Dear VASP Master & Fellow Users:

Our research group recently acquired the Intel Compiler Tool Kit Cluster Edition 3.1.1 (which provides ICC/IFORT 10.1 and CMKL 10.0.3), and I was able to compile the parallel version successfully with the following Makefile. However, when I try running with 2 (or 2^n) processors, the OUTCAR file stops/hangs at the

'parallel 3dFFT wavefunction:'

step (even before the first iteration) and the outfile has this error/warning message:

symbol lookup error: /home/local/bin/vasp_4631p_ictce311_1: undefined symbol: __svml_trunc2

VASP 4.6.31 built with the Intel 9.1 compilers works just fine.

The Makefile is enclosed below; any help in solving this issue would be greatly appreciated.

Regards,
gowtham

* * * * * * * * * * * * * * * * * * * * * * * * * * * *

.SUFFIXES: .inc .f .f90 .F

#
# All CPP processed fortran files have the extension .f90
SUFFIX=.f90

#
# FORTRAN Compiler and Linker
# FC=/home/local/mpich/1.2.7p1/intel/ctce/3.1.1.002/em64t/bin/mpif90
FC=mpif90
FCL=$(FC)

#
# General FORTRAN flags  (there must be a trailing blank on the FFLAGS line)
FFLAGS =  -FR -lowercase -assume byterecl

#
# Optimization
OFLAG=-O3 -xW
OFLAG_HIGH = $(OFLAG)
OBJ_HIGH =
OBJ_NOOPT =
DEBUG  = -FR -O0
INLINE = $(OFLAG)

#
# Following lines specify the position of BLAS  and LAPACK on P4
# Use the MKL Intel libraries for P4 (www.intel.com)
# set -DRPROMU_DGEMV  -DRACCMU_DGEMV in the CPP lines
BLAS=-L/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t -lmkl
BLACS=-L/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t -lmkl_blacs_lp64
GUIDE=-L/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t -lguide
SCALAPACK=-L/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t -lmkl_scalapack

#
# Use LAPACK supplied by Intel MKL as well as by VASP Libraries
# LAPACK=-L/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t -lmkl_lapack64 \
#        ../vasp.4.lib/lapack_double.o
LAPACK0=/home/local/intel/ctce/3.1.1.002/cmkl/10.0.3.020/lib/em64t/libmkl_lapack.so
LAPACK1=../vasp.4.lib/lapack_double.o

#
# Compiler version 7.0 generates some vector statements which are located
# in the svml library, add the LIBPATH and the library (just in case)
LINK=-L/home/local/intel/ctce/3.1.1.002/fce/10.1.015/lib -lsvml

#
CPP_ =  ./preprocess <$*.F | /usr/bin/cpp -P -C -traditional >$*$(SUFFIX)

#
# Additional options for CPP in parallel version (see also above):
# NGZhalf               charge density   reduced in Z direction
# wNGZhalf              gamma point only reduced in Z direction
# scaLAPACK             use scaLAPACK (usually slower on 100 Mbit Net)
CPP    = $(CPP_) -DMPI  -DHOST=\"RAMA_em64t_ROCKS421_RHELAS44_ICTCE311\" -DIFC \
        -Dkind8 -DNGZhalf -DCACHE_SIZE=4000 -DPGF90 -Davoidalloc \
        -DMPI_BLOCK=500 -DPROC_GROUP=8 -DRPROMU_DGEMV -DRACCMU_DGEMV

SCA=

#
# Libraries for MPI
LIB  = -L../vasp.4.lib -ldmy ../vasp.4.lib/linpack_double.o \
       $(LAPACK0) $(LAPACK1) $(SCA) $(BLAS) $(BLACS) $(GUIDE) $(PTHREAD) $(SCALAPACK)

#
# FFT Libraries
# FFT: fftmpi.o with fft3dlib of Juergen Furthmueller
FFT3D   = fftmpi.o fftmpi_map.o fft3dlib.o

#
# General rules and compile lines
BASIC=   symmetry.o symlib.o   lattlib.o  random.o

SOURCE=  base.o     mpi.o      smart_allocate.o      xml.o  \
         constant.o jacobi.o   main_mpi.o  scala.o   \
         asa.o      lattice.o  poscar.o   ini.o      setex.o     radial.o  \
         pseudo.o   mgrid.o    mkpoints.o wave.o      wave_mpi.o  $(BASIC) \
         nonl.o     nonlr.o    dfast.o    choleski2.o    \
         mix.o      charge.o   xcgrad.o   xcspin.o    potex1.o   potex2.o  \
         metagga.o  constrmag.o pot.o      cl_shift.o force.o    dos.o      elf.o      \
         tet.o      hamil.o    steep.o    \
         chain.o    dyna.o     relativistic.o LDApU.o sphpro.o  paw.o   us.o \
         ebs.o      wavpre.o   wavpre_noio.o broyden.o \
         dynbr.o    rmm-diis.o reader.o   writer.o   tutor.o xml_writer.o \
         brent.o    stufak.o   fileio.o   opergrid.o stepver.o  \
         dipol.o    xclib.o    chgloc.o   subrot.o   optreal.o   davidson.o \
         edtest.o   electron.o shm.o      pardens.o  paircorrection.o \
         optics.o   constr_cell_relax.o   stm.o    finite_diff.o \
         elpol.o    setlocalpp.o aedens.o


INC=

vasp: $(SOURCE) $(FFT3D) $(INC) main.o
        rm -f vasp
        $(FCL) -o vasp $(LINK) main.o  $(SOURCE)   $(FFT3D) $(LIB)
makeparam: $(SOURCE) $(FFT3D) makeparam.o main.F $(INC)
        $(FCL) -o makeparam  $(LINK) makeparam.o $(SOURCE) $(FFT3D) $(LIB)
zgemmtest: zgemmtest.o base.o random.o $(INC)
        $(FCL) -o zgemmtest $(LINK) zgemmtest.o random.o base.o $(LIB)
dgemmtest: dgemmtest.o base.o random.o $(INC)
        $(FCL) -o dgemmtest $(LINK) dgemmtest.o random.o base.o $(LIB)
ffttest: base.o smart_allocate.o mpi.o mgrid.o random.o ffttest.o $(FFT3D) $(INC)
        $(FCL) -o ffttest $(LINK) ffttest.o mpi.o mgrid.o random.o smart_allocate.o base.o $(FFT3D) $(LIB)
kpoints: $(SOURCE) $(FFT3D) makekpoints.o main.F $(INC)
        $(FCL) -o kpoints $(LINK) makekpoints.o $(SOURCE) $(FFT3D) $(LIB)

clean:
        -rm -f *.g *.f *.o *.L *.mod ; touch *.F

main.o: main$(SUFFIX)
        $(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c main$(SUFFIX)
xcgrad.o: xcgrad$(SUFFIX)
        $(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcgrad$(SUFFIX)
xcspin.o: xcspin$(SUFFIX)
        $(FC) $(FFLAGS) $(INLINE)  $(INCS) -c xcspin$(SUFFIX)

makeparam.o: makeparam$(SUFFIX)
        $(FC) $(FFLAGS) $(DEBUG)  $(INCS) -c makeparam$(SUFFIX)

makeparam$(SUFFIX): makeparam.F main.F

#
# MIND: I do not have a full dependency list for the include
# and MODULES: here are only the minimal basic dependencies
# if one structure is changed then touch_dep must be called
# with the corresponding name of the structure
#
base.o: base.inc base.F
mgrid.o: mgrid.inc mgrid.F
constant.o: constant.inc constant.F
lattice.o: lattice.inc lattice.F
setex.o: setexm.inc setex.F
pseudo.o: pseudo.inc pseudo.F
poscar.o: poscar.inc poscar.F
mkpoints.o: mkpoints.inc mkpoints.F
wave.o: wave.inc wave.F
nonl.o: nonl.inc nonl.F
nonlr.o: nonlr.inc nonlr.F

$(OBJ_HIGH):
        $(CPP)
        $(FC) $(FFLAGS) $(OFLAG_HIGH) $(INCS) -c $*$(SUFFIX)
$(OBJ_NOOPT):
        $(CPP)
        $(FC) $(FFLAGS) $(INCS) -c $*$(SUFFIX)

fft3dlib_f77.o: fft3dlib_f77.F
        $(CPP)
        $(F77) $(FFLAGS_F77) -c $*$(SUFFIX)

.F.o:
        $(CPP)
        $(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)
.F$(SUFFIX):
        $(CPP)
$(SUFFIX).o:
        $(FC) $(FFLAGS) $(OFLAG) $(INCS) -c $*$(SUFFIX)

# special rules
#-----------------------------------------------------------------------
# these special rules are cumulative (that is, once a file has failed
#   in one compiler version, it stays in the list forever)
# -tpp5|6|7 P, PII-PIII, PIV
# -xW use SIMD (does not pay off on PII, since fft3d uses double prec)
# all other options do not affect the code performance since -O1 is used
#-----------------------------------------------------------------------

fft3dlib.o : fft3dlib.F
        $(CPP)
#       $(FC) -FR -lowercase -O1 -tpp7 -xW -prefetch- -unroll0 -e95 -vec_report3 -c $*$(SUFFIX)
# http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?3.2324
        $(FC) -FR -lowercase -O1 -xW -prefetch- -unroll0 -vec_report3 -c $*$(SUFFIX)
fft3dfurth.o : fft3dfurth.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

radial.o : radial.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

symlib.o : symlib.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

symmetry.o : symmetry.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

dynbr.o : dynbr.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

broyden.o : broyden.F
        $(CPP)
        $(FC) -FR -lowercase -O2 -c $*$(SUFFIX)

us.o : us.F
        $(CPP)
        $(FC) -FR -lowercase -O1 -c $*$(SUFFIX)

wave.o : wave.F
        $(CPP)
        $(FC) -FR -lowercase -O0 -c $*$(SUFFIX)

LDApU.o : LDApU.F
        $(CPP)
        $(FC) -FR -lowercase -O2 -c $*$(SUFFIX)
# Added to overcome LAPACK ZPORTF Routine failure
# http://cms.mpi.univie.ac.at/vasp-forum/forum_viewtopic.php?2.10
# Gowtham, Wed Dec 21 23:13:48 EST 2005
mpi.o : mpi.F
        $(CPP)
        $(FC) -FR -lowercase -O0 -c $*$(SUFFIX)

Posted: Thu Jul 03, 2008 12:54 pm
by admin
The error indicates that the svml_trunc2 routine was not found, so the call fails. libsvml should be added as one of the BLAS libraries. Please check where it is installed on your computer (usually in the .../lib/... subdirectory of the directory where the Intel compiler has been installed). Once you have found it, please check whether the svml_trunc2 routine is included (either with 'ar tv libsvml.a | grep -i svml_trunc2' or 'strings libsvml.so | grep -i svml_trunc2'). If it is not, please use a different BLAS (such as K. Goto's BLAS).
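
The check described above can also be done with nm, which lists the symbols themselves ('ar tv' lists the object files stored in an archive, and 'strings' matches any text, including references to symbols that the library does not define). The following is only a minimal sketch: the helper name defines_symbol and the /path/to/ locations are made up for illustration, and it assumes GNU binutils nm is available.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the `nm` listing on stdin DEFINES
# the given symbol. nm prints "address type name" for defined symbols
# and "U name" for undefined ones, so a definition has three fields.
defines_symbol() {
    awk -v sym="$1" '
        NF >= 3 && $NF == sym && $(NF-1) != "U" { found = 1 }
        END { exit !found }
    '
}

# Usage (placeholder paths, substitute the real library location):
#   nm -D --defined-only /path/to/libsvml.so | defines_symbol __svml_trunc2
#   nm /path/to/libsvml.a | defines_symbol __svml_trunc2
```

The function exits with status 0 only when the symbol is defined, so it can be used directly in an 'if' or followed by '&& echo found'.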

Posted: Tue Sep 30, 2008 12:05 pm
by peterklaver
Hmmm, trying to compile VASP 4.6.35 on a P4 with the latest ifort and MKL also gives me problems with svml_trunc2, and also with the svml_pow2 routines, which are not resolved at link time. With the current ifc version, there is no libsvml. There are a few other libraries like libmkl_vml_p4.so, but none contain the trunc2 or pow2 routines. The ifort version, 10.0.4.023, was installed just prior to compiling, so that should be up to date. Or did Intel introduce a small bug in their latest version? Has anyone compiled VASP on a P4 with the latest ifort recently?

As svml is related to SSE and SSE2, I tried setting the target architecture to a plain old first-generation Pentium, but that didn't fix it.

Any suggestions please?

Posted: Tue Sep 30, 2008 1:52 pm
by peterklaver
Update: using some very dirty mixing of old and new versions of the compiler libraries, I've managed to get it working. Since it works with some old libs, this strengthens my suspicion that the latest ifort libs have a minor bug in not resolving svml_trunc2 and svml_pow2.
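
When old and new compiler libraries are installed side by side, it matters which copy the dynamic loader picks up at run time. The 'symbol lookup error' in the first post is a run-time message, and one possible (unconfirmed in this thread) explanation is that the binary resolves libsvml.so to a copy that lacks the symbol. The sketch below filters ldd output to show where a library name resolves; the helper name resolved_paths is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: reads `ldd` output on stdin and prints
# "name -> path" for every dependency whose name contains PATTERN
# ("not found" if the loader could not resolve it).
resolved_paths() {
    awk -v pat="$1" '
        index($1, pat) && $2 == "=>" {
            print $1, "->", ($3 == "not" ? "not found" : $3)
        }
    '
}

# Usage with the binary named in the first post:
#   ldd /home/local/bin/vasp_4631p_ictce311_1 | resolved_paths libsvml
```

If the printed path points at an older compiler installation, the order of directories in LD_LIBRARY_PATH on the compute nodes would be the first thing to compare against the build environment.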