VASP6 is a few times slower than 5.4 on AMD CPU
Posted: Fri Jul 19, 2024 8:31 am
Hi,
After a quick installation of VASP on my machine, I noticed a performance issue. The test job had still not finished after several hours, so I interrupted it. I then ran a small task with two versions of VASP. With version 5.4, the task completed in 1 minute. With version 6, the program had not even entered the main loop after the same time. I found a post on the forum saying that the problem is related to mpirun and suggesting the command I_MPI_FABRICS=shm vasp_std, but this solution does not work for me. Although I assign the calculation to 32 CPUs (mpirun -np 32), VASP spreads across all 64 threads. How can I solve this problem? I attach my makefile.include below. My CPU is an AMD Ryzen Threadripper PRO 5975WX 32-Cores, and I have only GNU libraries. Thanks in advance!
# Default precompiler options
CPP_OPTIONS = -DHOST=\"LinuxGNU\" \
-DMPI -DMPI_BLOCK=8000 -Duse_collective \
-DscaLAPACK \
-DCACHE_SIZE=4000 \
-Davoidalloc \
-Dvasp6 \
-Duse_bse_te \
-Dtbdyn \
-Dfock_dblbuf
CPP = gcc -E -C -w $*$(FUFFIX) >$*$(SUFFIX) $(CPP_OPTIONS)
FC = mpif90
FCL = mpif90
FREE = -ffree-form -ffree-line-length-none
FFLAGS = -w -ffpe-summary=none
OFLAG = -O2
OFLAG_IN = $(OFLAG)
DEBUG = -O0
OBJECTS = fftmpiw.o fftmpi_map.o fftw3d.o fft3dlib.o
OBJECTS_O1 += fftw3d.o fftmpi.o fftmpiw.o
OBJECTS_O2 += fft3dlib.o
# For what used to be vasp.5.lib
CPP_LIB = $(CPP)
FC_LIB = $(FC)
CC_LIB = gcc
CFLAGS_LIB = -O
FFLAGS_LIB = -O1
FREE_LIB = $(FREE)
OBJECTS_LIB = linpack_double.o
# For the parser library
CXX_PARS = g++
LLIBS = -lstdc++
##
## Customize as of this point! Of course you may change the preceding
## part of this file as well if you like, but it should rarely be
## necessary ...
##
# When compiling on the target machine itself, change this to the
# relevant target when cross-compiling for another architecture
VASP_TARGET_CPU ?= -march=native
FFLAGS += $(VASP_TARGET_CPU)
# For gcc-10 and higher (comment out for older versions)
FFLAGS += -fallow-argument-mismatch
# BLAS and LAPACK (mandatory)
OPENBLAS_ROOT ?= /programs/lapack-3.11/
BLASPACK = -L$(OPENBLAS_ROOT)/lib -lopenblas
# scaLAPACK (mandatory)
SCALAPACK_ROOT ?= /programs/scalapack-2.2.0
SCALAPACK = -L$(SCALAPACK_ROOT) -lscalapack
LLIBS += $(SCALAPACK) $(BLASPACK)
# FFTW (mandatory)
FFTW_ROOT ?= /usr/include
LLIBS += -L$(FFTW_ROOT)/lib -lfftw3
INCS += -I$(FFTW_ROOT)/
# HDF5-support (optional but strongly recommended)
CPP_OPTIONS+= -DVASP_HDF5
HDF5_ROOT ?= /usr/lib/x86_64-linux-gnu/hdf5/serial
LLIBS += -L$(HDF5_ROOT)/ -lhdf5_fortran
INCS += -I$(HDF5_ROOT)/include
# For the VASP-2-Wannier90 interface (optional)
#CPP_OPTIONS += -DVASP2WANNIER90
#WANNIER90_ROOT ?= /path/to/your/wannier90/installation
#LLIBS += -L$(WANNIER90_ROOT)/lib -lwannier
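A possible way to check the thread spread described above (32 MPI ranks occupying all 64 hardware threads) is to cap every rank at one thread before launching. This is only a sketch under assumptions that the post does not confirm: that the extra threads come from a threaded BLAS or an OpenMP runtime, that the linked BLAS honours OPENBLAS_NUM_THREADS (true only for OpenBLAS), and that the MPI implementation is Open MPI, whose mpirun accepts --bind-to core. The launch line is left commented out because the binding flag differs between MPI implementations.

```shell
#!/bin/sh
# Cap each MPI rank at a single thread so that 32 ranks use 32 threads in total.
export OMP_NUM_THREADS=1        # OpenMP threads per rank
export OPENBLAS_NUM_THREADS=1   # only honoured if the linked BLAS is OpenBLAS (assumption)

NRANKS=32                       # matches "mpirun -np 32" from the post
TOTAL_THREADS=$((NRANKS * OMP_NUM_THREADS))
echo "ranks=${NRANKS} threads_per_rank=${OMP_NUM_THREADS} total=${TOTAL_THREADS}"

# Assumption: Open MPI syntax. Other MPI implementations spell the binding
# option differently, so check the local mpirun manual before uncommenting.
# mpirun -np "${NRANKS}" --bind-to core vasp_std
```

If the job still occupies 64 threads with these settings, the oversubscription probably has a different origin, such as the MPI launcher's default process placement.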