I am having difficulties running Abinit in parallel. All my calculations seem to need more SCF cycles when I run the job in parallel, and some jobs even fail to converge. Sometimes, during a structural relaxation, the job fails to converge at a specific time step when run in parallel, but finishes without problems when run in serial (more specifically, with a single MPI process).
I only seem to see the problem for 2D materials. I have never encountered the problem for bulk materials. I have not tried 1D materials or molecules.
The problem occurs in many different scenarios, but here is an input file that reproduces the error for me:
# Monolayer hcp Fe
# Output quantities
prtwf 0
prtdos 0
prtden 1
#spin related quantities
spinat 0.0 0.0 2.0
nsppol 2
nband 14
# Convergence stuff
tolvrs 1.0d-10
ecut 20.0
pawecutdg 40.
ngkpt 2*15 1
nshiftk 1
shiftk 0.0 0.0 0.0
nstep 200
occopt 3
tsmear 0.01
# Pseudo related
#iscf 17
ixc 1
#Geometry
acell 2*2.46 20 angstrom
rprim sqrt(3/4) 0.5 0
sqrt(3/4) -0.5 0
0.0 0.0 -1
natom 1
ntypat 1
typat 1
xred 0.0 0.0 0.0
znucl 26
When I run this input file on a single core, it converges in 65 SCF cycles, but it fails to converge within 200 SCF cycles on both 2 and 4 cores.
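For anyone who wants to compare the SCF cycle counts of the three runs, here is a minimal sketch of how they could be read from the log files. It assumes that each SCF iteration is logged as a line starting with `ETOT` followed by the iteration number. The function name and the regular expression are only illustrative and are not part of Abinit.

```python
import re
import sys

# Assumption: each SCF iteration appears in the log as a line such as
# " ETOT  12  -1.2345E+02 ...", i.e. "ETOT" followed by the iteration number.
ETOT_LINE = re.compile(r"^\s*ETOT\s+(\d+)\s")


def count_scf_iterations(log_text):
    """Return the highest SCF iteration number found in log_text (0 if none)."""
    last = 0
    for line in log_text.splitlines():
        match = ETOT_LINE.match(line)
        if match:
            last = max(last, int(match.group(1)))
    return last


if __name__ == "__main__":
    # Usage: pass one or more log file paths on the command line.
    for path in sys.argv[1:]:
        with open(path) as handle:
            print(path, count_scf_iterations(handle.read()))
```

For a structural relaxation, where the iteration counter restarts at every time step, this only reports the longest SCF loop, so it is best suited to single-point runs like the one above.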
I use Abinit 7.6.2 with OpenMPI, and I have tried several things to solve the problem, including:
- Trying with another Abinit version (7.4.3)
- Compiling with either the Intel Fortran or the GNU Fortran compiler (both Abinit and MPI)
- Compiling with either MKL or the netlib-fallback library.
- Compiling with or without FFTW3.
- Trying with another computer.
- Trying with both OpenMPI and MPICH2
The upload system on this site does not allow me to attach all of the test files, so I have uploaded them to my own website. The archive contains my input file, my files file and the pseudopotential I used for the calculations, along with the log files for all three runs (1, 2 and 4 cores).
http://ricehigh.dk/problem_example.zip
Can anyone advise me on how to proceed?