Hello,
We are trying to compile abinit in parallel for a class, but we are not very familiar with abinit ourselves.
The serial version works fine, but in parallel we get:
getng.F90:431:BUG
The second dimension of the FFT grid, ngfft(2), should be a multiple of the number of processors for the FFT, nproc_fft. However, ngfft(2)= 20 and nproc_fft= 3
Action : contact ABINIT group.
Across 12 nodes, we get files dx_LOG_0001 through 0011 (not 0012, if this makes any difference) and dx_status_P-0001 through 0011.
The input file (below) is the same one that was used last year in class on abinit 5.7.3 in parallel on 4 nodes, 1 core each.
Is there anything about the input file that we should change for version 7.2.2?
Thank you,
Tam
-----------------------------------------------------------------------------------------
# Crystalline silicon : computation of the total energy
# Convergence with respect to the number of k points.
ndtset 4
#Definition of the k-point grids
kptopt 1 # Option for the automatic generation of k points, taking
# into account the symmetry
nshiftk 4
shiftk 0.5 0.5 0.5 # These shifts will be the same for all grids
0.5 0.0 0.0
0.0 0.5 0.0
0.0 0.0 0.5
ngkpt1 2 2 2 # Definition of the different grids
ngkpt2 4 4 4
ngkpt3 6 6 6
ngkpt4 8 8 8
getwfk -1 # This is to speed up the calculation, by restarting
# from previous wavefunctions, transferred from the old
# to the new k-points.
#Definition of the unit cell
acell 3*10.18 # This is equivalent to 10.18 10.18 10.18
rprim 0.0 0.5 0.5 # FCC primitive vectors (to be scaled by acell)
0.5 0.0 0.5
0.5 0.5 0.0
#Definition of the atom types
ntypat 1 # There is only one type of atom
znucl 14 # The keyword "znucl" refers to the atomic number of the
# possible type(s) of atom. The pseudopotential(s)
# mentioned in the "files" file must correspond
# to the type(s) of atom. Here, the only type is Silicon.
#Definition of the atoms
natom 2 # There are two atoms
typat 1 1 # They both are of type 1, that is, Silicon.
xred # This keyword indicates that the location of the atoms
# will follow, one triplet of numbers for each atom
0.0 0.0 0.0 # Triplet giving the REDUCED coordinate of atom 1.
1/4 1/4 1/4 # Triplet giving the REDUCED coordinate of atom 2.
#Definition of the planewave basis set
ecut 8.0 # Maximal kinetic energy cut-off, in Hartree
#Definition of the SCF procedure
nstep 10 # Maximal number of SCF cycles
toldfe 1.0d-6 # Will stop when, twice in a row, the difference
# between two consecutive evaluations of total energy
# is less than toldfe (in Hartree)
diemac 12.0 # Although this is not mandatory, it is worth
# preconditioning the SCF cycle. The model dielectric
# function used as the standard preconditioner
# is described in the "dielng" input variable section.
# Here, we follow the prescription for bulk silicon.
getng.F90:431:BUG contact ABINIT group
Re: getng.F90:431:BUG contact ABINIT group
Hello Tam
This is probably because abinit has set the parallelization parameters automatically and chosen NPFFT=3, which does not divide NGFFT(2)=20. In abinit 5.x, abinit parallelized only over k-points, whereas version 7.2.2 activates the 3-level parallelization scheme by default. You have several options:
1. Change the number of cores you run abinit on, in the hope of getting a distribution that works.
2. Set the parallelization parameters NPFFT, NPKPT, and NPBAND by hand, for instance with NPFFT=1.
3. Disable the 3-level scheme by setting paral_kgb=0. Abinit will then parallelize over k-points only, which avoids any issue with the number of CPUs dedicated to the FFT grid.
If I were you, I'd try option 3 first. If you have a large system (many bands, few k-points), go with option 2 and set NPFFT=1, NPKPT=NSPPOL, and NPBAND to a divisor of the number of bands (nband).
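The divisibility condition quoted in the error message can be checked by hand before launching a run. Below is a minimal Python sketch of that check. It covers only the single constraint stated in the message (ngfft(2) must be a multiple of the number of FFT processors). The helper name valid_npfft and the upper bound of 12 processors are assumptions made for illustration; they are not part of abinit, which may impose further constraints of its own.

```python
def valid_npfft(ngfft2, max_procs):
    """Return the FFT processor counts (1..max_procs) that divide ngfft2 evenly.

    Hypothetical helper for illustration only; it is not an abinit routine.
    """
    return [n for n in range(1, max_procs + 1) if ngfft2 % n == 0]


ngfft2 = 20  # ngfft(2) as reported in the error message
npfft = 3    # nproc_fft as reported in the error message

# The failing case: 20 is not a multiple of 3.
print(ngfft2 % npfft == 0)

# Candidate FFT processor counts for up to 12 processors (assumed bound).
print(valid_npfft(ngfft2, 12))
```

NPFFT=1 always passes this check, which is consistent with the suggestion to set it by hand in option 2.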
Hope this helps
Boris
----------------------------------------------------------
Boris Dorado
Atomic Energy Commission
France
----------------------------------------------------------
Re: getng.F90:431:BUG contact ABINIT group
Hello Boris,
Thank you very much! Option 3 worked perfectly, and we will forward this information to the class so the students know about options 1 and 2 as well.
Thanks again,
Tam
Re: getng.F90:431:BUG contact ABINIT group
Great!
You should mark the problem as solved. The option to do so should be available somewhere in your first post.
Boris
----------------------------------------------------------
Boris Dorado
Atomic Energy Commission
France
----------------------------------------------------------