Dear all,
I am experiencing a problem similar to the one reported in a recent thread. I am trying to relax the atomic positions (ionmov 2, optcell 0) of a moderately large cell (80 atoms) on 96 processors with autoparal 1. What I see (see attached output file) is that the number of SCF iterations needed to converge the ground state is more or less stable for about 5 Broyden iterations; then it becomes "infinite" (the energy starts oscillating without ever converging).
Now, I think this is a bug, for the following reasons:
1) The system is insulating with a large gap (2 eV), and does not show any tendency towards gap closure during relaxation.
2) Indeed, if I cut&paste the last "useful" snapshot of xred (the configuration whose SCF ground state failed to converge) into the input file and restart the calculation, everything runs perfectly smoothly for 5-6 Broyden iterations, then SCF convergence goes nuts again.
3) I tried two different schemes for the SCF mixing: iscf 7 (default) and a much more conservative choice (iscf=2, diemac=4, diemix=0.5). Both choices lead to essentially the same outcome.
I don't know exactly how ABINIT behaves from one Broyden cycle to the next: does it progressively "learn" and reuse information about dielectric screening from the previous iterations? But then with iscf=2 this shouldn't be an issue, right?
Any help would be appreciated.
Best,
Max
Progressive worsening of SCF convergence [SOLVED]
Attachment: srtio3.out (313.37 KiB)
Re: Progressive worsening of SCF convergence
Dear All,
I have made further progress at tracking down the bug. I ran the same input file (pasted below) with two different partitions:
(a) 96 cores, autoparal=1
(b) 6 cores, standard parallelization over the 6 k-points
The results are quite different. Here are the values I get for the first 6 Broyden moves:
(a)
ETOT 58 -2486.9458802928 -5.230E-11 5.554E-07 5.848E-07 3.125E-06 5.663E-04
ETOT 29 -2486.9458919044 -5.957E-11 3.942E-07 1.468E-06 3.463E-06 5.202E-04
ETOT 28 -2486.9459087486 -3.911E-11 9.432E-08 3.966E-07 3.457E-06 3.839E-04
ETOT 23 -2486.9459148730 -4.138E-11 3.381E-08 3.616E-07 3.475E-06 4.193E-04
ETOT 26 -2486.9459324610 -6.185E-11 2.435E-08 1.139E-06 3.674E-06 3.991E-04
ETOT 31 -2486.9459410740 -1.488E-09 7.321E-09 1.670E-06 3.821E-06 3.867E-04
(b)
ETOT 17 -2486.9458802916 -2.659E-09 3.005E-04 4.932E-05 2.304E-06 5.652E-04
ETOT 11 -2486.9458918970 2.547E-11 3.554E-05 6.081E-07 8.556E-07 5.206E-04
ETOT 12 -2486.9459087517 -5.366E-11 1.348E-05 8.021E-07 2.213E-06 3.850E-04
ETOT 11 -2486.9459148632 -6.821E-11 5.067E-05 5.954E-07 1.083E-06 4.202E-04
ETOT 12 -2486.9459326088 -1.205E-10 2.671E-05 1.320E-06 2.177E-06 3.990E-04
ETOT 11 -2486.9459418075 -1.346E-10 3.430E-06 1.406E-06 1.334E-06 3.876E-04
As you can see, although the results at convergence seem to match, the number of SCF iterations needed to converge the electronic ground state is typically about 2-3 times larger in (a) than in (b). Moreover, after 6 Broyden iterations, (a) no longer converges, while (b) keeps running smoothly as it should. Clearly, something seems to be going wrong with the parallelization. I also had to set prtwf=0; otherwise the (a) run would hang while writing the wavefunctions. Any help/advice would be appreciated.
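To put numbers on this comparison, here is a minimal Python sketch that parses the ETOT lines quoted above. It assumes that the second field of each line is the SCF iteration count and the third is the total energy in Hartree; the names parse, RUN_A and RUN_B are made up for this illustration.

```python
# ETOT lines copied from the post: (a) 96 cores with autoparal=1,
# (b) 6 cores with parallelization over k-points.
RUN_A = """\
ETOT 58 -2486.9458802928 -5.230E-11 5.554E-07 5.848E-07 3.125E-06 5.663E-04
ETOT 29 -2486.9458919044 -5.957E-11 3.942E-07 1.468E-06 3.463E-06 5.202E-04
ETOT 28 -2486.9459087486 -3.911E-11 9.432E-08 3.966E-07 3.457E-06 3.839E-04
ETOT 23 -2486.9459148730 -4.138E-11 3.381E-08 3.616E-07 3.475E-06 4.193E-04
ETOT 26 -2486.9459324610 -6.185E-11 2.435E-08 1.139E-06 3.674E-06 3.991E-04
ETOT 31 -2486.9459410740 -1.488E-09 7.321E-09 1.670E-06 3.821E-06 3.867E-04
"""
RUN_B = """\
ETOT 17 -2486.9458802916 -2.659E-09 3.005E-04 4.932E-05 2.304E-06 5.652E-04
ETOT 11 -2486.9458918970 2.547E-11 3.554E-05 6.081E-07 8.556E-07 5.206E-04
ETOT 12 -2486.9459087517 -5.366E-11 1.348E-05 8.021E-07 2.213E-06 3.850E-04
ETOT 11 -2486.9459148632 -6.821E-11 5.067E-05 5.954E-07 1.083E-06 4.202E-04
ETOT 12 -2486.9459326088 -1.205E-10 2.671E-05 1.320E-06 2.177E-06 3.990E-04
ETOT 11 -2486.9459418075 -1.346E-10 3.430E-06 1.406E-06 1.334E-06 3.876E-04
"""


def parse(block):
    """Return (iteration count, total energy) for each ETOT line.

    Assumption: field 1 is the SCF iteration count, field 2 the energy.
    """
    rows = []
    for line in block.splitlines():
        fields = line.split()
        if fields and fields[0] == "ETOT":
            rows.append((int(fields[1]), float(fields[2])))
    return rows


run_a = parse(RUN_A)
run_b = parse(RUN_B)

# Ratio of SCF iteration counts, (a) over (b), for each Broyden move.
ratios = [a[0] / b[0] for a, b in zip(run_a, run_b)]
# Largest disagreement between the converged total energies.
max_energy_gap = max(abs(a[1] - b[1]) for a, b in zip(run_a, run_b))

for move, ((n_a, e_a), (n_b, e_b)) in enumerate(zip(run_a, run_b), start=1):
    print(f"move {move}: {n_a:3d} vs {n_b:3d} iterations, "
          f"ratio {n_a / n_b:.2f}, |dE| = {abs(e_a - e_b):.2e} Ha")
print(f"largest energy difference: {max_energy_gap:.2e} Ha")
```

With these six moves the ratio stays between roughly 2 and 3.5, while the converged energies of the two runs agree to better than 1e-6 Ha.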
Thanks,
Max
--------input file follows---------
# These parameters are specific to the GS calculation
kptopt 1
toldff 5.d-6
prtwf 0
ionmov 2
ntime 20
iscf 2
#autoparal 1
# Reduced coordinates
xred 0.0000000000E+00 2.2060047445E-03 3.8315596072E-02
0.0000000000E+00 1.2423922811E-01 5.4674259111E-01
5.0000000000E-01 -1.3166509999E-03 4.3407078227E-02
5.0000000000E-01 1.2650651518E-01 5.4664657573E-01
0.0000000000E+00 2.5076077190E-01 4.6742591111E-02
0.0000000000E+00 3.7279399527E-01 5.3831559607E-01
5.0000000000E-01 2.4849348483E-01 4.6646575726E-02
5.0000000000E-01 3.7631665101E-01 5.4340707823E-01
0.0000000000E+00 5.0220600474E-01 -3.8315596072E-02
0.0000000000E+00 6.2423922811E-01 4.5325740889E-01
5.0000000000E-01 4.9868334900E-01 -4.3407078227E-02
5.0000000000E-01 6.2650651518E-01 4.5335342427E-01
0.0000000000E+00 7.5076077190E-01 -4.6742591111E-02
0.0000000000E+00 8.7279399527E-01 4.6168440393E-01
5.0000000000E-01 7.4849348483E-01 -4.6646575726E-02
5.0000000000E-01 8.7631665101E-01 4.5659292177E-01
2.5070678128E-01 -1.6712876201E-04 5.0667219420E-01
2.5067143647E-01 1.2496053139E-01 6.8631548103E-03
7.4929321872E-01 -1.6712876201E-04 5.0667219420E-01
7.4932856353E-01 1.2496053139E-01 6.8631548103E-03
2.5067143647E-01 2.5003946862E-01 5.0686315481E-01
2.5070678128E-01 3.7516712877E-01 6.6721941968E-03
7.4932856353E-01 2.5003946862E-01 5.0686315481E-01
7.4929321872E-01 3.7516712877E-01 6.6721941968E-03
2.5070678128E-01 4.9983287124E-01 4.9332780580E-01
2.5067143647E-01 6.2496053139E-01 -6.8631548103E-03
7.4929321872E-01 4.9983287124E-01 4.9332780580E-01
7.4932856353E-01 6.2496053139E-01 -6.8631548103E-03
2.5067143647E-01 7.5003946862E-01 4.9313684519E-01
2.5070678128E-01 8.7516712877E-01 -6.6721941968E-03
7.4932856353E-01 7.5003946862E-01 4.9313684519E-01
7.4929321872E-01 8.7516712877E-01 -6.6721941968E-03
0.0000000000E+00 -1.5717051875E-02 4.5296302737E-01
0.0000000000E+00 1.3895966156E-01 -3.1721148332E-02
2.2041950405E-01 5.6460624777E-02 2.1954010094E-01
2.1586181514E-01 6.7699343126E-02 7.2080043186E-01
2.7551465408E-01 -6.2499999995E-02 7.5000000000E-01
2.9965545511E-01 -6.2499999995E-02 2.5000000000E-01
5.0000000000E-01 1.7000654308E-02 4.7949707472E-01
5.0000000000E-01 1.0992402812E-01 -3.4407269129E-02
7.7958049595E-01 5.6460624777E-02 2.1954010094E-01
7.8413818486E-01 6.7699343126E-02 7.2080043186E-01
7.2448534592E-01 -6.2499999995E-02 7.5000000000E-01
7.0034454489E-01 -6.2499999995E-02 2.5000000000E-01
0.0000000000E+00 2.3604033845E-01 4.6827885167E-01
0.0000000000E+00 3.9071705188E-01 -4.7036972630E-02
2.1586181514E-01 3.0730065688E-01 2.2080043186E-01
2.2041950405E-01 3.1853937523E-01 7.1954010094E-01
2.8330920875E-01 1.8691872385E-01 7.4071714921E-01
2.8330920875E-01 1.8808127616E-01 2.4071714921E-01
5.0000000000E-01 2.6507597189E-01 4.6559273087E-01
5.0000000000E-01 3.5799934570E-01 -2.0502925275E-02
7.8413818486E-01 3.0730065688E-01 2.2080043186E-01
7.7958049595E-01 3.1853937523E-01 7.1954010094E-01
7.1669079125E-01 1.8691872385E-01 7.4071714921E-01
7.1669079125E-01 1.8808127616E-01 2.4071714921E-01
0.0000000000E+00 4.8428294813E-01 5.4703697263E-01
0.0000000000E+00 6.3895966156E-01 3.1721148332E-02
2.1586181514E-01 5.6769934313E-01 2.7919956814E-01
2.2041950405E-01 5.5646062478E-01 7.8045989906E-01
2.9965545511E-01 4.3750000000E-01 7.5000000000E-01
2.7551465408E-01 4.3750000000E-01 2.5000000000E-01
5.0000000000E-01 5.1700065431E-01 5.2050292528E-01
5.0000000000E-01 6.0992402812E-01 3.4407269129E-02
7.8413818486E-01 5.6769934313E-01 2.7919956814E-01
7.7958049595E-01 5.5646062478E-01 7.8045989906E-01
7.0034454489E-01 4.3750000000E-01 7.5000000000E-01
7.2448534592E-01 4.3750000000E-01 2.5000000000E-01
0.0000000000E+00 7.3604033845E-01 5.3172114833E-01
0.0000000000E+00 8.9071705188E-01 4.7036972630E-02
2.2041950405E-01 8.1853937523E-01 2.8045989906E-01
2.1586181514E-01 8.0730065688E-01 7.7919956814E-01
2.8330920875E-01 6.8808127616E-01 7.5928285079E-01
2.8330920875E-01 6.8691872385E-01 2.5928285079E-01
5.0000000000E-01 7.6507597189E-01 5.3440726913E-01
5.0000000000E-01 8.5799934570E-01 2.0502925275E-02
7.7958049595E-01 8.1853937523E-01 2.8045989906E-01
7.8413818486E-01 8.0730065688E-01 7.7919956814E-01
7.1669079125E-01 6.8808127616E-01 7.5928285079E-01
7.1669079125E-01 6.8691872385E-01 2.5928285079E-01
# Part of the perovskite input file that contains material-specific
# declarations, i.e. species, k-point grid, cell parameters, etc.
#Definition of the unit cell
acell 3*7.7439
rprim
2.000000000000 0.000000000000 0.000000000000
0.000000000000 5.656860049491 0.000000000000
0.000000000000 0.000000000000 1.414215012373
# This might help convergence a bit
diemac 4.0
diemix 0.5
#Definition of the atom types (PbZrO3)
ntypat 3
znucl 82 40 8
#Definition of the atoms
natom 80 # 80 atoms in the cell
typat 16*1 16*2 48*3 # 16 atoms of type 1 (Pb), 16 of type 2 (Zr), 48 of type 3 (O)
#Definition of the planewave basis set
ecut 60.0 # Maximal kinetic energy cut-off, in Hartree
#Exchange-correlation functional
ixc 7 # LDA CA-PW92
#Definition of the k-point grid
ngkpt 4 2 6
nshiftk 1
shiftk 0.5 0.5 0.5
#Definition of the SCF procedure
nstep 100 # Maximal number of SCF cycles
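As a sanity check on the input above, the short Python sketch below expands the n*value shorthand used for typat and confirms that it is consistent with natom and znucl. The helper name expand is hypothetical, and only the simple count*value form that appears in this input is handled.

```python
def expand(tokens):
    """Expand the 'count*value' shorthand, e.g. '16*1' -> ['1'] * 16.

    Only the explicit count*value form is handled here (an assumption
    made for this sketch); plain values are passed through unchanged.
    """
    values = []
    for token in tokens.split():
        if "*" in token:
            count, value = token.split("*")
            values.extend([value] * int(count))
        else:
            values.append(token)
    return values


# Values copied from the input file.
natom = 80
znucl = [82, 40, 8]          # Pb, Zr, O
typat = [int(v) for v in expand("16*1 16*2 48*3")]

# Number of atoms of each species, keyed by nuclear charge.
counts = {z: typat.count(i + 1) for i, z in enumerate(znucl)}

print(len(typat) == natom)   # True
print(counts)                # {82: 16, 40: 16, 8: 48}
```

The 16:16:48 split matches the 1:1:3 stoichiometry of PbZrO3, and the 80 rows of xred match natom.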
Re: Progressive worsening of SCF convergence [SOLVED]
Hi Max,
This looks like a problem related to the compilation optimization flags you used.
See for example the following posts:
https://forum.abinit.org/viewtopic.php?f=9&t=3505
https://forum.abinit.org/viewtopic.php?f=9&t=4028
Did you use --enable-avx-safe-mode?
Cheers,
Eric
Re: Progressive worsening of SCF convergence
Eric,
Thanks a lot for the advice! I recompiled with -O2 and --enable-avx-safe-mode and things seem to work fine now. Of course, the kgb algorithm is still significantly less efficient (in terms of iterations) than the serial band-by-band minimization (I guess one needs to live with that), but I no longer get the abrupt degradation of the SCF loop after iteration 5/6.
Best,
Max