viewtopic.php?f=8&t=2023s
when running MPI jobs on abinit-7.6.3
I have nstep set to 200; the code reaches istep = 50 and then exits, claiming:
200 was not enough SCF cycles to converge;
potential residual= 7.382E-11 exceeds tolvrs= 1.000E-12
So this seems to be the same bug as in the other post. I looked a bit deeper and printed res2 and tolvrs in 67_common/scprqt.F90, both during the loop (choice==2) and after the loop completed (choice==3):
if( ttolvrs==1 .and. res2<tolvrs .and. (.not.noquit)) then
  if (optres==0) then
    write(message, '(a,a,i5,a,1p,e10.2,a,e10.2,a)' ) ch10,&
&    ' At SCF step',istep,' vres2 =',res2,' < tolvrs=',tolvrs,' =>converged.'
  else
    write(message, '(a,a,i5,a,1p,e10.2,a,e10.2,a)' ) ch10,&
&    ' At SCF step',istep,' nres2 =',res2,' < tolvrs=',tolvrs,' =>converged.'
  end if
  call wrtout(ab_out,message,'COLL')
  call wrtout(std_out,message,'COLL')
  write(*,*)'choice2: res2/tolvrs = ',res2,tolvrs ! ADDED
  quit=1
end if
I added a similar write statement at the start of the choice==3 branch.
During the loop, the exit criterion was met on 7 of the 8 nodes:
choice2: res2/tolvrs = 9.666527591608606E-013 1.000000000000000E-012 (7 times)
After the loop, however, the same check showed that one node still held the res2 value from the previous iteration:
choice3: res2/tolvrs = 9.666527591608606E-013 1.000000000000000E-012 (7 times)
choice3: res2/tolvrs = 7.382415892658378E-011 1.000000000000000E-012
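The "(7 times)" counts above can be reproduced from the log instead of being counted by eye. This is only a minimal sketch: it assumes the added debug lines end up in run.log in exactly the format shown above, and the function name tally_res2 is made up for illustration.

```shell
# Hypothetical helper (not part of ABINIT): count how many processes
# printed each distinct "choiceN: res2/tolvrs = ..." debug line.
# Usage: tally_res2 run.log
tally_res2() {
    grep -E 'choice[23]: res2/tolvrs' "$1" \
        | sed -E 's/^[[:space:]]+//; s/[[:space:]]+/ /g' \
        | sort \
        | uniq -c \
        | sed -E 's/^[[:space:]]+//'
}
```

Each output line is a count followed by the normalized debug line, so more than one distinct choice3 line means the processes disagree about res2.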
Additionally, I did not receive the "=>converged" output, so I assume the node that isn't being updated is the master node. Any suggestions on how to fix this error are appreciated. Thanks.
Here is my setup:
abinit-7.6.3
intel 12.1 + mkl + mkl-fftw3
mvapich2 1.7 (shared mem)
../configure \
--prefix=$home/software/abinit/7.6/intel \
--enable-64bit-flags \
--enable-gw-dpc \
--with-linalg-flavor=mkl \
--with-fft-flavor=fftw3-mkl \
--with-mpi-level=2 \
--enable-mpi-io \
FC="mpiifort -mkl=cluster -debug " \
CC="mpiicc -mkl=cluster -debug " \
CXX="mpiicpc -mkl=cluster -debug "
Then I run using:
srun -n 8 -p pdebug ~/Downloads/abinit-7.6.3/debug/src/98_main/abinit < run.files >& run.log
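The debug write statements do not print the MPI rank, so the log alone does not say which task holds the stale res2. One way to check would be to let SLURM prefix every output line with its task number (srun --label, short form -l) and then filter for choice3 lines where res2 is not below tolvrs. This is a rough sketch, assuming the labelled lines look like "3: choice3: res2/tolvrs = ..."; stale_ranks is a hypothetical helper name.

```shell
# Assumed run command: the same as above with -l added, e.g.
#   srun -l -n 8 -p pdebug ~/Downloads/abinit-7.6.3/debug/src/98_main/abinit < run.files >& run.log

# Hypothetical helper (not part of ABINIT or SLURM): print the task
# number and res2 of every task whose choice3 line has res2 >= tolvrs.
# Fields after labelling: $1="<task>:", $2="choice3:", $5=res2, $6=tolvrs.
# Usage: stale_ranks run.log
stale_ranks() {
    awk '$2 == "choice3:" && ($5 + 0) >= ($6 + 0) {
             sub(/:$/, "", $1)   # strip the colon from the task label
             print $1, $5
         }' "$1"
}
```

If the task reported is 0, that would support the guess that the master node is the one not being updated.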