how to speed up abinit runs ?

hicpalm · Post by **hicpalm** » Tue Oct 12, 2010 10:30 am

Hello,
I am trying to speed up my abinit runs, so I have tried the following builds (CPU: Intel i5):

1- gfortran 4.3, openmpi 1.3.3 or openmpi 1.5
2- ifort 11.1.064, openmpi 1.3.3 or openmpi 1.5

The Intel compiler gives up to a 30% speed-up over gfortran; the problem is that some runs end in a segmentation fault with ifort.

What would you advise, especially for gfortran, which seems stable for me? Should I add some libraries like fftw3 or something else?

thanks.

jzwanzig · Post by **jzwanzig** » Tue Oct 12, 2010 12:29 pm

I get the best results on Intel chips using ifort + mkl. I think mkl is the most important ingredient here. Also, if you use ifort, don't use -O3, use -O2. If you stay with gfortran, you can use -O3, but it still won't be as fast as the ifort code. I would not advise using fftw; I don't think that is in production yet as an option (it's still experimental). As to openmpi, you won't gain speed from the version you use, but be sure you have compiled that version (I would recommend 1.4.3) with the exact compilers you will also build abinit with.
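One way to check the last point is to ask the MPI wrapper which compiler it calls. The sketch below assumes Open MPI, whose wrapper compilers accept `--showme:command`; `wrapper_compiler`, `check_match` and `FC_WANTED` are hypothetical names introduced here, not part of abinit or Open MPI. It compares only the compiler name, so the version still has to be checked separately, for example by comparing `ifort --version` with `mpif90 --version` (Open MPI wrappers normally pass that option through to the underlying compiler).

```shell
#!/bin/sh
# Sketch: compare the compiler behind the MPI Fortran wrapper with the one
# intended for the abinit build. Helper names are hypothetical.

# Print the compiler command that the Open MPI wrapper calls, or nothing
# if the wrapper is not installed.
wrapper_compiler() {
    if command -v "$1" >/dev/null 2>&1; then
        "$1" --showme:command 2>/dev/null
    fi
}

# Succeed only when both arguments name the same program (basename match).
check_match() {
    [ -n "$1" ] && [ "$(basename "$1")" = "$(basename "$2")" ]
}

FC_WANTED="${FC_WANTED:-gfortran}"
FC_WRAPPED="$(wrapper_compiler mpif90)"

if check_match "$FC_WRAPPED" "$FC_WANTED"; then
    echo "mpif90 wraps $FC_WRAPPED: matches $FC_WANTED"
else
    echo "mpif90 wraps ${FC_WRAPPED:-nothing}, expected $FC_WANTED"
fi
```

Run it as `FC_WANTED=ifort sh check_wrapper.sh` (the script name is arbitrary) to test against ifort instead of the gfortran default.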

Alain_Jacques · Post by **Alain_Jacques** » Tue Oct 12, 2010 3:34 pm

Same story here: ifort+MKL gives the best overall results. I compile abinit with the -xHOST -O3 -ip Intel compiler options and get stable binaries, at least with MY input files. If Jo says that O2 is safer, trust his experience. Hicpalm, if you have reproducible segfaults with ifort, more details would be interesting. Install the latest 11.1.073 version, which fixes a fair number of bugs. Avoid no-prec-div or similar fast-math options; they hurt accuracy in Abinit and bring only a marginal speed improvement.

If gfortran is more convenient for you, you can get the best of both worlds by compiling with MKL as the linear algebra library (google for "MKL link advisor" for the right recipe). FFTW is somewhat experimental, but tested. You can compile with FFTW support (through MKL, with the --with-fft-flavor=fftw3-mkl configure flag) and decide at runtime which algorithm to use with the abinit "fftalg" variable. FFTW3-MKL is several times faster than the Goedecker FFT in GW routines; your mileage may vary.
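As a rough illustration, a configure call along these lines could be assembled as below. This is a dry-run sketch, not a tested recipe: `FC` and `FCFLAGS` are the usual autoconf variables and it is an assumption that this abinit version honours them as given, `mpif90` is assumed to wrap ifort, and `DRY_RUN`, `SRC_DIR` and `CMD` are hypothetical names introduced for the example. The MKL link line is left out on purpose, since the link advisor gives the exact libraries for each setup.

```shell
#!/bin/sh
# Dry-run sketch of a configure call using only options named in this thread.
# DRY_RUN, SRC_DIR and CMD are hypothetical variables for this example.
DRY_RUN="${DRY_RUN:-1}"
SRC_DIR="${SRC_DIR:-.}"

# -O2 follows the advice to avoid -O3 with ifort; -xHOST is the ifort option
# mentioned above. No fast-math style option is added.
set -- "FC=mpif90" "FCFLAGS=-O2 -xHOST" "--with-fft-flavor=fftw3-mkl"

# Human-readable form of the command (quoting around FCFLAGS is not shown).
CMD="$SRC_DIR/configure $*"

if [ "$DRY_RUN" = "1" ]; then
    # Only print the command so it can be reviewed first.
    echo "$CMD"
else
    "$SRC_DIR/configure" "$@"
fi
```

With the defaults nothing is executed; setting `DRY_RUN=0` runs configure from `SRC_DIR`. The FFT algorithm is still selected per run through "fftalg" in the input file.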

Stupid optimization trick - 64 bit is significantly faster than 32 bit binaries.

Kind regards,

Alain

hicpalm · Post by **hicpalm** » Wed Oct 13, 2010 11:50 am

Thank you for this very detailed and interesting advice.
Alain, I will give feedback on the seg. faults as soon as possible.
Regards.

hicpalm · Post by **hicpalm** » Thu Oct 14, 2010 12:24 pm

I have tested all the suggestions, but nothing seems to prevent the segmentation fault.

Here is an input file which should reproduce the problem (just an example):

acell    1.4042031646E+01  4.2422025970E+01  1.5293909280E+01
ntypat 1
znucl 20
natom 52 
typat   52*1
      xred    9.8442753484E-01  1.6280951133E-01  8.2146585332E-01
              1.4790876806E-01  1.4387041496E-01  3.1803276341E-01
              3.3431738442E-01  5.1876048510E-01  1.7797054422E-01
              3.1563534693E-01  2.5000000000E-01  1.6937611254E-02
              3.0748376721E-03  5.4782512042E-02  4.2041279381E-02
              1.8274416369E-01  5.9779283066E-01  4.7299926875E-01
              1.9101708773E-01  2.5000000000E-01  6.0548647487E-01
              4.2481957517E-01  2.5000000000E-01  3.7570822010E-01
              5.1557246516E-01  8.3719048867E-01  3.2146585332E-01
              3.5209123194E-01  8.5612958504E-01  8.1803276341E-01
              1.6568261558E-01  4.8123951490E-01  6.7797054422E-01
              1.8436465307E-01  7.5000000000E-01  5.1693761125E-01
              4.9692516233E-01  9.4521748796E-01  5.4204127938E-01
              3.1725583631E-01  4.0220716934E-01  9.7299926875E-01
              3.0898291227E-01  7.5000000000E-01  1.0548647487E-01
              7.5180424833E-02  7.5000000000E-01  8.7570822010E-01
              1.5572465163E-02  6.6280951133E-01  1.7853414668E-01
              8.5209123194E-01  6.4387041496E-01  6.8196723659E-01
              6.6568261558E-01  1.8760485096E-02  8.2202945578E-01
              6.8436465307E-01  7.5000000000E-01  9.8306238875E-01
              9.9692516233E-01  5.5478251204E-01  9.5795872062E-01
              8.1725583631E-01  9.7792830659E-02  5.2700073125E-01
              8.0898291227E-01  7.5000000000E-01  3.9451352513E-01
              5.7518042483E-01  7.5000000000E-01  6.2429177990E-01
              4.8442753484E-01  3.3719048867E-01  6.7853414668E-01
              6.4790876806E-01  3.5612958504E-01  1.8196723659E-01
              8.3431738442E-01  9.8123951490E-01  3.2202945578E-01
              8.1563534693E-01  2.5000000000E-01  4.8306238875E-01
              5.0307483767E-01  4.4521748796E-01  4.5795872062E-01
              6.8274416369E-01  9.0220716934E-01  2.7000731249E-02
              6.9101708773E-01  2.5000000000E-01  8.9451352513E-01
              9.2481957517E-01  2.5000000000E-01  1.2429177990E-01
              1.5572465163E-02  8.3719048867E-01  1.7853414668E-01
              8.5209123194E-01  8.5612958504E-01  6.8196723659E-01
              6.6568261558E-01  4.8123951490E-01  8.2202945578E-01
              9.9692516233E-01  9.4521748796E-01  9.5795872062E-01
              8.1725583631E-01  4.0220716934E-01  5.2700073125E-01
              4.8442753484E-01  1.6280951133E-01  6.7853414668E-01
              6.4790876806E-01  1.4387041496E-01  1.8196723659E-01
              8.3431738442E-01  5.1876048510E-01  3.2202945578E-01
              5.0307483767E-01  5.4782512042E-02  4.5795872062E-01
              6.8274416369E-01  5.9779283066E-01  2.7000731249E-02
              9.8442753484E-01  3.3719048867E-01  8.2146585332E-01
              1.4790876806E-01  3.5612958504E-01  3.1803276341E-01
              3.3431738442E-01  9.8123951490E-01  1.7797054422E-01
              3.0748376721E-03  4.4521748796E-01  4.2041279381E-02
              1.8274416369E-01  9.0220716934E-01  4.7299926875E-01
              5.1557246516E-01  6.6280951133E-01  3.2146585332E-01
              3.5209123194E-01  6.4387041496E-01  8.1803276341E-01
              1.6568261558E-01  1.8760485096E-02  6.7797054422E-01
              4.9692516233E-01  5.5478251204E-01  5.4204127938E-01
              3.1725583631E-01  9.7792830659E-02  9.7299926875E-01
 

 nband 290
 ecut 0.9
 pawecutdg 10
 tsmear 0.1 eV
 occopt 4
 kptopt 1
 kptrlatt    8  0  0   0  3  0   0  0  8
 toldfe 1.0d-8

ecut is set to 0.9 only to speed up the run. At the end of the first iteration I get:

 ITER STEP NUMBER     1
 vtorho : nnsclo_now=  2, note that nnsclo,dbl_nnsclo,istep=  0 0  1
-P-0000  leave_test : synchronization done...
 vtorho: loop on k-points and spins done in parallel
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 10261 on node ab-i5 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

I should say that if I switch to a single k-point with kptopt=0, or completely disable the smearing, the seg fault disappears.

Post by **pouillon** » Fri Oct 15, 2010 11:38 am

If you're using ifort 10.1 or 11.1, you should make sure you're using the very latest build, by checking on the Intel website. It is important to have the latest bugfixes available. You should also make sure that MPI has been compiled with the same version.

I improved the support for Intel compilers in Abinit 6.4, and I ran into all sorts of problems with some releases of ifort 10.1 and 11.1. It's only after installing the latest releases that I could successfully use Abinit.

A possible alternative is to use gfortran 4.5. Jean-Michel and I added support for this version in the Abinit build system, and noticed a remarkable speed-up for some calculations (up to 3 times faster) when activating vectorization. I think that, with carefully chosen optimizations, the performance of gfortran will soon be comparable to that of ifort. Personally, I don't mind being 10-15% slower when the stability is orders of magnitude better.
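To see whether the installed gfortran is recent enough for this, something like the following could be used. It is only a sketch: `version_at_least` is a hypothetical helper written for this example, and it relies on `gfortran -dumpversion`, which prints the compiler's version number.

```shell
#!/bin/sh
# version_at_least is a hypothetical helper: it succeeds when version $1
# (major[.minor[.patch]]) is at least major $2, minor $3.
version_at_least() {
    major="${1%%.*}"
    rest="${1#*.}"
    [ "$rest" = "$1" ] && rest=0     # no dot at all, e.g. "9"
    minor="${rest%%.*}"
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if command -v gfortran >/dev/null 2>&1; then
    GF_VERSION="$(gfortran -dumpversion)"
    if version_at_least "$GF_VERSION" 4 5; then
        echo "gfortran $GF_VERSION: 4.5 or newer"
    else
        echo "gfortran $GF_VERSION: older than 4.5"
    fi
else
    echo "gfortran not found in PATH"
fi
```

The check says nothing about which optimization flags the build system then selects; it only confirms the compiler version.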

hicpalm · Post by **hicpalm** » Thu Oct 21, 2010 10:40 am

indeed, using the latest intel compiler version solves the seg fault problem.
thanks.
