trouble with openmpi+(icc+ifort) when using paral_kgb

Total energy, geometry optimization, DFT+U, spin....

Moderator: bguster

ppy
Posts: 24
Joined: Wed May 19, 2010 6:28 pm

trouble with openmpi+(icc+ifort) when using paral_kgb

Post by ppy » Fri Oct 26, 2012 2:58 pm

Dear abinit users and developers,
I am trying to benchmark abinit on my workstation (16 cores), so I ran several tests.
I used abinit 6.12.3 with:
- openmpi 1.6.2 with gcc 4.7.2 (no particular configure options apart from openmpi): everything runs perfectly
- same + mkl library for fftw and linalg: runs perfectly (even a bit faster)

- openmpi 1.6.2 with intel 2013 + mkl library: this is where the trouble starts.
I was able to run the test tY.in, which exercises paral_kgb, so I thought that everything was fine. But with my own input file (a crystal structure optimization), which I use for the benchmark, the job crashes as soon as it starts the scf calculation, without any notice in the log file and with this error message:
forrtl: error (78): process killed (SIGTERM)

I should say that, compared with the tY.in test file, the cell in my benchmark file is much larger (nband 56, ecut 140 Ha).
At the beginning I was using npband 6 and nkpt 2 (there are only 2 k-points in the reduced cell); then I realized that it crashed systematically whenever I used npband.
So I decided to use npfft instead (to see what happens). It seems to run, but this is not satisfactory either, since after some iterations the scf cycle does not converge.
The other difference is that I used GGA instead of the LDA used in the test; I will check whether that makes a difference.
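
As an aside on the npband/npfft settings above: as far as I understand the ABINIT input-variable documentation, with paral_kgb 1 the product npkpt x npband x npfft (x npspinor) is expected to equal the number of MPI processes, and nband should be a multiple of npband. A minimal Python sketch of such a pre-flight check follows. The names KgbLayout and check_layout are hypothetical helpers, not part of ABINIT, and the example values (16 MPI processes, npkpt 2) are assumptions, since the post only states the core count and nkpt.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class KgbLayout:
    """Hypothetical container for the values relevant to paral_kgb."""
    nproc: int       # number of MPI processes actually launched (assumed)
    nkpt: int        # number of k-points in the reduced cell
    nband: int       # number of bands
    npkpt: int       # processes along the k-point level
    npband: int      # processes along the band level
    npfft: int = 1   # processes along the FFT level


def check_layout(layout: KgbLayout) -> List[str]:
    """Return human-readable warnings; an empty list means no issue found."""
    problems = []

    # The process grid should use exactly the MPI processes that were launched
    # (npspinor is taken as 1 in this sketch).
    product = layout.npkpt * layout.npband * layout.npfft
    if product != layout.nproc:
        problems.append(
            f"npkpt*npband*npfft = {product}, but nproc = {layout.nproc}"
        )

    # Assumption: non-spin-polarized run (nsppol = 1), so npkpt <= nkpt.
    if layout.npkpt > layout.nkpt:
        problems.append(
            f"npkpt = {layout.npkpt} exceeds nkpt = {layout.nkpt}"
        )

    # Bands are expected to be shared evenly among the band processes.
    if layout.nband % layout.npband != 0:
        problems.append(
            f"nband = {layout.nband} is not a multiple of npband = {layout.npband}"
        )

    return problems


# Values taken from the post where stated; nproc and npkpt are guesses.
layout = KgbLayout(nproc=16, nkpt=2, nband=56, npkpt=2, npband=6)
problems = check_layout(layout)
for message in problems:
    print("warning:", message)
```

With these assumed values the sketch flags two conditions (2 x 6 x 1 = 12 rather than 16, and 56 is not divisible by 6). This does not establish the cause of the SIGTERM crash; it only points at a layout worth double-checking, for example npband 8 with 16 processes, or npband 7 with 14.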

If any of you have suggestions, I would very much appreciate them.
Thanks a lot
Pierre-Yves

PS: I am wondering whether it would be possible to get the development version of abinit (7.0), to see if I get the same error. Thanks.


Re: trouble with openmpi+(icc+ifort) when using paral_kgb

Post by ppy » Fri Oct 26, 2012 4:43 pm

I tried compiling with intel without the mkl libraries, but I get the same problem.
I also tried LDA instead of GGA...
In view of what Yann Pouillon suggests in the other post "[HALF SOLVED] abinit6.12.3 crashes in more than 1 node", I will try another version of the intel compilers (I used 13.0). I will report back.


Re: trouble with openmpi+(icc+ifort) when using paral_kgb

Post by ppy » Mon Oct 29, 2012 12:14 pm

The problem now is that I have gcc 4.7, with which intel 12.1 doesn't seem to be compatible:
http://software.intel.com/en-us/forums/topic/277969
So I will have to wait for a new version of the intel compilers...


Re: trouble with openmpi+(icc+ifort) when using paral_kgb

Post by ppy » Fri Nov 09, 2012 9:01 am

Finally I tried to compile by installing the gnu compiler 4.6 first, then the intel compiler 12.1, and after that openmpi 1.5.4.
It worked for intel compilers 4.6.3, but then when I ran my abinit-6.12.3, the scf cycle simply did not converge.
Then I tried to compile with intel 12.1.4, but I got the following error message:

EEEEEE : catastrophic error: **Internal compiler error: segmentation violation signal raised** Please report this error along with the circumstances in which it occurred in a Software Problem Report.  Note: File and line given may not be explicit cause of this error.
compilation aborted for nmsq_pure_gkk_sumfs.F90 (code 1)


This is the same error as in the post "Compiling error for 77_ddb/nmsq_pure_gkk_sumfs.F90 in SUSE".
With intel 12.1.5 I get the same error.
As I said in my first post, I also tried intel 13.0: it compiles, but the SCF cycle does not converge.
I am on opensuse 12.1. Do you know whether things would be better with another linux distribution?
Do you also know how much time the intel compilers gain over the GNU compilers for a relaxation calculation? My CPUs are Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz.

Thanks in advance for any answer.
PY


Re: trouble with openmpi+(icc+ifort) when using paral_kgb

Post by ppy » Fri Nov 09, 2012 2:54 pm

I tried intel 12.1.7: it compiles, but the scf cycle does not converge...
