trouble with openmpi+(icc+ifort) when using paral_kgb
Posted: Fri Oct 26, 2012 2:58 pm
Dear abinit users and developers,
I am trying to benchmark abinit on my station (16 cores). So I did several tests.
I used abinit 6.12.3
- openmpi 1.6.2 with gcc 4.7.2 (no particular configuration options except openmpi): everything runs perfectly
- same + mkl library for fftw and linear algebra: everything runs perfectly (even a bit faster)
- openmpi 1.6.2 with intel 2013 + mkl library: here I start to have some trouble.
I was able to run the test tY.in, where paral_kgb is tested, so I thought that everything was fine. But with my own input file (a crystal structure optimization), which I use for the benchmark, the job crashes as soon as it starts the scf calculation, without any notice in the log file and with this error message:
forrtl: error (78): process killed (SIGTERM)
I have to say that, compared to the tY.in test file, the cell in my benchmark file is much larger (nbands 56, ecut 140 Ha).
At the beginning I was using npband 6 and nkpt 2 (there are only 2 k-points in the reduced cell); then I realized that it crashed systematically whenever I was using npband.
So I decided to use npfft instead (to check what happens). It seems to work, but this is not satisfying, since it turns out that after some iterations the scf cycle does not converge.
The other difference is that I used GGA instead of the LDA used in the test; I will check whether that makes a difference.
If some of you have suggestions, I would very much appreciate them.
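One thing that may be worth checking first is whether the process layout itself is consistent. The sketch below is only a sanity check, not part of abinit. It assumes, as I understand the paral_kgb documentation, that the product npkpt*npband*npfft*npspinor should equal the number of MPI processes and that nband should be a multiple of npband*bandpp. The function name check_kgb_layout is hypothetical, and the 12 MPI processes in the example are an assumption, since the number of processes actually used is not stated above.

```python
def check_kgb_layout(nproc, npkpt, npband, npfft, nband, bandpp=1, npspinor=1):
    """Return a list of problems found in a paral_kgb process layout.

    The two rules checked here are assumptions taken from my reading of
    the abinit documentation, not something verified against the code.
    """
    problems = []

    # Assumed rule 1: the process grid must use exactly the MPI processes given.
    product = npkpt * npband * npfft * npspinor
    if product != nproc:
        problems.append(
            f"npkpt*npband*npfft*npspinor = {product}, "
            f"but {nproc} MPI processes were requested"
        )

    # Assumed rule 2: the bands must split evenly over the band processes.
    if nband % (npband * bandpp) != 0:
        problems.append(
            f"nband = {nband} is not a multiple of "
            f"npband*bandpp = {npband * bandpp}"
        )

    return problems


if __name__ == "__main__":
    # Values from the post: 56 bands, npband 6, 2 k-points.
    # nproc=12 is an assumption (2 k-points x 6 band processes).
    for message in check_kgb_layout(nproc=12, npkpt=2, npband=6, npfft=1, nband=56):
        print(message)
```

With 56 bands and npband 6 the second check fails, because 56 is not divisible by 6. If that rule really applies here, a value such as npband 7 or 8 would be the first thing to try.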
Thanks a lot
Pierre-Yves
PS: I am wondering whether it would be possible to get the development version, abinit 7.0, to see if I get the same error. Thanks.