mpirun has exited due to process rank 4 abinit-7.6.3

Total energy, geometry optimization, DFT+U, spin....

Moderator: bguster

Chem
Posts: 17
Joined: Thu May 24, 2012 12:17 pm

Post by Chem » Sat Apr 19, 2014 11:11 pm

Dear All

I have successfully installed abinit-7.6.3 with the following configuration:
prefix="/home/ipc/bin"
enable_openmp="yes"
enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr/local/openmpi-1.6.i13"
with_fft_flavor="fftw3"
with_fft_libs="-L/opt/intel/composer_xe_2013.2.146/mkl/lib/intel64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group"
with_linalg_flavor="mkl"
with_linalg_libs="-L/opt/intel/composer_xe_2013.2.146/mkl/lib/intel64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group "
with_dft_flavor="atompaw+bigdft+libxc+wannier90"
with_trio_flavor="netcdf+etsf_io+fox"

enable_gw_dpc="yes"
enable_test_timeout="yes"

I calculated the screening W in the GW part, but during the Sigma calculation the following error appears:
kpt= ( 5.00000000E-01 5.00000000E-01 5.00000000E-01) spin= 1:
ib vxc vxcval vhartree
8 -20.69528 -20.69528 71.76299
9 -21.47060 -21.47060 77.47514
10 -21.47060 -21.47060 77.47514
11 -19.28888 -19.28888 56.38010
12 -19.28888 -19.28888 56.38010
13 -19.28888 -19.28888 56.38010
14 -22.87249 -22.87249 72.66752
15 -22.87249 -22.87249 72.66752
16 -22.87249 -22.87249 72.66752
17 -20.24221 -20.24221 84.66368
18 -20.24221 -20.24221 84.66368
19 -20.24221 -20.24221 84.66368
20 -22.75802 -22.75802 97.20710
21 -22.75802 -22.75802 97.20710
22 -4.80397 -4.80397 -35.62484
23 -10.09755 -10.09755 11.72995
24 -10.09755 -10.09755 11.72995
25 -10.09755 -10.09755 11.72995
Er%ID: 4
Er%Hscr%ID: 4
Memory needed for Er%epsm1 = 627.9 [Mb]
mkdump_Erread_screening with MPI_IO
Killed
Imaginary frequency for fit located at: 26.1527 [eV]
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 8834 on
node hpc-n265 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

The code crashes even with gwmem 0, but when I reduce the number of bands and the cutoff energy, the calculation completes normally.
BTW: number of nodes: 4, number of processes per node: 16.
Can someone help me overcome this problem?

Kind regards.
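
A rough estimate of the memory footprint may help when reading the log above. The sketch below assumes that every MPI process on a node holds its own copy of Er%epsm1 (the log does not confirm this) and multiplies the 627.9 Mb reported above by the 16 processes per node. The function name per_node_mb is only illustrative.

```shell
# Estimate the memory used per node by one array that is replicated
# on every MPI process (assumption: no distribution across processes).
per_node_mb() {
  # $1 = size per process in MB, $2 = MPI processes per node
  awk -v m="$1" -v n="$2" 'BEGIN { printf "%.1f\n", m * n }'
}

# Values taken from the post: 627.9 Mb for Er%epsm1, 16 processes per node.
per_node_mb 627.9 16   # prints 10046.4
```

Under that assumption the array alone takes roughly 10 GB per node, before wavefunctions and other arrays are counted. A bare "Killed" line is often what is printed when the kernel terminates a process that ran out of memory, so comparing this figure with the RAM available on the node (for example with free -m) could be a reasonable first check.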

gmatteo
Posts: 291
Joined: Sun Aug 16, 2009 5:40 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by gmatteo » Sat Apr 19, 2014 11:16 pm

Could you post the input file and the log file of the run?

How many MPI processes and how many OpenMP threads are you using?
Note that if the environment variable OMP_NUM_THREADS is not defined, the OpenMP runtime library
will use all the available CPUs!
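
A quick way to see which case applies is to print the variable before launching the job. This is only a minimal sketch: the function name omp_threads_report is made up for illustration, and getconf _NPROCESSORS_ONLN is assumed to be available, as it is on typical Linux systems.

```shell
# Report the thread count an OpenMP-enabled binary would pick up.
omp_threads_report() {
  if [ -z "${OMP_NUM_THREADS:-}" ]; then
    # Unset or empty: the runtime usually defaults to one thread per core.
    echo "OMP_NUM_THREADS is unset: up to $(getconf _NPROCESSORS_ONLN) threads per process"
  else
    echo "OMP_NUM_THREADS=$OMP_NUM_THREADS"
  fi
}

omp_threads_report
```

With 16 MPI processes per node and the variable unset, each process could start as many threads as there are cores, which would heavily oversubscribe the node.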

Chem
Posts: 17
Joined: Thu May 24, 2012 12:17 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by Chem » Sat Apr 19, 2014 11:38 pm

Thank you gmatteo for your prompt answer.

I am not using OpenMP threads; I just load these modules:
module load mpi/openmpi-1.6.i13
module load compilers/intel13
module load libs/mkl13
module load libs/fftw3

I cannot upload the log file, but the code stops after reading the matrix elements in the KS basis set.
Cheers.
Last edited by Chem on Wed Apr 23, 2014 11:43 am, edited 2 times in total.

gmatteo
Posts: 291
Joined: Sun Aug 16, 2009 5:40 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by gmatteo » Sat Apr 19, 2014 11:44 pm

Chem wrote: "I am not using OpenMP threads"


Actually you are because you are using

enable_openmp="yes"


in the configuration file.

Set the number of threads to 1 with:

$ export OMP_NUM_THREADS=1

Increase the stack size with:

$ ulimit -s unlimited

and try to rerun the calculation with mpirun.
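
Put together, the steps above could look like the launch wrapper below. It is a sketch under several assumptions: the process count 64 comes from the 4 nodes x 16 processes mentioned earlier, run.files is a hypothetical name for the ABINIT files file, and -x is the Open MPI option that exports an environment variable to all ranks. Limits set in the launching shell do not necessarily propagate to remote nodes, so the ulimit line may also be needed in the batch script or shell startup file.

```shell
#!/bin/sh
# Force a single OpenMP thread per MPI process.
export OMP_NUM_THREADS=1

# Raise the stack limit; warn instead of aborting if a hard limit forbids it.
ulimit -s unlimited 2>/dev/null || echo "warning: could not raise stack limit" >&2

NP="${NP:-64}"               # assumption: 4 nodes x 16 processes per node
FILES="${FILES:-run.files}"  # hypothetical name of the ABINIT files file

launch_cmd() {
  # Build the command line; -x exports OMP_NUM_THREADS to every rank.
  echo "mpirun -np $NP -x OMP_NUM_THREADS abinit"
}

# Dry run by default; set RUN=1 to actually launch the job.
if [ "${RUN:-0}" = "1" ]; then
  $(launch_cmd) < "$FILES" > log 2>&1
else
  echo "would run: $(launch_cmd) < $FILES > log"
fi
```

The dry-run default makes it easy to inspect the command line before submitting anything to the queue.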

Chem
Posts: 17
Joined: Thu May 24, 2012 12:17 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by Chem » Sun Apr 20, 2014 12:05 am

Unfortunately, I get the same error at the same step, even with:
export OMP_NUM_THREADS=1
ulimit -s unlimited

Is there another solution?

gmatteo
Posts: 291
Joined: Sun Aug 16, 2009 5:40 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by gmatteo » Sun Apr 20, 2014 12:13 am

Disable OMP:

enable_openmp="no"

and recompile the code from scratch:

make clean && make -j4

You are linking against the sequential version of MKL, so it does not make
sense to enable OpenMP only in ABINIT.
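
This kind of mismatch can be spotted by scanning the build configuration file before compiling. The sketch below is illustrative only: the function name check_openmp_mkl is invented, and the path passed to it stands for whatever configuration file was given to configure.

```shell
# Flag a build configuration that enables OpenMP in ABINIT while
# linking the sequential (non-threaded) MKL libraries.
check_openmp_mkl() {
  # $1 = path to the build configuration file (hypothetical)
  if grep -q '^enable_openmp="yes"' "$1" && grep -q 'mkl_sequential' "$1"; then
    echo "inconsistent: OpenMP enabled but MKL is sequential"
  else
    echo "ok"
  fi
}
```

Calling check_openmp_mkl on the configuration shown in the first post would report the inconsistency. Since the option is read at configure time, the change presumably only takes effect if configure is rerun before make clean && make.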

Rerun the calculation and let me know if this solves the problem.

Chem wrote: "I cannot upload the log file"

Change the extension of the file (.F90 should work, if I remember correctly).

Chem
Posts: 17
Joined: Thu May 24, 2012 12:17 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by Chem » Sun Apr 20, 2014 1:44 am

Dear gmatteo

I disabled OpenMP as you proposed and recompiled the code, but the problem persists.
My log file is attached.

Kind regards.
Attachment: log.f90 (165.35 KiB)

jbeuken
Posts: 365
Joined: Tue Aug 18, 2009 9:24 pm

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Post by jbeuken » Sun Apr 20, 2014 7:49 pm

Hi,

These two versions of the Intel 13 compiler

13.0.1.117 ( 2013.0.028 )
or
13.1.3 ( 2013.5.192 )

are known to produce a "good" binary.

Versions 13.1.0, 13.1.1 and 13.1.2 are not.
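
The list above can be turned into a small lookup for checking an installed compiler. This is just a sketch: the function name intel13_status is invented, and any version not mentioned in this post is reported as unknown rather than guessed.

```shell
# Classify an Intel 13 compiler version string using the list above.
intel13_status() {
  case "$1" in
    13.0.1*|13.1.3*)          echo "good" ;;    # reported to build a working binary
    13.1.0*|13.1.1*|13.1.2*)  echo "bad" ;;     # reported as problematic
    *)                        echo "unknown" ;; # not covered by the list
  esac
}

intel13_status 13.1.3   # prints good
intel13_status 13.1.1   # prints bad
```

The version string itself has to come from the compiler, for instance from the first line of ifort --version. The exact layout of that line is an assumption to verify on the machine in question.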

regards

my 5¢
------
Jean-Michel Beuken
Computer Scientist
