mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sat Apr 19, 2014 11:11 pm
by Chem
Dear All

I have successfully installed abinit-7.6.3 with:
prefix="/home/ipc/bin"
enable_openmp="yes"
enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr/local/openmpi-1.6.i13"
with_fft_flavor="fftw3"
with_fft_libs="-L/opt/intel/composer_xe_2013.2.146/mkl/lib/intel64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group"
with_linalg_flavor="mkl"
with_linalg_libs="-L/opt/intel/composer_xe_2013.2.146/mkl/lib/intel64 -Wl,--start-group -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -Wl,--end-group "
with_dft_flavor="atompaw+bigdft+libxc+wannier90"
with_trio_flavor="netcdf+etsf_io+fox"

enable_gw_dpc="yes"
enable_test_timeout="yes"

I calculated the screening W in the GW part, but during the Sigma calculation an error appears:
kpt= ( 5.00000000E-01 5.00000000E-01 5.00000000E-01) spin= 1:
ib vxc vxcval vhartree
8 -20.69528 -20.69528 71.76299
9 -21.47060 -21.47060 77.47514
10 -21.47060 -21.47060 77.47514
11 -19.28888 -19.28888 56.38010
12 -19.28888 -19.28888 56.38010
13 -19.28888 -19.28888 56.38010
14 -22.87249 -22.87249 72.66752
15 -22.87249 -22.87249 72.66752
16 -22.87249 -22.87249 72.66752
17 -20.24221 -20.24221 84.66368
18 -20.24221 -20.24221 84.66368
19 -20.24221 -20.24221 84.66368
20 -22.75802 -22.75802 97.20710
21 -22.75802 -22.75802 97.20710
22 -4.80397 -4.80397 -35.62484
23 -10.09755 -10.09755 11.72995
24 -10.09755 -10.09755 11.72995
25 -10.09755 -10.09755 11.72995
Er%ID: 4
Er%Hscr%ID: 4
Memory needed for Er%epsm1 = 627.9 [Mb]
mkdump_Erread_screening with MPI_IO
Killed
Imaginary frequency for fit located at: 26.1527 [eV]
--------------------------------------------------------------------------
mpirun has exited due to process rank 4 with PID 8834 on
node hpc-n265 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in
the job did. This can cause a job to hang indefinitely while it waits
for all processes to call "init". By rule, if one process calls "init",
then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize".
By rule, all processes that call "init" MUST call "finalize" prior to
exiting or it will be considered an "abnormal termination"

This may have caused other processes in the application to be
terminated by signals sent by mpirun (as reported here).

The code crashes even with gwmem 0, but when I reduce the number of bands and the cutoff energy, the calculation completes normally.
BTW: number of nodes: 4, number of procs/node: 16.
Can someone help me overcome this problem?
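
For a rough sense of scale, the log reports 627.9 Mb for Er%epsm1 and then prints "Killed", which is what a process killed by the kernel for exhausting memory typically shows (this cause is not confirmed anywhere in this thread). The sketch below assumes that every MPI process keeps its own full copy of Er%epsm1; node_epsm1_mb is a hypothetical helper name.

```shell
# node_epsm1_mb MB_PER_PROCESS PROCS_PER_NODE
# Assumption: each MPI process holds a full copy of Er%epsm1.
node_epsm1_mb() {
    awk -v mb="$1" -v n="$2" 'BEGIN { printf "%.1f\n", mb * n }'
}

# 627.9 Mb per process (from the log), 16 processes per node
node_epsm1_mb 627.9 16    # prints 10046.4
```

Under that assumption this one array would take roughly 10 GB per node, which can be compared with the memory actually installed on a node.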

Kind regards.

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sat Apr 19, 2014 11:16 pm
by gmatteo
Could you post the input file and the log file of the run?

How many MPI processes and how many OpenMP threads are you using?
Note that if the environment variable OMP_NUM_THREADS is not defined, the OpenMP runtime library
will use all the CPUs available!
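
A minimal sketch of how the thread count per node could be checked before launching. The function names are hypothetical, and it assumes OMP_NUM_THREADS, when set, holds a single integer and that an unset variable means one thread per online core.

```shell
# Threads each MPI process would start.
threads_per_process() {
    if [ -n "${OMP_NUM_THREADS:-}" ]; then
        echo "$OMP_NUM_THREADS"
    else
        # Unset: assume the runtime defaults to one thread per online core.
        getconf _NPROCESSORS_ONLN
    fi
}

# total_threads PROCS_PER_NODE: threads competing for the cores of one node.
total_threads() {
    echo $(( $1 * $(threads_per_process) ))
}
```

With 16 MPI processes on a 16-core node and the variable unset, total_threads 16 would report 256 threads, i.e. a heavily oversubscribed node.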

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sat Apr 19, 2014 11:38 pm
by Chem
Thank you gmatteo for your prompt answer.

I am not using OpenMP threads; just:
module load mpi/openmpi-1.6.i13
module load compilers/intel13
module load libs/mkl13
module load libs/fftw3

I cannot upload the log file, but the code stops after reading the Matrix elements in the KS basis set.
Cheers.

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sat Apr 19, 2014 11:44 pm
by gmatteo
Chem wrote: "I am not using OPENMP Threads;just :"

Actually you are, because you have

enable_openmp="yes"

in the configuration file.

Set the number of threads to 1 with:

$ export OMP_NUM_THREADS=1

Increase the stack size with

$ ulimit -s unlimited

and try to rerun the calculation with mpirun
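
Put together, the two settings and the launch line might look like the sketch below. The file names run.files and run.log are placeholders, 64 is 4 nodes times 16 processes, and abinit_launch_line is a hypothetical helper that only prints the command so it can be inspected first. The -x option is Open MPI's way of exporting an environment variable to the launched processes.

```shell
# Settings applied in the shell that calls mpirun.
export OMP_NUM_THREADS=1
# Raising the limit can be refused by the system; report it instead of aborting.
ulimit -s unlimited 2>/dev/null || echo "warning: stack limit unchanged" >&2

# abinit_launch_line NPROCS FILES_FILE LOG_FILE
# Prints the launch command instead of running it.
abinit_launch_line() {
    echo "mpirun -np $1 -x OMP_NUM_THREADS abinit < $2 > $3"
}

abinit_launch_line 64 run.files run.log
```

Note that a ulimit set in the submitting shell is not necessarily inherited by processes started on other nodes, so the limit may also need to be set in the batch system or in a wrapper script.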

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sun Apr 20, 2014 12:05 am
by Chem
Unfortunately, I have the same error at the same step even with:
export OMP_NUM_THREADS=1
ulimit -s unlimited

Is there another solution?

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sun Apr 20, 2014 12:13 am
by gmatteo
Disable OMP:

enable_openmp="no"

and recompile the code from scratch:

make clean && make -j4

You are linking against the sequential version of MKL, so it does not make
sense to enable OpenMP only in abinit.
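
This kind of mismatch can be spotted directly in the build configuration file. A small sketch, with check_omp_mkl as a hypothetical helper that takes the path of the configuration file as its argument:

```shell
# check_omp_mkl FILE
# Flags a configuration that enables OpenMP while linking sequential MKL.
check_omp_mkl() {
    if grep -q '^[[:space:]]*enable_openmp="yes"' "$1" &&
       grep -q 'lmkl_sequential' "$1"; then
        echo "mismatch: enable_openmp=yes with -lmkl_sequential"
        return 1
    fi
    echo "ok"
}
```

It only looks at these two settings and says nothing about whether the rest of the configuration is consistent.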

Rerun the calculation and let me know if this solves the problem.

Chem wrote: "I cannot upload the log file"

Change the extension of the file (.F90 should work, if I remember correctly).

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sun Apr 20, 2014 1:44 am
by Chem
Dear gmatteo

I disabled OpenMP as you proposed and recompiled the code, but the problem persists.
My log file is attached.

Kind regards.

Re: mpirun has exited due to process rank 4 abinit-7.6.3

Posted: Sun Apr 20, 2014 7:49 pm
by jbeuken
Hi,

These two versions of Intel 13

13.0.1.117 ( 2013.0.028 )
or
13.1.3 ( 2013.5.192 )

are known to produce a "good" binary.

13.1.0, 13.1.1 and 13.1.2 are not.
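
As a sketch, the list above can be turned into a quick check on a compiler version string. classify_ifort is a hypothetical helper, and it knows only the versions named in this post.

```shell
# classify_ifort VERSION
# Maps an Intel compiler version string to the list in this post.
classify_ifort() {
    case "$1" in
        13.0.1|13.0.1.*|13.1.3|13.1.3.*)
            echo "reported good" ;;
        13.1.0|13.1.0.*|13.1.1|13.1.1.*|13.1.2|13.1.2.*)
            echo "reported bad" ;;
        *)
            echo "not listed" ;;
    esac
}
```

The version string to pass in could be read off the output of ifort --version on the build machine.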

regards

my 5¢