[SOLVED] Parallel calcs-ground state wave functions before lin. resp.
Posted: Fri May 10, 2013 1:29 pm
by uma
I am trying to calculate the phonon frequencies of an insulator using the parallel version of Abinit 7.2.1 on 24 processors (2 nodes). For the first step, to get the ground-state wave functions, I am using the template given in tdfpt_01.in. I get the error message:
ITER STEP NUMBER 1
vtorho : nnsclo_now= 2, note that nnsclo,dbl_nnsclo,istep= 2 0 1
--------------------------------------------------------------------------
mpirun noticed that process rank 17 with PID 15235 on node node49 exited on signal 11 (Segmentation fault).
The input file that I used is below:
acell 3*3.8037 angstrom
# densty 1.2
ecut 500 eV
enunit 2
localrdwf 1
ngkpt 8 8 8
nshiftk 1
shiftk 0.0 0.0 0.0
# 0.5 0.0 0.0
# 0.0 0.5 0.0
# 0.0 0.0 0.5
natom 4
nband 20
nnsclo 2
nline 3
nstep 30
ntypat 1
occopt 1
rprim -0.5000000000000000 0.5000000000000000 0.5000000000000000
0.5000000000000000 -0.5000000000000000 0.5000000000000000
0.5000000000000000 0.5000000000000000 -0.5000000000000000
timopt 2
tnons 72*0.0d0
tolvrs 1.0d-18
typat 1 1 1 1
xred 0.1700000000000017 0.1700000000000017 0.1700000000000017
0.5000000000000000 0.0000000000000000 0.3299999999999983
0.0000000000000000 0.3299999999999983 0.5000000000000000
0.3299999999999983 0.5000000000000000 0.0000000000000000
# irdwfk 0
# istwfk 2
# getwfk 0
znucl 7.0
# This line added when defaults were changed (v5.3) to keep the previous, old behaviour
iscf 5
# add to conserve old < 6.7.2 behavior for calculating forces at each SCF step
optforces 1
paral_kgb 0
Could you please identify my error?
Uma
Re: Parallel calcs-ground state wave functions before lin. resp.
Posted: Fri May 10, 2013 6:28 pm
by gabriel.antonius
I managed to run your input file without problems on two processors.
Could you give more details about your compilation options? Do you use mpi-io?
Note that the variables
localrdwf 1
paral_kgb 0
do depend on the type of parallelism that is allowed.
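For reference, with paral_kgb 1 the distribution over k-points, bands and FFTs is set explicitly through npkpt, npband and npfft, whose product must equal the number of MPI processes. A sketch for your 24 processes (illustrative values only, not tuned for your system) could be:
paral_kgb 1
npkpt 6    # 6*4*1 = 24 MPI processes
npband 4   # nband (20) must be a multiple of npband
npfft 1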
Re: Parallel calcs-ground state wave functions before lin. resp.
Posted: Mon May 13, 2013 10:34 am
by uma
Dear Gabriel,
Thank you for offering to help. I have given the details as they appear in the output file produced by this parallel build. Please let me know where I went wrong. Many thanks in advance.
=== Build Information ===
Version : 7.2.1
Build target : x86_64_linux_intel12.0
Build date : 20130426
=== Compiler Suite ===
C compiler : intel12.0
CFLAGS : -g -O2 -vec-report0
C++ compiler : intel12.0
CXXFLAGS : -g -O2 -vec-report0
Fortran compiler : intel12.0
FCFLAGS : -g -extend-source -vec-report0 -noaltparam -nofpscomp -openmp
FC_LDFLAGS : -static-intel -static-libgcc
=== Optimizations ===
Debug level : basic
Optimization level : standard
Architecture : intel_xeon
=== MPI ===
Parallel build : yes
Parallel I/O : no
Time tracing : no
GPU support : no
=== Connectors / Fallbacks ===
Connectors on : yes
Fallbacks on : yes
DFT flavor : libxc-fallback+bigdft-fallback+wannier90-fallback
FFT flavor : none
LINALG flavor : netlib
MATH flavor : none
TIMER flavor : abinit
TRIO flavor : netcdf-fallback+etsf_io-fallback
=== Experimental features ===
Bindings : no
Exports : no
GW double-precision : no
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Default optimizations:
-O2 -xHost
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
CPP options activated during the build:
CC_INTEL CXX_INTEL FC_INTEL
HAVE_DFT_BIGDFT HAVE_DFT_LIBXC HAVE_DFT_WANNIER90
HAVE_FC_ALLOCATABLE_DT... HAVE_FC_CONTIGUOUS HAVE_FC_CPUTIME
HAVE_FC_ETIME HAVE_FC_EXIT HAVE_FC_FLUSH
HAVE_FC_GAMMA HAVE_FC_GETENV HAVE_FC_GETPID
HAVE_FC_IOMSG HAVE_FC_ISO_C_BINDING HAVE_FC_NULL
HAVE_FC_STREAM_IO HAVE_LINALG HAVE_LINALG_SERIAL
HAVE_MPI HAVE_MPI2 HAVE_MPI_TYPE_CREATE_S...
HAVE_NUMPY HAVE_OMP_COLLAPSE HAVE_OS_LINUX
HAVE_TIMER HAVE_TIMER_ABINIT HAVE_TIMER_MPI
HAVE_TIMER_SERIAL HAVE_TRIO_ETSF_IO HAVE_TRIO_NETCDF
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Re: Parallel calcs-ground state wave functions before lin. resp.
Posted: Mon May 13, 2013 2:23 pm
by gabriel.antonius
Hi,
I cannot reproduce the bug. It could be machine-specific (or something as simple as not loading the right modules...).
Here is the procedure I recommend for tracking the bug:
1) Reduce the calculation parameters (for example, try "ecut 5.0", "nstep 1", "ngkpt 2 2 2").
2) Reduce the number of processors, to find the smallest calculation that reproduces the bug on the fewest processors.
3) Instead of using paral_kgb, use npkpt, npfft, npband.
Setting these three to "1" will make a serial run; you can then increase them to control the parallelism (see the sketch below).
Let me know if the bug still shows up.
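Concretely, a reduced test input combining 1) and 3) would keep the rest of your file unchanged and set only (a sketch using the values above):
ecut 5.0
nstep 1
ngkpt 2 2 2
npkpt 1
npfft 1
npband 1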
Re: Parallel calcs-ground state wave functions before lin. resp.
Posted: Mon May 13, 2013 4:22 pm
by uma
Dear Gabriel,
I ran the code with the following input file. I used 1 node (12 processors).
acell 3*3.8037 angstrom
# densty 1.2
ecut 500 eV
enunit 2
localrdwf 1
ngkpt 2 2 2
nshiftk 1
shiftk 0.0 0.0 0.0
natom 4
nband 20
nnsclo 2
nline 3
nstep 30
ntypat 1
occopt 1
rprim -0.5000000000000000 0.5000000000000000 0.5000000000000000
0.5000000000000000 -0.5000000000000000 0.5000000000000000
0.5000000000000000 0.5000000000000000 -0.5000000000000000
timopt 2
# tnons 72*0.0d0
tolvrs 1.0d-18
typat 1 1 1 1
xred 0.1700000000000017 0.1700000000000017 0.1700000000000017
0.5000000000000000 0.0000000000000000 0.3299999999999983
0.0000000000000000 0.3299999999999983 0.5000000000000000
0.3299999999999983 0.5000000000000000 0.0000000000000000
znucl 7.0
# This line added when defaults were changed (v5.3) to keep the previous, old behaviour
iscf 5
# add to conserve old < 6.7.2 behavior for calculating forces at each SCF step
optforces 1
# paral_kgb 0
diemac 2.0
npkpt 1
npfft 1
npband 1
I got the error message:
-P-0000 --- cgwf is called for band 1 for 3 lines
-P-0000 --- cgwf is called for band 2 for 3 lines
-P-0000 --- cgwf is called for band 3 for 3 lines
-P-0000 --- cgwf is called for band 4 for 3 lines
-P-0000 --- cgwf is called for band 5 for 3 lines
--------------------------------------------------------------------------
mpirun noticed that process rank 1 with PID 15673 on node node60 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
-P-0000 res: 1.44E-01 1.71E-01 1.30E-01 1.02E-01 3.22E-02
-P-0000 ene: 2.16E-02 5.26E-01 7.55E-01 8.18E-01 9.18E-01
-P-0000 --- cgwf is called for band 1 for 3 lines
Re: Parallel calcs-ground state wave functions before lin. resp.
Posted: Tue May 14, 2013 5:19 pm
by uma
Dear Gabriel,
Many thanks. The problem was solved when I recompiled with mpi-io='yes'.
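For anyone finding this thread later: I reconfigured with parallel I/O enabled, roughly as follows (a sketch; the exact option spelling may differ between Abinit versions and build setups):
./configure --enable-mpi="yes" --enable-mpi-io="yes" FC=mpif90
make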
Uma