Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Tue Jul 12, 2016 8:06 am
by Esha
Hi. I have followed this video to install Abinit-8.0.7: http://www.youtube.com/watch?v=DppLQ-KQA68

My sponce.ac file is as follows:

enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr"
with_trio_flavor="netcdf+etsf_io"
with_netcdf_incs="-I/usr/include"
with_netcdf_libs="-L/usr/lib -lnetcdf -lnetcdff"
with_fft_flavor="fftw3"
with_fft_incs="-I/usr/include/"
with_fft_libs="-L/usr/lib/x86_64-linux-gnu/ -lfftw3 -lfftw3f"
with_linalg_flavor="atlas"
with_linalg_libs="-L/usr/lib -llapack -lf77blas -lcblas -latlas"
with_dft_flavor="atompaw+libxc"
#with_dft_flavor="atompaw+bigdft+libxc+wannier90"
enable_gw_dpc="yes"
with_mpi_level="2"
FC="/usr/bin/mpif90"
CC="/usr/bin/mpicc"
CXX="/usr/bin/mpic++"

I then configured Abinit from inside the build folder with the command
../configure --with-config-file="./sponce.ac"

It ran successfully. I then built Abinit with the command
make multi multi_nprocs=8

and then installed it with
sudo make install

Then I submitted the job with the command
mpirun -np 8 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log

The job doesn't run at all. It fails with signal 7 (bus error).

I tried again with the minimum number of cores:
mpirun -np 2 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log

It ran for a short while and then failed with the same error.

I rebuilt and reinstalled it with the commands
make multi multi_nprocs=10
sudo make install

Now it runs with the command
mpirun -np 4 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log
but it takes too long. It does not seem to be running on the cores in parallel.

In the log file I noticed one warning:

--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
The netcdf library does not support parallel IO, see message above
Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
Action: install a netcdf4+HDF5 library with MPI-IO support.

Is this the reason, or is it something else?
How can I resolve the issue? Any help will be appreciated.
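
One way to check whether the warning matches the installed library is to ask netCDF itself. The following is only a minimal sketch: it assumes the nc-config helper shipped with netCDF-C is on the PATH and understands --has-parallel (older releases may not), and NC_CONFIG and the function name are hypothetical names chosen for this illustration.

```shell
#!/bin/sh
# Ask nc-config whether the netCDF C library was built with parallel I/O.
# NC_CONFIG is a hypothetical override so a non-default nc-config can be used.
netcdf_parallel_status() {
    tool="${NC_CONFIG:-nc-config}"
    if ! command -v "$tool" >/dev/null 2>&1; then
        echo "unknown: $tool not found"
        return 2
    fi
    # Assumption: this nc-config prints "yes" or "no" for --has-parallel.
    answer=$("$tool" --has-parallel 2>/dev/null)
    case "$answer" in
        yes) echo "parallel"; return 0 ;;
        no)  echo "serial";   return 1 ;;
        *)   echo "unknown: unexpected answer '$answer'"; return 2 ;;
    esac
}
```

Calling netcdf_parallel_status and getting "serial" would be consistent with the warning above. It would not by itself explain the bus error.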

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Tue Jul 19, 2016 9:04 am
by Jordan
Hi,

The resulting executable should not depend on the number of cores you use to compile Abinit: make, make mj4, or any other variant should produce the same binary.

The last warning seems to be related to the fact that you link with netcdf, but maybe not with the HDF5-based version of netcdf. (The tutorial you followed is a little bit old, but never mind.)

Can you at least provide the full error message instead of just "signal 7 bus error"? Can you run any other job with MPI? Try to be more specific.
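
A quick way to try "any other job" is to launch a trivial command under the same launcher. This is a rough sketch, assuming the launcher accepts -np as in the commands above. MPIRUN and the function name are hypothetical names used only here.

```shell
#!/bin/sh
# Launch a trivial non-Abinit command under the MPI launcher and check
# that one copy per requested process actually ran.
# MPIRUN is a hypothetical override for the launcher (default: mpirun).
mpi_smoke_test() {
    nprocs="${1:-2}"
    launcher="${MPIRUN:-mpirun}"
    out=$("$launcher" -np "$nprocs" uname -n) || {
        echo "launcher failed"
        return 1
    }
    # Count the non-empty output lines: one host name per started process.
    count=$(printf '%s\n' "$out" | grep -c . || true)
    if [ "$count" -eq "$nprocs" ]; then
        echo "ok: $count processes started"
    else
        echo "mismatch: asked for $nprocs, got $count"
        return 1
    fi
}
```

For example, mpi_smoke_test 8 mirrors the -np 8 run. Note that this only exercises the launcher, not MPI initialization inside a compiled program, so a pass here does not rule out a broken MPI library.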

Cheers

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Wed Jul 20, 2016 8:28 am
by Esha
Hi,

Thanks for your response. The complete error message in the log file is:

Program received signal SIGBUS: Access to an undefined portion of a memory object.

Backtrace for this error:
#0 0x7FCE9356B777
#1 0x7FCE9356BD7E
#2 0x7FCE92A89CAF
#3 0x7FCE8458233A
#4 0x7FCE916584E8
#5 0x7FCE91658807
#6 0x7FCE8457FB53
#7 0x7FCE84FCD75C
#8 0x7FCE853D8F1A
#9 0x7FCE9166FB54
#10 0x7FCE9168665F
#11 0x7FCE938930F7
#12 0x12B33CA in __m_xmpi_MOD_xmpi_init at m_xmpi.F90:601
#13 0x40F9BE in abinit at abinit.F90:215
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13645 on node hachemi exited on signal 7 (Bus error).
--------------------------------------------------------------------------

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Tue Aug 23, 2016 1:00 pm
by pouillon
This is a problem with your MPI installation, not with Abinit. Please consult their documentation / forums / mailing lists and/or re-install MPI.
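
One thing that is cheap to check before re-installing is whether the compiler wrapper used at build time and the launcher used at run time sit in the same directory, since mixing two MPI installations is a common source of trouble. The sketch below is only a heuristic under that assumption. The function name is hypothetical, and the defaults mpif90 and mpirun are taken from the commands earlier in this thread.

```shell
#!/bin/sh
# Compare the directories of the MPI compiler wrapper and the launcher.
# Heuristic only: the same directory suggests, but does not prove, that
# both belong to one MPI installation.
same_mpi_bindir() {
    wrapper=$(command -v "${1:-mpif90}") || { echo "wrapper not found"; return 2; }
    launcher=$(command -v "${2:-mpirun}") || { echo "launcher not found"; return 2; }
    wdir=$(dirname "$wrapper")
    ldir=$(dirname "$launcher")
    if [ "$wdir" = "$ldir" ]; then
        echo "same: $wdir"
    else
        echo "different: $wdir vs $ldir"
        return 1
    fi
}
```

Running same_mpi_bindir /usr/bin/mpif90 mpirun would compare the wrapper named in the configuration file with whichever mpirun comes first on the PATH.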

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Mon Oct 03, 2016 2:39 pm
by marco.digennaro
Hi guys,
I actually have the same problem. Even though parallelism is there (the CPU time differs between MPI runs), I get these two warning messages:

--- !WARNING
src_file: m_nctk.F90
src_line: 526
message: |
    Strange, netcdf seems to support MPI-IO but: NetCDF: Not a valid ID
...

--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
    The netcdf library does not support parallel IO, see message above
    Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
    Action: install a netcdf4+HDF5 library with MPI-IO support.
...

I re-installed netcdf and hdf5 within anaconda, and re-installed abinit8 right after, but the problem is still there.
I also tried to modify the line `` with-trio-flavor="netcdf+etsf_io" `` in the .ac file and noticed that in the end the configuration does not care if you set netcdf, or netcdf+whatever, since TRIO flavor is set to None. You have to type it by hand after configure to get it correctly.
This looks a bit suspicious to me.

cheers

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Tue Oct 04, 2016 9:35 am
by jbeuken
Hi Marco,

I also tried to modify the line `` with-trio-flavor="netcdf+etsf_io" `` in the .ac file and noticed that in the end the configuration does not care if you set netcdf, or netcdf+whatever, since TRIO flavor is set to None. You have to type it by hand after configure to get it correctly.


I don't know if it was just a typo in your post, but the option is

with_trio_flavor, not with-trio-flavor
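
Because the .ac file uses shell-style NAME="value" assignments, and shell variable names cannot contain hyphens, a hyphenated option name is easy to catch mechanically. Here is a small sketch of such a check. The function name is hypothetical, and it assumes one assignment per line as in the file posted at the top of this thread.

```shell
#!/bin/sh
# Print (with line numbers) the lines of a config file whose option name,
# i.e. the part before "=", contains a hyphen. Commented-out lines are skipped.
# Returns 1 if any such line is found, 0 otherwise.
find_hyphenated_options() {
    file="$1"
    if grep -n -E '^[[:space:]]*[A-Za-z0-9_]+-[A-Za-z0-9_-]*=' "$file"; then
        return 1
    fi
    return 0
}
```

For example, find_hyphenated_options sponce.ac prints nothing for the file shown earlier, while a line such as with-trio-flavor="netcdf+etsf_io" would be reported.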

jmb

Re: Issue Regarding Parallel Installation of Abinit 8.0.7

Posted: Tue Oct 04, 2016 3:23 pm
by marco.digennaro
Thanks Jean Michel,

that is absolutely right. But the warning regarding netcdf is still there.

BR