Issue Regarding Parallel Installation of Abinit 8.0.7
Moderators: fgoudreault, mcote
Forum rules
Please have a look at ~abinit/doc/config/build-config.ac in the source package for detailed and up-to-date information about the configuration of Abinit 8 builds.
For a video explanation on how to build Abinit 7.x for Linux, please go to: http://www.youtube.com/watch?v=DppLQ-KQA68.
IMPORTANT: when an answer solves your problem, please check the little green V-like button on its upper-right corner to accept it.
Issue Regarding Parallel Installation of Abinit 8.0.7
Hi. I have followed this video to install Abinit-8.0.7: http://www.youtube.com/watch?v=DppLQ-KQA68
My sponce.ac file is as follows:
enable_mpi="yes"
enable_mpi_io="yes"
with_mpi_prefix="/usr"
with_trio_flavor="netcdf+etsf_io"
with_netcdf_incs="-I/usr/include"
with_netcdf_libs="-L/usr/lib -lnetcdf -lnetcdff"
with_fft_flavor="fftw3"
with_fft_incs="-I/usr/include/"
with_fft_libs="-L/usr/lib/x86_64-linux-gnu/ -lfftw3 -lfftw3f"
with_linalg_flavor="atlas"
with_linalg_libs="-L/usr/lib -llapack -lf77blas -lcblas -latlas"
with_dft_flavor="atompaw+libxc"
#with_dft_flavor="atompaw+bigdft+libxc+wannier90"
enable_gw_dpc="yes"
with_mpi_level="2"
FC="/usr/bin/mpif90"
CC="/usr/bin/mpicc"
CXX="/usr/bin/mpic++"
I then configured Abinit from inside the build folder with the command:
../configure --with-config-file="./sponce.ac"
It ran successfully. I then built Abinit with the command:
make multi multi_nprocs=8
then
sudo make install
Then I submitted the job with the command:
mpirun -np 8 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log
The job doesn't run at all; it gave me a signal 7 (bus error).
I tried again with a minimal number of cores:
mpirun -np 2 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log
It ran for a little while and then failed with the same error.
I built it again with the commands:
make multi multi_nprocs=10
sudo make install
Now it runs with the command:
mpirun -np 4 /usr/local/bin/abinit < BaO-trf2-1.files >& RUN.log
but it takes too long. It does not seem to be running on parallel cores.
In the log file I noticed one warning:
--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
The netcdf library does not support parallel IO, see message above
Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
Action: install a netcdf4+HDF5 library with MPI-IO support.
Is this the reason, or is it something else?
How can I resolve the issue? Any help will be appreciated.
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
Hi,
The resulting executable should not depend on the number of cores you use to compile Abinit: make, make mj4, or any similar variant should produce the same executable.
The last warning you get seems to be related to the fact that you link with netcdf, but maybe not with the HDF5 version of netcdf. (The tutorial you followed is a little bit old, but never mind.)
Can you at least provide the full error message instead of just "signal 7 bus error"? Can you run any other job with MPI? Try to be more specific.
Cheers
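One possible way to check how the installed netcdf was built, and which libraries the Abinit binary actually resolves to, is sketched below. This is only a sketch under assumptions: it relies on the nc-config utility that ships with netCDF-C being on the PATH, the --has-* flags may differ between netCDF versions, netcdf_feature is a hypothetical helper name, and the binary path is the one from the post above.

```shell
# Hypothetical helper: ask nc-config how the netCDF-C library was built.
# Prints "yes"/"no" as reported by nc-config, or "unknown" if the flag
# is not supported by the installed version.
netcdf_feature() {
  if command -v nc-config >/dev/null 2>&1; then
    printf '%s: %s\n' "$1" "$(nc-config "--has-$1" 2>/dev/null || echo unknown)"
  else
    printf '%s: nc-config not found\n' "$1"
  fi
}

# netCDF-4 format, HDF5 backend, and parallel I/O support.
for feature in nc4 hdf5 parallel; do
  netcdf_feature "$feature"
done

# List the MPI/netCDF/HDF5 shared libraries the installed binary links to.
abinit_bin=/usr/local/bin/abinit   # path taken from the post above
if [ -x "$abinit_bin" ] && command -v ldd >/dev/null 2>&1; then
  ldd "$abinit_bin" | grep -iE 'mpi|netcdf|hdf5' || echo "no matching shared libraries listed"
fi
```

If "parallel" is reported as "no", the warning from m_nctk.F90 would be expected with that netcdf build.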
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
Hi,
Thanks for your response. The complete error message in the log file is:
Program received signal SIGBUS: Access to an undefined portion of a memory object.
Backtrace for this error:
#0 0x7FCE9356B777
#1 0x7FCE9356BD7E
#2 0x7FCE92A89CAF
#3 0x7FCE8458233A
#4 0x7FCE916584E8
#5 0x7FCE91658807
#6 0x7FCE8457FB53
#7 0x7FCE84FCD75C
#8 0x7FCE853D8F1A
#9 0x7FCE9166FB54
#10 0x7FCE9168665F
#11 0x7FCE938930F7
#12 0x12B33CA in __m_xmpi_MOD_xmpi_init at m_xmpi.F90:601
#13 0x40F9BE in abinit at abinit.F90:215
--------------------------------------------------------------------------
mpirun noticed that process rank 0 with PID 13645 on node hachemi exited on signal 7 (Bus error).
--------------------------------------------------------------------------
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
This is a problem with your MPI installation, not with Abinit. Please consult their documentation / forums / mailing lists and/or re-install MPI.
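A rough way to test the MPI installation on its own, without Abinit, is sketched below. It assumes the wrapper and launcher names used earlier in this thread (mpirun, mpif90, mpicc); report_tool is a hypothetical helper name. One common cause of crashes at MPI start-up is a launcher and compiler wrappers that come from different MPI installations, so it helps to see where each one resolves.

```shell
# Hypothetical helper: print where a tool resolves on the PATH,
# or note that it is missing.
report_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    printf '%s -> %s\n' "$1" "$(command -v "$1")"
  else
    printf '%s -> not found\n' "$1"
  fi
}

# The launcher and the compiler wrappers should normally come from
# the same MPI installation.
for tool in mpirun mpif90 mpicc; do
  report_tool "$tool"
done

# Smoke test that does not involve Abinit at all: launch two copies
# of a trivial program.
if command -v mpirun >/dev/null 2>&1; then
  mpirun -np 2 hostname || echo "mpirun could not launch a trivial job"
fi
```

If the trivial job already fails, the problem is clearly on the MPI side.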
Yann Pouillon
Simune Atomistics
Donostia-San Sebastián, Spain
- Posts: 13
- Joined: Thu Jun 16, 2016 8:47 am
- Location: Bruxelles, Be
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
Hi guys,
I actually have the same problem. Even though parallelism is there (the CPU time differs between MPI runs), I get these two warning messages:
--- !WARNING
src_file: m_nctk.F90
src_line: 526
message: |
Strange, netcdf seems to support MPI-IO but: NetCDF: Not a valid ID
...
--- !WARNING
src_file: m_nctk.F90
src_line: 539
message: |
The netcdf library does not support parallel IO, see message above
Abinit won't be able to produce files in parallel e.g. when paral_kgb==1 is used.
Action: install a netcdf4+HDF5 library with MPI-IO support.
...
I re-installed netcdf and hdf5 within Anaconda, and re-installed Abinit 8 right after, but the problem is still there.
I also tried to modify the line `` with-trio-flavor="netcdf+etsf_io" `` in the .ac file and noticed that in the end configure does not care whether you set netcdf or netcdf+whatever, since the TRIO flavor is set to None. You have to type it by hand after configure to get it right.
This looks a bit suspicious to me.
cheers
Marco Di Gennaro
Toyota Motor Europe (Be)
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
Hi Marco,
I don't know if it was a typo when you wrote your post, but it's
with_trio_flavor, not with-trio-flavor
jmb
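As a quick check for this kind of slip, something like the following could list option lines that use hyphens instead of underscores. It is only a sketch: find_hyphenated_options is a hypothetical helper, and sponce.ac is the file name from the first post.

```shell
# Hypothetical helper: list lines of a .ac config file whose option name
# starts with "with-" or "enable-" (hyphens) instead of "with_" / "enable_".
find_hyphenated_options() {
  # -n prints line numbers. Commented-out lines are skipped because the
  # pattern only allows whitespace before the option name.
  grep -nE '^[[:space:]]*(with|enable)-[A-Za-z0-9_-]*=' "$1"
}

# Usage: print the offending lines, or a short message if there are none.
if [ -f sponce.ac ]; then
  find_hyphenated_options sponce.ac || echo "no hyphenated option names found"
fi
```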
------
Jean-Michel Beuken
Computer Scientist
Re: Issue Regarding Parallel Installation of Abinit 8.0.7
Thanks Jean-Michel,
that is absolutely right. But the warning regarding netcdf is still there.
BR
Marco Di Gennaro
Toyota Motor Europe (Be)