Hello everybody,
I am trying to compile the git version of Abinit 9.0.4 on the Beluga cluster.
I am able to configure, make and install, and I can run abinit --version or abinit --build. However, whenever I try to start a simulation, I get a segmentation fault inside the linalg module, even when using 1 proc. I've added the stack trace at the end of the post. The configure output, the ac9 file and the config log are attached.
I've tried to configure directly in the shell and from an interactive session. This is the command I use: ../configure --with-mpi -enable-openmp --with-config-file=olivier.ac9 --prefix="/path/to/Installation/folder/"
Inside the log, I'm getting 2 errors in the linalg section :
1. I don't have Elpa.
2. I don't have <lapacke.h> while trying to use LAPACKE C API support.
Does somebody have an idea where the error could be coming from and how to fix it?
Thank you,
Olivier
Note : I've removed mpi-io to help pinpoint the error.
==== backtrace ====
0 0x0000000000010e90 __funlockfile() ???:0
1 0x0000000000097201 PMPI_Comm_size() ???:0
2 0x0000000000029de9 MKLMPI_Comm_size() ???:0
3 0x0000000000027fb1 mkl_blacs_init() ???:0
4 0x0000000000027ef8 Cblacs_pinfo() ???:0
5 0x00000000000187f9 blacs_gridmap_() ???:0
6 0x00000000000181ce blacs_gridinit_() ???:0
7 0x00000000025bc394 m_slk_mp_init_scalapack_() ???:0
8 0x000000000252a26b m_abi_linalg_mp_abi_linalg_init_() ???:0
9 0x000000000041bda7 m_driver_mp_driver_() ???:0
10 0x000000000040b687 MAIN__() ???:0
11 0x000000000040a0fe main() ???:0
12 0x00000000000202e0 __libc_start_main() ???:0
13 0x000000000040a01a _start() /tmp/nix-build-glibc-2.24.drv-0/glibc-2.24/csu/../sysdeps/x86_64/start.S:120
===================
Abinit 9.0.4, linalg segfault on cluster [SOLVED]
Moderators: fgoudreault, mcote
Forum rules
Please have a look at ~abinit/doc/config/build-config.ac in the source package for detailed and up-to-date information about the configuration of Abinit 8 builds.
For a video explanation on how to build Abinit 7.x for Linux, please go to: http://www.youtube.com/watch?v=DppLQ-KQA68.
IMPORTANT: when an answer solves your problem, please check the little green V-like button on its upper-right corner to accept it.
Attachments:
- config.log: Config Log (377.32 KiB)
- ac9.log: olivier.ac9 (43.73 KiB)
- output.log: The output of configure (51.47 KiB)
Re: Abinit 9.0.4, linalg segfault on cluster [SOLVED]
Hi,
your ac9 file, without the comments:
Code:
CC="mpicc"
CFLAGS="-O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -mkl=cluster"
CXX="mpic++"
CXXFLAGS="-O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -mkl=cluster"
FC="mpif90"
FCFLAGS="-O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -mkl=cluster"
with_mpi="yes"
with_mpi_flavor="auto"
enable_mpi_inplace="yes"
enable_mpi_io="no"
with_linalg="/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222"
LINALG_LIBS="-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222/mkl/lib -llapack -lblas -lscalapack"
with_libxc="/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx2/Compiler/intel2018.3/libxc/4.3.4"
with_hdf5="/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/intel2018.3/hdf5/1.10.3"
H5CC="/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/intel2018.3/hdf5/1.10.3/bin/h5cc"
with_netcdf="/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/intel2018.3/netcdf/4.6.1"
with_netcdf_fortran="/cvmfs/soft.computecanada.ca/easybuild/software/2017/avx512/Compiler/intel2018.3/netcdf-fortran/4.4.4"
and the tail of output.log:
Code:
Core build parameters
---------------------
* C compiler : intel version 18.0
* Fortran compiler : intel version 18.0
* architecture : intel xeon (64 bits)
* debugging : basic
* optimizations : standard
* OpenMP enabled : yes (collapse: yes)
* MPI enabled : yes (flavor: auto)
* MPI in-place : yes
* MPI-IO enabled : no
* GPU enabled : no (flavor: none)
* LibXML2 enabled : no
* HDF5 enabled : yes (MPI support: no)
* NetCDF enabled : yes (MPI support: no)
* NetCDF-F enabled : yes (MPI support: no)
* FFT flavor : dfti (libs: auto-detected)
* LINALG flavor : mkl (libs: auto-detected)
I'm not familiar with a linalg configuration like this, and I don't know the "-mkl=cluster" option.
But I think I see a problem...
At the end of the output, we see
Code:
LINALG flavor : mkl (libs: auto-detected)
However, you have configured LINALG in the ac9 file:
Code:
with_linalg="/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222"
LINALG_LIBS="-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222/mkl/lib -llapack -lblas -lscalapack"
If this information had been correct, we should have had this output:
Code:
* LINALG flavor : mkl (libs: user-defined)
I think the path is wrong.
If you execute this command:
Code:
ls /cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222/mkl/lib
do you see the libraries (liblapack, ...)?
On "my" cluster, the path ends with: mkl/lib/intel64
jmb
------
Jean-Michel Beuken
Computer Scientist
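Editor's note: the manual ls check suggested in this reply can be scripted so that every -L directory in a linker flag string is tested at once. The sketch below is only an illustration: check_lib_dirs is a hypothetical helper, not part of Abinit or its build system, and it assumes that paths contain no spaces. It reports a directory as usable only if library files (lib*.so* or lib*.a) sit directly in it, which is the situation that distinguishes mkl/lib from mkl/lib/intel64 here.

```shell
#!/bin/sh
# check_lib_dirs (hypothetical helper, not part of Abinit):
# for each -L<dir> flag in the string passed as $1, report whether <dir>
# directly contains any lib*.so* or lib*.a file.
# Returns 0 if every -L directory contains libraries, 1 otherwise.
# Assumption: directory names contain no whitespace.
check_lib_dirs() {
    rc=0
    for flag in $1; do
        case "$flag" in
            -L*)
                dir=${flag#-L}
                found=no
                # An unmatched glob stays literal, so the -f test fails safely.
                for f in "$dir"/lib*.so* "$dir"/lib*.a; do
                    if [ -f "$f" ]; then
                        found=yes
                        break
                    fi
                done
                if [ "$found" = yes ]; then
                    echo "ok: $dir"
                else
                    echo "no libraries: $dir"
                    rc=1
                fi
                ;;
        esac
    done
    return "$rc"
}

# Example call (path taken from the ac9 file quoted in this thread):
# check_lib_dirs "-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222/mkl/lib -llapack -lblas -lscalapack"
```

A "no libraries" line for a directory is a hint that the path stops one level too early. As noted in the reply, a correctly picked-up setting should also show up in the configure summary as "LINALG flavor : mkl (libs: user-defined)".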
Re: Abinit 9.0.4, linalg segfault on cluster
Thanks for the quick reply.
You are right, it looks like I was just missing intel64 at the end of my library path.
Everything seems to work fine now.
Best regards,
Olivier
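Editor's note: the corrected ac9 entry was not posted in the thread. Assuming the only change was appending intel64 to the library path, as described above, the LINALG_LIBS line would presumably look like the sketch below. Whether with_linalg or any other setting was also changed is not stated, so treat this as a reconstruction rather than the poster's actual file.

```shell
# Assumed corrected entry for olivier.ac9 (reconstructed, not posted in the thread):
# the -L path now ends in mkl/lib/intel64 instead of mkl/lib.
LINALG_LIBS="-L/cvmfs/soft.computecanada.ca/easybuild/software/2017/Core/imkl/2018.3.222/mkl/lib/intel64 -llapack -lblas -lscalapack"
```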