I have compiled ABINIT 6.12.3 without errors, and it runs properly when I use only one compute node of our cluster (each compute node has 2 hexacore processors). The problem arises when I try to use more than one compute node: it crashes at the very beginning.
The attached files contain all the information I could gather:
- File 'config.log': log file from the configuration of ABINIT. In brief: I used Intel MPI 4.0.3, Intel Compilers XE2013, Intel MKL BLACS and Intel MKL FFTW3.
- File 'simul.log': log file of the simulation that crashes. I passed several options to mpirun (-v -check_mpi -genv I_MPI_DEBUG 5) to get information about the MPI calls, since the problem seems to be there.
- File 'abinit.in': the input file of the simulation I am trying to run, in case it is relevant. In brief, I want to generate the WFK file needed by a subsequent run that generates a KSS file. This input file works fine if I use only one compute node, so I don't think the problem is here.
It seems from the log that the errors are related to MPI since there are messages such as:
[23] ERROR: LOCAL:MPI:CALL_FAILED: error
[23] ERROR: Null communicator.
[23] ERROR: Error occurred at:
[23] ERROR: mpi_comm_rank_(comm=MPI_COMM_NULL, *rank=0x29319b8, *ierr=0x7fff83fabb74)
[23] ERROR: initmpi_grid_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/51_manage_mpi/initmpi_grid.F90:178)
[23] ERROR: invars1_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1.F90:1015)
[23] ERROR: invars1m_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/invars1m.F90:186)
[23] ERROR: m_ab6_invars_mp_ab6_invars_load_ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/57_iovars/m_ab6_invars_f90.F90:548)
[23] ERROR: MAIN__ (/home/ivasan/programas/abinit/abinit-6.12.3b/src/98_main/abinit.F90:260)
[23] ERROR: main (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
[23] ERROR: (/lib64/libc-2.5.so)
[23] ERROR: (/home/ivasan/programas/abinit/abinit-6.12.3b/bin/abinit)
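In case it helps anyone going through the full 'simul.log', here is a minimal sketch of how the trace could be condensed to see which ranks fail and in which Fortran source lines. It assumes every frame follows the `[rank] ERROR: function (path.F90:line)` shape shown above; the function name `frames_by_rank` and the regular expression are my own illustration, not part of ABINIT or Intel MPI.

```python
import re
from collections import defaultdict

# Assumed line shape, taken from the trace above:
#   [23] ERROR: initmpi_grid_ (/path/to/initmpi_grid.F90:178)
# Lines without a ".F90:<line>" location (the MPI call itself, libc, the
# binary) are deliberately skipped.
FRAME_RE = re.compile(r"^\[(\d+)\] ERROR:\s+(\w+)\s+\((.+\.F90):(\d+)\)\s*$")


def frames_by_rank(lines):
    """Group Fortran source frames by MPI rank, in order of appearance."""
    frames = defaultdict(list)
    for line in lines:
        match = FRAME_RE.match(line.strip())
        if match:
            rank, func, path, lineno = match.groups()
            # Keep only the file name; the directory part is site-specific.
            source_file = path.rsplit("/", 1)[-1]
            frames[int(rank)].append((func, source_file, int(lineno)))
    return dict(frames)
```

Calling `frames_by_rank(open("simul.log"))` would then give one list of frames per rank, which makes it easier to tell whether every rank fails in the same place or only some of them do.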
I would appreciate it if anyone could give me a hint about what I can check or modify to solve this problem.
Thank you very much in advance for your answers and your time.
Kind regards,
Iván