[Solved] Parallelism of multi-datasets

Posted: Fri Jun 22, 2012 11:11 pm
by ljludwig
Hello All:

I have a peculiar problem: whenever I use multi-dataset mode (ndtset > 1) with parallel abinit (launched via mpirun), the log file reports the following error:

MPI_ABORT was invoked on rank 0 in communicator MPI_COMM_WORLD with errorcode 14.

NOTE: invoking MPI_ABORT causes Open MPI to kill all MPI processes. You may or may not see output from other processes, depending on exactly when Open MPI kills them.

On the other hand:
1) The .in file and .files file are fine, since I can run them in serial with multiple datasets.
2) The MPI installation seems fine too, since the same job runs under mpirun when multi-dataset mode is not used.


The log file gives more information:
MPI_ERROR_STRING: MPI_ERR_UNKNOWN: unknown error
--------------------------------------------------------------------------
mpirun has exited due to process rank 0 with PID 24963 on
node node7 exiting improperly. There are two reasons this could occur:

1. this process did not call "init" before exiting, but others in the job did. This can cause a job to hang indefinitely while it waits for all processes to call "init". By rule, if one process calls "init", then ALL processes must call "init" prior to termination.

2. this process called "init", but exited without calling "finalize". By rule, all processes that call "init" MUST call "finalize" prior to exiting or it will be considered an "abnormal termination". This may have caused other processes in the application to be terminated by signals sent by mpirun (as reported here).


The message only says that "init" must be called, but does not say how to call it. I tried to look up the answer but still don't understand it.

Could anyone shed some light on this problem? It is really frustrating. Thank you in advance.

Re: [Solved] Parallelism of multi-datasets

Posted: Tue Jun 26, 2012 3:25 pm
by ljludwig
For this issue, I may have to reconcile the problem with the fact that the number of datasets is greater than the number of CPUs... It might also be due to a different configuration procedure used when compiling abinit.
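
If the dataset count versus CPU count really is the suspect, a quick check before launching mpirun could look like the sketch below. This is only a minimal Python sketch under stated assumptions: `read_ndtset` and `compare_counts` are hypothetical helper names (not part of abinit), the parsing assumes `ndtset` appears as a plain `ndtset N` entry with `#` starting a comment, and the fallback value of 1 is chosen for this sketch only. It does not confirm that such a mismatch is what triggers the MPI_ABORT.

```python
import re


def read_ndtset(input_text: str, default: int = 1) -> int:
    """Return the ndtset value found in the input text.

    The default is a fallback chosen for this sketch, not a value
    taken from the abinit documentation.
    """
    for line in input_text.splitlines():
        # Assumption: everything after '#' on a line is a comment.
        line = line.split("#", 1)[0]
        match = re.search(r"\bndtset\s+(\d+)", line)
        if match:
            return int(match.group(1))
    return default


def compare_counts(ndtset: int, nprocs: int) -> str:
    """Describe how the dataset count relates to the MPI process count."""
    if ndtset > nprocs:
        return "more datasets than processes"
    if ndtset == nprocs:
        return "equal"
    return "fewer datasets than processes"


# Hypothetical input fragment, for illustration only.
sample = """
ndtset 4   # four datasets
ecut 10
"""

print(compare_counts(read_ndtset(sample), nprocs=2))
```

Running the same check with the real .in file and the process count passed to mpirun would at least show which of the three cases applies before comparing failing and working runs.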