Dear all,
I have successfully installed OpenMPI-1.10.3 and Abinit-8.0.8 on my 5-node cluster. I then ran a job with this command:
mpirun -np 2 -machinefile cluster abinit < t4x.files
In the file cluster I have the IP addresses listed as follows, with the last three commented out:
node1
node2
#node3
#node4
#node5
The job then runs as specified in my input file. However, if I change the command to
mpirun -np 3 -machinefile cluster abinit < t4x.files
and list the IP addresses in the file cluster as follows, with only the last two commented out:
node1
node2
node3
#node4
#node5
I get the following error:
Permission denied, please try again.
Permission denied, please try again.
Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
--------------------------------------------------------------------------
ORTE was unable to reliably start one or more daemons.
This usually is caused by:
* not finding the required libraries and/or binaries on
one or more nodes. Please check your PATH and LD_LIBRARY_PATH
settings, or configure OMPI with --enable-orterun-prefix-by-default
* lack of authority to execute on one or more specified nodes.
Please verify your allocation and authorities.
* the inability to write startup files into /tmp (--tmpdir/orte_tmpdir_base).
Please check with your sys admin to determine the correct location to use.
* compilation of the orted with dynamic libraries when static are required
(e.g., on Cray). Please check your configure cmd line and consider using
one of the contrib/platform definitions for your system type.
* an inability to create a connection back to mpirun due to a
lack of common network interfaces and/or no route found between
them. Please check network connectivity (including firewalls
and network routing requirements).
With -np 2, I can choose any two IP addresses and the parallel job still runs, but the job does not run with -np 3, 4 or 5.
I'd appreciate any help provided. Thank you.
Problem with number of nodes during mpirun
Moderators: fgoudreault, mcote
Forum rules
Please have a look at ~abinit/doc/config/build-config.ac in the source package for detailed and up-to-date information about the configuration of Abinit 8 builds.
For a video explanation on how to build Abinit 7.x for Linux, please go to: http://www.youtube.com/watch?v=DppLQ-KQA68.
IMPORTANT: when an answer solves your problem, please check the little green V-like button on its upper-right corner to accept it.
- Posts: 4
- Joined: Mon Apr 18, 2011 8:54 am
Re: Problem with number of nodes during mpirun
Hi,
Unfortunately, this is not an Abinit-related issue. You should contact your vendor for help setting up the MPI configuration.
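Please check that all your nodes have the same MPI version installed, that you can ssh to all nodes without typing a password, and that all nodes can access the same filesystem. The "Permission denied (publickey,...)" lines in your output come from ssh, which suggests that at least one node is refusing the login.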
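As a rough way to run these checks, here is a minimal sketch. It assumes the machinefile is named cluster as in the original post, that ssh is the launcher, and that mpirun is on the PATH of a non-interactive remote shell. The function names active_hosts and check_hosts are made up for illustration.

```shell
#!/bin/sh
# Print the active hosts of a machinefile: strip "#" comments, skip blank
# lines, and keep only the first field (so "node4 slots=2" yields "node4").
active_hosts() {
    sed -e 's/#.*//' "$1" | awk 'NF { print $1 }'
}

# For every active host, attempt a non-interactive ssh login and report the
# first line of the remote "mpirun --version" output.
# BatchMode=yes makes ssh fail instead of prompting for a password.
check_hosts() {
    for host in $(active_hosts "$1"); do
        if version=$(ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" \
                'mpirun --version 2>&1 | head -n 1' < /dev/null 2>/dev/null); then
            echo "$host: ssh ok, $version"
        else
            echo "$host: passwordless ssh FAILED"
        fi
    done
}
```

Running `check_hosts cluster` on the node where mpirun is started should then print one line per uncommented host. Any host reported as FAILED would need its ssh keys set up before mpirun can start a daemon there, and differing version lines would point to mismatched MPI installations.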
If it works with -np 2, please check that you are not actually running both processes on a single node, which of course always works.
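One way to see where the processes really land is to launch hostname instead of abinit and count the distinct names that come back. This is only a sketch; the helper name distinct_hosts is made up, and the machinefile name cluster is taken from the original post.

```shell
#!/bin/sh
# Read hostnames from stdin (one per line) and print how many distinct,
# non-empty names there are.
distinct_hosts() {
    awk 'NF' | sort -u | wc -l | tr -d ' '
}

# Usage (run on the node where mpirun is started):
#   mpirun -np 2 -machinefile cluster hostname | distinct_hosts
```

If this prints 1 for -np 2, both processes were placed on the same node, so the successful run says nothing about communication between nodes.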
Cheers