Abinit 7.6.3 ./runtests.py problem

sheng
Posts: 64
Joined: Fri Apr 11, 2014 3:44 pm


Post by sheng » Wed Jun 11, 2014 7:27 pm

I have managed to compile Abinit 7.6.3 with gcc 4.9.0 on a Rocks cluster node. The libraries it depends on (ATLAS, netCDF, etc.) are installed under a non-standard prefix: /share/apps/...
The program runs successfully when called, but an error arises when I try to run ./runtests.py to verify my build:

[sheng@comsics tests]$ ./runtests.py -j 4 fast
-bash: ./runtests.py: No such file or directory

I suspect that I have not added all the needed libraries to the system library paths. Any idea how to fix this? I have already added the paths for all the software I selected at the configure step.
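
For reference, bash prints "No such file or directory" when the script itself (or the interpreter named on its first line) cannot be found at the given path; a missing shared library normally produces a different message. Below is a minimal sketch of a check for this, assuming a Python 3 interpreter and assuming the script might sit in a "tests" directory somewhere above the current one. The helper name locate_script is hypothetical and not part of Abinit.

```python
import os

def locate_script(start_dir, name="runtests.py", subdir="tests"):
    """Walk upward from start_dir and return the first <parent>/<subdir>/<name> found.

    Hypothetical helper for illustration; returns None if nothing is found.
    """
    current = os.path.abspath(start_dir)
    while True:
        candidate = os.path.join(current, subdir, name)
        if os.path.isfile(candidate):
            return candidate
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            return None
        current = parent

if __name__ == "__main__":
    here = os.getcwd()
    if os.path.isfile(os.path.join(here, "runtests.py")):
        print("runtests.py is in the current directory")
    else:
        print("runtests.py not here; nearest candidate:", locate_script(here))
```

If the script is reported somewhere other than the current directory, calling it through that path is enough; no library path needs to change.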

sheng
Posts: 64
Joined: Fri Apr 11, 2014 3:44 pm

Re: Abinit 7.6.3 ./runtests.py problem

Post by sheng » Thu Jun 12, 2014 8:58 am

I have noticed that this is simply a directory issue, and I have fixed it by calling runtests.py through its path in the source tree, which is where the script actually lives.
The tests now run to completion, but warnings about a missing timeout executable appear before the tests start.

Code:

[sheng@comsics tests]$ ../../tests/runtests.py paral -n 2 -j 2
/home/sheng/Desktop/Program/abinit-7.6.3/tests/pymods/testsuite.py:802: UserWarning: Cannot find path of bin_name timeout, neither in the build directory nor in PATH ['/share/apps/fftw/3.3.4/bin', '/share/apps/netcdf/4.3.2/bin', '/share/apps/abinit/7.6.3gcc/bin', '/share/apps/gcc/4.9.0/bin', '/share/apps/openmpi/1.8.1/gcc4.9.0/bin', '/opt/gridengine/bin/lx26-amd64', '/usr/kerberos/bin', '/usr/java/latest/bin', '/share/apps/gcc/4.9.0/bin', '/share/apps/openmpi/1.8.1/gcc4.9.0/bin', '/usr/local/bin', '/bin', '/usr/bin', '/opt/ganglia/bin', '/opt/ganglia/sbin', '/opt/openmpi/bin/', '/opt/rocks/bin', '/opt/rocks/sbin', '/home/sheng/.sage/bin', '/opt/sun-ct/bin', '/home/sheng/bin']
  warn(err_msg)
/home/sheng/Desktop/Program/abinit-7.6.3/tests/pymods/testsuite.py:762: UserWarning: Cannot find timeout executable!
  warn("Cannot find timeout executable!")
../../tests/runtests.py:213: UserWarning: Cannot find timeout executable at:
  warn("Cannot find timeout executable at: %s" % build_env.path_of_bin("timeout"))
Test_suite directory already exists! Old files will be removed
Running ntests = 93, MPI_nprocs = 2, py_nthreads = 2...
[paral][t01_MPI1]: Skipped
...(more)


Is this error significant?
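
The warning above says that no executable named timeout was found in the build directory or in any directory on PATH. A minimal Python 3 sketch of the same kind of lookup, using only the standard library, is given here. The function find_tool and its extra_dirs parameter are hypothetical, and this is not how testsuite.py performs the search.

```python
import os
import shutil

def find_tool(name, extra_dirs=()):
    """Return the full path of executable `name`, or None if it is not found.

    extra_dirs are searched before the directories listed in PATH.
    """
    search = os.pathsep.join(list(extra_dirs) + [os.environ.get("PATH", "")])
    return shutil.which(name, path=search)

if __name__ == "__main__":
    location = find_tool("timeout")
    print(location if location else "timeout not found on PATH")
```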

Another problem is that when I run the sequential tests as described in http://forum.abinit.org/viewtopic.php?f=2&t=2639, the run_etime is about 2000, which is far larger than the value in the link above (only 44). However, the run time for the parallel tests is comparable to the value in the link (718.25 versus 815.67).
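
To put the figures quoted above on a common scale, here is a small sketch that computes the two ratios. The numbers are the ones from this post; the function name is only for illustration.

```python
def ratio(a, b):
    """Return how many times larger the bigger value is than the smaller one."""
    return max(a, b) / min(a, b)

if __name__ == "__main__":
    serial = ratio(2000.0, 44.0)       # run_etime here vs. in the linked thread
    parallel = ratio(718.25, 815.67)   # the two parallel run times quoted above
    print("serial: %.1fx, parallel: %.2fx" % (serial, parallel))
```

The serial gap is roughly a factor of 45, while the two parallel times differ by about 14%.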

Thank you.

pouillon
Posts: 651
Joined: Wed Aug 19, 2009 10:08 am
Location: Spain

Re: Abinit 7.6.3 ./runtests.py problem

Post by pouillon » Thu Jun 12, 2014 9:50 am

Running the test suite with the "-t 0" option will remove the warning. The timeout utility allows for the interruption of a test if it lasts too long. It is thus not an issue if you don't have it compiled.
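
As an illustration of what such a time limit does, here is a minimal Python 3 sketch that runs a command and interrupts it if it exceeds a limit. It relies on subprocess.run from the standard library and is not how runtests.py or the timeout utility is implemented; run_with_limit is a hypothetical name.

```python
import subprocess
import sys

def run_with_limit(cmd, limit_seconds):
    """Run cmd; return its exit code, or None if it was killed for exceeding the limit."""
    try:
        completed = subprocess.run(cmd, timeout=limit_seconds)
    except subprocess.TimeoutExpired:
        return None
    return completed.returncode

if __name__ == "__main__":
    # A child process that sleeps for 5 seconds, given only 1 second to finish.
    slow = [sys.executable, "-c", "import time; time.sleep(5)"]
    print(run_with_limit(slow, 1))
```

Without such a limit, a test that hangs would simply keep running until it is stopped by hand.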

Regarding the execution times of the tests, you will experience large differences depending on the architecture/OS/compiler/build flags/parallel environment you have. You may use them to check your build parameters.
Yann Pouillon
Simune Atomistics
Donostia-San Sebastián, Spain

sheng
Posts: 64
Joined: Fri Apr 11, 2014 3:44 pm

Re: Abinit 7.6.3 ./runtests.py problem

Post by sheng » Thu Jun 12, 2014 4:43 pm

Thank you for your reply, but the same error still persists even after I have appended the "-t 0" flag.

Regarding the computation time, I find it strange that my run time differs so much from the one in the link http://forum.abinit.org/viewtopic.php?f=2&t=2639 (my time = 2000+, link = 44).
I have compiled with almost the same configuration as the recipe in the link suggests.
The libraries are updated to the newest versions supported by Abinit. My gcc version is 4.9.0 and my OpenMPI version is 1.8.1, both of which are recent. However, my Python version is the old 2.4.3.

I am building Abinit on a Rocks cluster and plan to run in parallel using MPI and OpenMP.

Attached is my processor info:

Code:

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz
stepping        : 5
cpu MHz         : 2660.080
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 0
cpu cores       : 4
apicid          : 0
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5320.16
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management: [8]

processor       : 1
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz
stepping        : 5
cpu MHz         : 2660.080
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 1
cpu cores       : 4
apicid          : 2
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5319.89
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management: [8]

processor       : 2
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz
stepping        : 5
cpu MHz         : 2660.080
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 2
cpu cores       : 4
apicid          : 4
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5319.96
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management: [8]

processor       : 3
vendor_id       : GenuineIntel
cpu family      : 6
model           : 30
model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz
stepping        : 5
cpu MHz         : 2660.080
cache size      : 8192 KB
physical id     : 0
siblings        : 4
core id         : 3
cpu cores       : 4
apicid          : 6
fpu             : yes
fpu_exception   : yes
cpuid level     : 11
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm syscall nx rdtscp lm constant_tsc ida nonstop_tsc pni monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr sse4_1 sse4_2 popcnt lahf_lm
bogomips        : 5319.97
clflush size    : 64
cache_alignment : 64
address sizes   : 36 bits physical, 48 bits virtual
power management: [8]


I don't think my PC specs are so outdated that my calculations should be this slow. My parallel test time is almost the same as the one in the link, so the problem may lie in the serial tests. Thank you for your time.
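
For comparing machines, a /proc/cpuinfo listing like the one above can be condensed to a few fields. Here is a minimal sketch, assuming the usual "key : value" layout shown above; summarize_cpuinfo is a hypothetical helper, and the sample text is a shortened excerpt of the listing in this post.

```python
def summarize_cpuinfo(text):
    """Return (model_name, logical_processors, cores_per_package) from cpuinfo text."""
    model = None
    processors = 0
    cores = None
    for line in text.splitlines():
        if ":" not in line:
            continue
        key, value = [part.strip() for part in line.split(":", 1)]
        if key == "processor":
            processors += 1
        elif key == "model name" and model is None:
            model = " ".join(value.split())  # collapse repeated spaces
        elif key == "cpu cores" and cores is None:
            cores = int(value)
    return model, processors, cores

if __name__ == "__main__":
    sample = (
        "processor       : 0\n"
        "model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz\n"
        "cpu cores       : 4\n"
        "\n"
        "processor       : 1\n"
        "model name      : Intel(R) Core(TM) i5 CPU         750  @ 2.67GHz\n"
        "cpu cores       : 4\n"
    )
    print(summarize_cpuinfo(sample))
```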

P.S. Sometimes there is a directory lock error when I run Abinit on one of the cluster nodes (I installed it on the main PC and share the executables and libraries with all the cluster nodes).

Code:

[sheng@compute-0-18 tests]$ ../../tests/runtests.py paral -t 0 -n 4 -j 2
/home/sheng/Desktop/Program/abinit-7.6.3/tests/pymods/testsuite.py:802: UserWarning: Cannot find path of bin_name timeout, neither in the build directory nor in PATH ['/share/apps/fftw/3.3.4/bin', '/share/apps/netcdf/4.3.2/bin', '/share/apps/abinit/7.6.3gcc/bin', '/share/apps/gcc/4.9.0/bin', '/share/apps/openmpi/1.8.1/gcc4.9.0/bin', '/opt/gridengine/bin/lx26-amd64', '/usr/kerberos/bin', '/usr/java/latest/bin', '/usr/local/bin', '/bin', '/usr/bin', '/opt/ganglia/bin', '/opt/ganglia/sbin', '/opt/openmpi/bin/', '/opt/rocks/bin', '/opt/rocks/sbin', '/opt/sun-ct/bin', '/home/sheng/bin']
  warn(err_msg)
/home/sheng/Desktop/Program/abinit-7.6.3/tests/pymods/testsuite.py:762: UserWarning: Cannot find timeout executable!
  warn("Cannot find timeout executable!")
../../tests/runtests.py:213: UserWarning: Cannot find timeout executable at:
  warn("Cannot find timeout executable at: %s" % build_env.path_of_bin("timeout"))
Test_suite directory already exists! Old files will be removed
Running ntests = 93, MPI_nprocs = 4, py_nthreads = 2...
/home/sheng/Desktop/Program/abinit-7.6.3/tests/pymods/testsuite.py:2599: UserWarning: Timeout occured while trying to acquire the directory lock in /home/sheng/Desktop/Program/abinit-7.6.3/build/tests/Test_suite.
 Returning
  warn("Timeout occured while trying to acquire the directory lock in %s.\n Returning" % self.workdir)
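
The last warning indicates that the script waited for a lock on the Test_suite directory and then gave up. How testsuite.py implements that lock is not shown in this thread; the following is only a generic Python 3 sketch of a lock file with a timeout, to illustrate one way such a message can arise when a lock is still held (or was left behind) in a shared directory. The names acquire_lock and release_lock and the ".lock" file name are hypothetical.

```python
import os
import tempfile
import time

def acquire_lock(workdir, timeout=2.0, poll=0.1):
    """Try to create workdir/.lock exclusively; return its path, or None on timeout."""
    lock_path = os.path.join(workdir, ".lock")
    deadline = time.time() + timeout
    while True:
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            os.close(fd)
            return lock_path
        except FileExistsError:
            if time.time() >= deadline:
                return None
            time.sleep(poll)

def release_lock(lock_path):
    """Remove the lock file so that other runs can proceed."""
    os.remove(lock_path)

if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    first = acquire_lock(workdir)
    second = acquire_lock(workdir, timeout=0.5)  # lock is still held
    print(first is not None, second is None)
    release_lock(first)
```

In this sketch the second attempt fails only because the first lock has not been released yet; once release_lock is called, a new attempt succeeds.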
