Hi,
I'm trying to run a strain response function calculation with a very large unit cell, and it is failing with Linux exit code 14. Other perturbations on the same system work (ddk completed; electric field and phonons at Gamma are running successfully as separate jobs). I can't reproduce the failure on a smaller system. Can anyone give hints on how to find out what is going wrong, based on the exit code?
Thanks,
Joe
exit code 14?
Josef W. Zwanziger
Professor, Department of Chemistry
Canada Research Chair in NMR Studies of Materials
Dalhousie University
Halifax, NS B3H 4J3 Canada
jzwanzig@gmail.com
Re: exit code 14?
I get exit code 11 on CentOS, so jzwanzig, if you figure out what it means, please post it here.
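One cheap check on codes like 11 and 14 is to see whether they match a signal number. This is only a sketch: it assumes the job launcher reported either the raw signal number or the shell convention of 128 plus the signal number, which may not be the case here. A program can also exit with such a value on its own, so treat the result as a hint. The function name `describe_code` is made up for illustration.

```shell
# Name the signal that a reported code would correspond to.
describe_code() {
    code=$1
    if [ "$code" -gt 128 ]; then
        # Shell convention: status = 128 + signal number.
        echo "128+$((code - 128)): SIG$(kill -l "$((code - 128))")"
    else
        # Assumption: the launcher reported the raw signal number.
        echo "if a raw signal number: SIG$(kill -l "$code")"
    fi
}

describe_code 11    # on Linux: if a raw signal number: SIGSEGV
describe_code 14    # on Linux: if a raw signal number: SIGALRM
describe_code 139   # on Linux: 128+11: SIGSEGV
```

If 11 really is a signal number, it would point at a segmentation fault, which fits the memory-debugging approach in the reply that follows.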
Re: exit code 14?
Hi Joe,
I'm afraid the only thing that works is running interactively (or as a batch) and catching the main abinit process (presuming this is a parallel job) with gdb:
gdb /path/to/abinit <pid>
gdb prompt> continue
.
.
.
crash with some information...
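For a batch job where nobody is sitting at the gdb prompt, the same attach-and-continue sequence can be scripted. The sketch below only builds the command line so it can be inspected first; the helper name `gdb_attach_cmd` and the log file name are made up, and it assumes gdb is installed on the compute node and that you are allowed to attach to your own processes.

```shell
# Build a non-interactive gdb command for a given PID: attach, let the
# process continue, and print a backtrace once it stops on a fatal signal.
gdb_attach_cmd() {
    pid=$1
    printf 'gdb -p %s -batch -ex continue -ex bt\n' "$pid"
}

# Hypothetical usage: attach to your oldest process named exactly "abinit".
# pid=$(pgrep -o -x -u "$USER" abinit)
# $(gdb_attach_cmd "$pid") 2>&1 | tee gdb_abinit.log
```

If the process exits normally instead of crashing, the backtrace step simply reports that there is no stack.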
You can also run your smaller calculations with valgrind:
valgrind --log-file=abi --track-origins=yes --leak-check=full -v --show-reachable=yes /path/to/abinit < files.file
.
.
.
This runs for a long time (about 30x slower than normal).
Then read the file "abi" or "abi.<pid>" to see errors in allocation or initialization. Often the error is present and seen by valgrind even if the run does not crash.
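The log from a verbose run can be long, so a quick filter helps find the relevant lines. This is a rough sketch: the patterns are fragments of messages that valgrind's memcheck tool typically prints, the list is not exhaustive, and the function name `scan_valgrind_log` is made up.

```shell
# Print the lines of a valgrind log that most often point at the real problem:
# invalid reads/writes, use of uninitialised values, leaks, and the summary.
scan_valgrind_log() {
    grep -E 'Invalid (read|write)|uninitialised|definitely lost|ERROR SUMMARY' "$1"
}

# Hypothetical usage, with the log name from the valgrind command above:
# scan_valgrind_log abi
```

Each reported error is followed in the log by a stack trace, so the matching lines are best used as search targets in the full file.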
cheers
Matthieu
Matthieu Verstraete
University of Liege, Belgium