exit code 14?

Posted: Wed Sep 28, 2011 12:25 pm
by jzwanzig
Hi,
I'm trying to run a strain response function calculation with a very large unit cell, and it is failing with Linux exit code 14. Other perturbations on the same system work (ddk completed; electric field and phonons at Gamma are running successfully as separate jobs). I can't reproduce this failure on a smaller system. Can anyone give hints on how to learn what is going wrong, based on the exit code?

thanks,
Joe

Re: exit code 14?

Posted: Thu Sep 29, 2011 2:03 pm
by Magniff
I get exit code 11 on CentOS, so jzwanzig, if you figure out what it means, please post it here.

Re: exit code 14?

Posted: Mon Apr 02, 2012 7:03 pm
by mverstra
Hi Joe,

I'm afraid the only thing that works is running interactively (or as a batch) and catching the main abinit process (presuming this is a parallel job) with gdb:

gdb /path/to/abinit <pid>

gdb prompt> continue
.
.
.
crash with some information...
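
For a batch job with no terminal, the same attach-and-continue steps could be scripted. This is only a sketch: the function name attach_and_trace is made up for illustration, it assumes the installed gdb accepts the -batch and -ex options, and it assumes you are allowed to attach to the process (same user, ptrace not restricted). You still have to find the <pid> of the main abinit process yourself.

```shell
# Hypothetical helper (not part of abinit or gdb): attach gdb to a running
# process, let it continue, and print a backtrace when it stops, for example
# when it is hit by a fatal signal.
attach_and_trace() {
    exe=$1   # path to the executable, e.g. /path/to/abinit
    pid=$2   # process ID of the main abinit process

    # Refuse anything that is not a plain number, so gdb is never started
    # with a bogus process ID.
    case $pid in
        ''|*[!0-9]*)
            echo "usage: attach_and_trace /path/to/abinit <pid>" >&2
            return 2
            ;;
    esac

    # -batch: run the given commands and exit instead of opening a prompt.
    # -ex:    execute one gdb command; they run in the order given.
    gdb -batch -ex continue -ex backtrace "$exe" "$pid"
}
```

Called as attach_and_trace /path/to/abinit <pid> with the output redirected to a file, this should leave the backtrace in that file once the process stops.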

You can also run your smaller calculations with valgrind:

valgrind --log-file=abi --track-origins=yes --leak-check=full -v --show-reachable=yes /path/to/abinit < files.file
.
.
.
This runs for a long time (about 30x slower than normal).
Then read the file "abi" or "abi.<pid>" to see errors in allocation or initialization. Valgrind often detects the error even if the run does not crash.
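
The log can get very long, so a quick filter helps to find the interesting lines first. A minimal sketch, assuming the log uses memcheck's usual wording ("Invalid read", "Invalid write", "uninitialised", "ERROR SUMMARY"); the function name scan_valgrind_log is made up for illustration, and the pattern list is not exhaustive.

```shell
# Hypothetical helper: print, with line numbers, the lines of one or more
# valgrind logs that usually point at allocation or initialization problems.
scan_valgrind_log() {
    # -n: prefix each match with its line number in the log
    # -E: extended regular expression, so | means "or"
    grep -n -E 'Invalid (read|write)|uninitialised|ERROR SUMMARY' "$@"
}
```

For example, scan_valgrind_log abi lists the matching lines; the stack trace for each one follows at the reported line number in the log itself.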

cheers

Matthieu