exit code 14?

Phonons, DFPT, electron-phonon, electric-field response, mechanical response…


jzwanzig
Posts: 504
Joined: Mon Aug 17, 2009 9:25 am


Post by jzwanzig » Wed Sep 28, 2011 12:25 pm

Hi,
I'm trying to run a strain response function calculation with a very large unit cell, and it is failing with Linux exit code 14. Other perturbations on the same system are working: ddk completed, and electric field and phonons at Gamma are running successfully as separate jobs. I can't reproduce this failure on a smaller system. Can anyone give hints on how to work out what is going wrong, based on the exit code?

thanks,
Joe
Josef W. Zwanziger
Professor, Department of Chemistry
Canada Research Chair in NMR Studies of Materials
Dalhousie University
Halifax, NS B3H 4J3 Canada
jzwanzig@gmail.com

Magniff
Posts: 18
Joined: Thu May 19, 2011 4:46 am

Re: exit code 14?

Post by Magniff » Thu Sep 29, 2011 2:03 pm

I get exit code 11 on CentOS, so jzwanzig, if you figure out what it means, please post it here.
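
One possible reading of codes like 14 or 11, which nothing in this thread confirms, is that they are signal numbers: either reported raw by the launcher or batch system, or following the shell convention of 128 + N for a process killed by signal N. Under that assumption, the POSIX `kill -l` command can translate the number into a signal name. The helper name `describe_exit` below is made up for illustration.

```shell
# Hypothetical helper: show which signal a reported code might correspond to.
# ASSUMPTION: the code is a raw signal number or the shell's 128+N value.
# If the program simply called exit(14) itself, this mapping is meaningless.
describe_exit() {
    code=$1
    if [ "$code" -gt 128 ]; then
        sig=$((code - 128))   # shell convention: 128 + signal number
    else
        sig=$code             # treat the code as a raw signal number
    fi
    # `kill -l N` prints the name of signal N without the SIG prefix.
    if name=$(kill -l "$sig" 2>/dev/null); then
        echo "code $code: possibly SIG$name (signal $sig)"
    else
        echo "code $code: no matching signal"
    fi
}

describe_exit 14
describe_exit 11
describe_exit 139
```

On Linux, signal 14 is SIGALRM and signal 11 is SIGSEGV, so a code of 11 would be consistent with a segmentation fault. This is only a hint for where to look, not a diagnosis.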

mverstra
Posts: 655
Joined: Wed Aug 19, 2009 12:01 pm

Re: exit code 14?

Post by mverstra » Mon Apr 02, 2012 7:03 pm

Hi Joe,

I'm afraid the only thing that works is running interactively (or as a batch job) and attaching gdb to the main abinit process (presuming this is a parallel job):

gdb /path/to/abinit <pid>

gdb prompt> continue
.
.
.
crash with some information...
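
For a batch job where nobody is sitting at the gdb prompt, the same attach-and-continue steps can be given on the command line. The sketch below only prints the command so it can be checked before use. The function name `gdb_attach_cmd` and the PID 12345 are placeholders, and the path is left as in the post; `-batch` and `-ex` are standard gdb options.

```shell
# Sketch: build a non-interactive version of "gdb /path/to/abinit <pid>".
# ASSUMPTIONS: gdb_attach_cmd is a made-up name; binary path and PID are
# placeholders to be replaced with the real values.
gdb_attach_cmd() {
    binary=$1
    pid=$2
    # -batch        : exit once the listed commands have run
    # -ex continue  : resume the attached process until it stops (e.g. crashes)
    # -ex backtrace : print the call stack at the point where it stopped
    printf 'gdb -batch -ex continue -ex backtrace %s %s\n' "$binary" "$pid"
}

# Print the command for a placeholder PID; nothing is attached here.
gdb_attach_cmd /path/to/abinit 12345
```

Running the printed line with its output redirected to a file should leave a backtrace behind if the process dies on a signal. Attaching to another process may require the same user and suitable ptrace permissions on the node.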

You can also run your smaller calculations with valgrind:

valgrind --log-file=abi --track-origins=yes --leak-check=full -v --show-reachable=yes /path/to/abinit < files.file
.
.
.
This runs for a long time (about 30x slower than normal).
Then read the file "abi" or "abi.<pid>" to see errors in allocation or initialization. Often the error is present and seen by valgrind even if the run does not crash.
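
Valgrind logs can be long, so a quick filter helps to decide whether a log is worth reading in full. The sketch below assumes the default memcheck tool and its usual message wording ("ERROR SUMMARY:", "Invalid read/write", "uninitialised"). The function name `summarise_valgrind_log` is made up for illustration.

```shell
# Sketch: pull the most telling lines out of a valgrind log.
# ASSUMPTIONS: summarise_valgrind_log is a made-up name; the log comes from
# memcheck (valgrind's default tool) with its usual message wording.
# Usage: summarise_valgrind_log abi      (or abi.<pid>)
summarise_valgrind_log() {
    log=$1
    # memcheck ends each process's report with an "ERROR SUMMARY:" line.
    grep 'ERROR SUMMARY:' "$log"
    # Count lines reporting bad memory accesses or uninitialised values.
    # "|| true" keeps a zero-match result from aborting under "set -e".
    hits=$(grep -c -E 'Invalid (read|write)|uninitialised' "$log" || true)
    echo "lines mentioning invalid access or uninitialised values: $hits"
}
```

A non-zero count does not prove that this is what kills the large run, but those are the entries to read first, together with the stack traces valgrind prints under them.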

cheers

Matthieu
Matthieu Verstraete
University of Liege, Belgium
