Calculation of ground state with GPU
Posted: Thu Aug 02, 2012 9:11 pm
Hello everyone,
I have been using the GPU to calculate the ground state, and it worked fine until I hit some kind of barrier past a certain value of ecut. I have tried two different configurations of atoms, and for both, once ecut gets large enough, I get the CUFFT_EXEC_FAILED error for the backward Fourier transform. The only thing the two configurations have in common is the value of mpw: the run failed at mpw = 65795 and succeeded at mpw = 65267 (which corresponds to an ecut 0.1 below the value where it failed).
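One thing that can be checked by simple arithmetic is whether the two mpw values above sit on either side of a power of two. The snippet below is only a minimal sketch of that check: the helper name `powers_of_two_between` is made up for illustration and is not part of any library. Whether crossing such a boundary has anything to do with the CUFFT_EXEC_FAILED error is a guess that I have not confirmed.

```python
# mpw values reported above: last successful run and first failing run.
MPW_OK = 65267
MPW_FAIL = 65795


def powers_of_two_between(low, high):
    """Return every power of two p with low < p <= high (hypothetical helper)."""
    found = []
    p = 1
    while p <= high:
        if p > low:
            found.append(p)
        p *= 2
    return found


# Powers of two crossed when going from the working mpw to the failing one.
crossed = powers_of_two_between(MPW_OK, MPW_FAIL)
print(crossed)  # prints [65536]
```

So the working and failing runs straddle 65536 = 2^16. This may be a coincidence, but it gives one more concrete number to compare against when looking for a size limit.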
So far I have tried forcing a bigger FFT grid, freeing the GPU memory and reallocating everything before the FFT, sending the FFT calculation to another GPU (I am working on a cluster with 8 cards), setting every point of the FFT grid to zero, commenting out every line of the source code except the backward Fourier transform, and using another cluster whose cards have 3-4 times more RAM than the ones I usually work on. I still get the same error at exactly the same spot for the same values of ecut.
I have double precision activated (I get the same error with single precision). The GPUs are GeForce GTX 580s with 1.5 GB of RAM, and the CUDA version used is 4.0.17.
Has anyone had this problem before and been able to fix it?
Thank you.