
Make in-place FFT optional #155

Merged: 2 commits into develop on Oct 8, 2024
Conversation

samhatfield
Collaborator

It seems that hipFFT has a problem with in-place FFTs. Running on LUMI-G (MI250x), this error is thrown at runtime:

GPU runtime error at 1
GPU runtime error in file '/pfs/lustrep3/scratch/project_465000454/hatfield/jube-suites/ectrans/ectrans_gpu/000005/000000_clone/work/ectrans-bundle/source/ectrans/src/trans/gpu/algor/hicfft.hip.cpp'
GPU runtime error at 2
GPU runtime error line '48'
GPU runtime error at 3
GPU runtime error 6: HIPFFT_EXEC_FAILED
terminating!
ABOR1     [PROC=1,THRD=1] from [/pfs/lustrep3/scratch/project_465000454/hatfield/jube-suites/ectrans/ectrans_gpu/000005/000000_clone/work/ectrans-bundle/source/ectrans/src/trans/gpu/algor/hicfft.h +48] : Error in FFT

Some investigations with @PaulMullowney showed that if you create a separate buffer for the output, the error vanishes.

With this change we are now finally able to run the optimised GPU version of ecTrans (now the GPU version of ecTrans) on AMD GPUs, at least with a single MI250x GPU on LUMI-G. Multi-GPU runs are still a work in progress.

This PR makes in-place FFT compile-time adjustable. For now, in-place is enabled for cuFFT, but disabled for hipFFT.

@PaulMullowney and I will see what we can do to flag this to hipFFT developers so we can eventually remove this option.

Note that earlier HIPFFT_PARSE_ERRORs no longer appear with ROCm 6.0.2, which is the new default for LUMI-G.

Note that this requires testing on an Nvidia GPU. Hopefully we have not ruined things there.

This PR builds on #150 so best to merge that one first.

@samhatfield
Collaborator Author

@lukasm91 can you see this having any performance implications for CUDA/cuFFT execution? I think I've put all the expensive stuff inside the guards so it shouldn't be compiled on an Nvidia platform.

@samhatfield samhatfield mentioned this pull request Sep 19, 2024
@samhatfield samhatfield force-pushed the samhatfield/add_in_place_fft_ifdef branch from ce5c7cf to dea414f Compare September 19, 2024 14:40
Collaborator

@lukasm91 left a comment

Yes, this looks like it should not have performance implications (and I also quickly checked it on our machines).

src/trans/gpu/internal/ftdir_mod.F90 (outdated, resolved)
src/trans/gpu/internal/ftinv_mod.F90 (outdated, resolved)
src/trans/gpu/internal/ftinv_mod.F90 (outdated, resolved)
@samhatfield samhatfield force-pushed the samhatfield/add_in_place_fft_ifdef branch from dea414f to 3c6810a Compare September 20, 2024 10:43
@samhatfield samhatfield added enhancement New feature or request gpu labels Oct 1, 2024
samhatfield and others added 2 commits October 1, 2024 16:51
The out-of-place workaround is currently disabled for cuFFT but enabled for hipFFT.

In-place FFTs seem to be an issue for ROCm at the moment. This is a
temporary workaround.
@samhatfield samhatfield force-pushed the samhatfield/add_in_place_fft_ifdef branch from 1758aa1 to 89def35 Compare October 1, 2024 16:51
REAL(KIND=JPRBT) :: DUMMY

#ifndef IN_PLACE_FFT
HFTDIR%HREEL_COMPLEX = RESERVE(ALLOCATOR, INT(KF_FS*D%NLENGTF*SIZEOF(DUMMY), KIND=C_SIZE_T))
Collaborator
SIZEOF is a non-standard extension; some compilers may complain.
The F2008 standard provides STORAGE_SIZE, which gives you bits (not bytes!), and C_SIZEOF, which gives you bytes.

Collaborator Author
SIZEOF occurs in a few places in the GPU tree. @lukasm91 any strong feelings about switching to STORAGE_SIZE(DUMMY)/8?

Collaborator
Thanks for letting me know... for me, Fortran is learning by doing. I simply didn't realize this is not in the standard, so it's good to know, and good to learn the right way to do it :)

PREEL_COMPLEX => PREEL_REAL
#else
CALL ASSIGN_PTR(PREEL_COMPLEX, GET_ALLOCATION(ALLOCATOR, HFTDIR%HREEL_COMPLEX),&
& 1_C_SIZE_T, INT(KFIELD*D%NLENGTF*SIZEOF(PREEL_COMPLEX(1)),KIND=C_SIZE_T))
Collaborator

also here

REAL(KIND=JPRBT) :: DUMMY

#ifndef IN_PLACE_FFT
HFTINV%HREEL_REAL = RESERVE(ALLOCATOR, INT(D%NLENGTF*KF_FS*SIZEOF(DUMMY),KIND=C_SIZE_T))
Collaborator
and here

PREEL_REAL => PREEL_COMPLEX
#else
CALL ASSIGN_PTR(PREEL_REAL, GET_ALLOCATION(ALLOCATOR, HFTINV%HREEL_REAL),&
& 1_C_SIZE_T, INT(KFIELD*D%NLENGTF*SIZEOF(PREEL_REAL(1)),KIND=C_SIZE_T))
Collaborator
and here... There may be more instances in the code; this looks like a typical copy-pasted-and-edited line.

@samhatfield samhatfield merged commit 4f2fa50 into develop Oct 8, 2024
21 checks passed
@samhatfield samhatfield deleted the samhatfield/add_in_place_fft_ifdef branch October 8, 2024 14:57
wdeconinck added a commit to wdeconinck/ectrans that referenced this pull request Oct 23, 2024
Labels: enhancement (New feature or request), gpu
Projects: None yet
3 participants