Overlapping computation and communication of three-dimensional FDTD on a GPU cluster

Ki Hwan Kim, Q. Han Park

Research output: Contribution to journal › Article › peer-review

20 Citations (Scopus)


Large-scale electromagnetic field simulations using the finite-difference time-domain (FDTD) method require the use of graphics processing unit (GPU) clusters. However, the communication overhead caused by slow interconnections becomes a major performance bottleneck. To remove this bottleneck, we propose the 'kernel-split method' and the 'host-buffer method', both of which overlap computation and communication for FDTD simulation on a GPU cluster. The host-buffer method in particular enables overlapping without any modification to the update kernels already in use. We also present theoretical formulas that predict the overlap threshold and the total throughput for each method. Using our overlap methods on six GPU nodes, we demonstrate that the total performance of 3D FDTD reaches 92% of a six-fold increase, the upper limit that would be reached if there were no communication overhead.
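The benefit of overlapping can be illustrated with a deliberately simplified timing model. The sketch below is not the paper's formula for the overlap threshold or throughput (those are not reproduced in this abstract); it assumes that each node spends a fixed time computing and a fixed time communicating per step, that the load is perfectly balanced, and that overlapped communication is completely hidden whenever it is shorter than the computation. The function names and the timing values are hypothetical.

```python
def step_time(t_compute, t_comm, overlap):
    """Time for one FDTD step on one node, in arbitrary time units."""
    if overlap:
        # Communication runs concurrently with computation, so the step
        # takes as long as the slower of the two phases.
        return max(t_compute, t_comm)
    # Without overlap the two phases run back to back.
    return t_compute + t_comm


def parallel_efficiency(t_compute, t_comm, overlap):
    """Fraction of the ideal N-fold speedup under the assumptions above."""
    return t_compute / step_time(t_compute, t_comm, overlap)


# Hypothetical timings chosen only for illustration.
t_compute, t_comm = 10.0, 4.0
no_overlap = parallel_efficiency(t_compute, t_comm, overlap=False)
with_overlap = parallel_efficiency(t_compute, t_comm, overlap=True)
print(f"no overlap:   {no_overlap:.2f}")
print(f"with overlap: {with_overlap:.2f}")
```

In this toy model the overlap threshold is simply the point where the communication time equals the computation time: below it, communication is fully hidden, and above it, communication dominates the step. A real cluster would likely fall short of the ideal value even below the threshold, because of overheads the model ignores.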

Original language: English
Pages (from-to): 2364-2369
Number of pages: 6
Journal: Computer Physics Communications
Issue number: 11
Publication status: Published - 2012 Nov

Keywords


  • CUDA
  • FDTD
  • GPU cluster
  • OpenCL

ASJC Scopus subject areas

  • Hardware and Architecture
  • Physics and Astronomy (all)

