Particle-in-cell simulations with charge-conserving current deposition on graphic processing units
We present an implementation of a 2D fully relativistic, electromagnetic particle-in-cell code, with charge-conserving current deposition, on graphics processing units (GPUs) using CUDA. The GPU implementation achieved a processing time per particle per step of 2.52 ns for cold plasma runs and 9.15 ns for extremely relativistic plasma runs, which is respectively 81 and 27 times faster than a single-threaded state-of-the-art CPU code. In the current deposition scheme, computation threads were assigned on a per-particle basis, and write conflicts among the threads were resolved by a thread-racing technique. A parallel particle sorting scheme was also developed and used. The implementation took advantage of fast on-chip shared memory and can in principle be extended to 3D.