Project: Krylov subspace techniques for solving medium-scale computational electromagnetics problems using the higher-order finite-element method on memory-constrained multiple-GPU systems

This project aims to propose, develop, and explore state-of-the-art GPU-accelerated iterative techniques for solving medium-scale (2–10 million unknowns) sparse real- and complex-valued systems of equations and eigenproblems that arise in electromagnetics research when the higher-order curvilinear finite-element method (FEM) is used. The focus will be on improving Krylov-subspace algorithms to obtain better convergence and scalability, and on overcoming the limitations imposed by the relatively small on-board memory of current GPUs. The latter objective will be pursued by using multiple accelerators, compact sparse-matrix storage formats, and efficient data distribution techniques. This will allow the development of new, fast numerical procedures capable of solving problems with more than a few million unknowns on a workstation or a low-end computational server. Problems of this size are beyond the current state of the art in GPU computing and cannot be solved on a workstation where matrix factorization is carried out on multicore CPUs. Because the systems of linear equations involved are too large for direct factorization-based solvers, Krylov-subspace techniques will be considered, including multilevel preconditioned conjugate gradients for sparse systems with indefinite symmetric matrices and multiple right-hand sides. For generalized eigenvalue problems, the new ISIA (Inexact Shift-and-Invert Arnoldi) algorithm and the LOBPCG (Locally Optimal Block Preconditioned Conjugate Gradient) method, combined with a novel inexact nullspace filtration technique, will be considered. All three algorithms will be investigated and, if necessary, redesigned for greater efficiency, memory savings, scalability, and better use of GPU memory bandwidth.
These problems form part of the research program of NVIDIA’s CUDA Research Center for Computational Electromagnetics at Gdańsk University of Technology, one of just 101 such centers worldwide that have been officially recognized as leading research groups in GPU computing technology. The expected outcome is a set of new methodologies and prototype numerical tools using multiple GPU accelerators that will fundamentally increase the scale of, and shorten the time needed for, modeling and simulating a broad class of problems involving Maxwell’s equations.

To achieve this goal, the following techniques and concepts will be utilized:
  1. The ISIA (Inexact Shift-and-Invert Arnoldi) Krylov-subspace method for solving complex symmetric, non-Hermitian generalized eigenvalue problems, based on the concept of inexact shift-and-invert filtering with an extended basis (MATLAB prototype code followed by a multiple-GPU implementation).
  2. Inexact nullspace filtration used jointly with the LOBPCG Krylov-subspace algorithm for solving real-valued symmetric generalized eigenvalue problems (MATLAB prototype code followed by a multiple-GPU implementation).
  3. A new discrete Laplace equation solver with multilevel preconditioning, designed for computations on single and multiple GPUs and used in inexact nullspace filtration.
  4. A GPU-based vector Helmholtz equation solver extended to work on multiple GPUs and optimized for handling multiple right-hand sides.
  5. Multiple-GPU sparse matrix-vector product computation generalized to simultaneously operate on multiple vectors (needed for LOBPCG).
  6. A new Sliced ELLR-T compact storage scheme for sparse matrices on GPUs, further developed for multiple-GPU usage by applying advanced domain decomposition through graph partitioning.
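
Items 5 and 6 can be illustrated with a small CPU-side sketch. The code below is only an assumption-laden illustration, not the project's storage scheme or kernels: it packs a sparse matrix into a simplified sliced ELLPACK layout with a per-row length array (the "R" idea), and multiplies it by several vectors at once. The function names (`to_sliced_ell`, `spmm`, `stored_entries`), the dictionary layout, and the example data are all hypothetical, and the thread-per-row tuning (the "T" in ELLR-T) and the multiple-GPU partitioning are deliberately left out.

```python
from typing import Dict, List, Tuple

Row = List[Tuple[int, float]]  # one sparse row as (column index, value) pairs


def to_sliced_ell(rows: List[Row], slice_height: int) -> List[Dict]:
    """Pack rows into slices; each slice is padded only to its own longest row."""
    slices = []
    for start in range(0, len(rows), slice_height):
        block = rows[start:start + slice_height]
        width = max((len(r) for r in block), default=0)
        # Padding uses column 0 and value 0.0; it is never read (see row_len).
        cols = [[c for c, _ in r] + [0] * (width - len(r)) for r in block]
        vals = [[v for _, v in r] + [0.0] * (width - len(r)) for r in block]
        row_len = [len(r) for r in block]  # true number of nonzeros per row
        slices.append({"start": start, "width": width,
                       "cols": cols, "vals": vals, "row_len": row_len})
    return slices


def spmm(slices: List[Dict], X: List[List[float]], n_rows: int) -> List[List[float]]:
    """Compute Y = A X, where X[j] holds entry j of every vector in the block."""
    n_vecs = len(X[0])
    Y = [[0.0] * n_vecs for _ in range(n_rows)]
    for s in slices:
        for i, length in enumerate(s["row_len"]):
            out = Y[s["start"] + i]
            for k in range(length):      # row_len lets the loop skip padding
                col, val = s["cols"][i][k], s["vals"][i][k]
                for v in range(n_vecs):  # each matrix entry is reused for all vectors
                    out[v] += val * X[col][v]
    return Y


def stored_entries(slices: List[Dict]) -> int:
    """Count allocated value slots, padding included."""
    return sum(s["width"] * len(s["row_len"]) for s in slices)


# Hypothetical 4 x 4 example with 9 nonzeros and a block of two vectors.
example_rows: List[Row] = [
    [(0, 4.0), (1, -1.0)],
    [(0, -1.0), (1, 4.0), (2, -1.0), (3, -1.0)],
    [(1, -1.0), (2, 4.0)],
    [(3, 2.0)],
]
example_X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 2.0]]
example_Y = spmm(to_sliced_ell(example_rows, 2), example_X, 4)
```

In this toy example, a slice height of 2 allocates 12 value slots for the 9 nonzeros, whereas padding the whole matrix to its longest row (a slice height of 4) allocates 16. Reading each matrix entry once and applying it to every vector in the block is the reason a multi-vector product makes better use of memory bandwidth than repeated single-vector products.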