WiComm's GPU Research Center

 

WiComm's GPU Research Center for Computational Electromagnetics
at Gdansk University of Technology

In February 2012 Gdansk University of Technology joined the elite group of research institutions identified by NVIDIA as CUDA Research Centers. GPU Research Centers do world-changing research by leveraging CUDA and NVIDIA GPUs. The Centers are selected by NVIDIA and are regarded as research institutions at the forefront of some of the world's most innovative scientific research.

The GPU Research Center at Gdansk University of Technology has been established around a core team of researchers from Prof. Michal Mrozowski's computational electromagnetics and photonics group, who have substantial experience in CUDA and GPGPU computing. The research group is part of the Wireless Communication Engineering (WiComm) Center of Excellence and the Department of Microwave and Antenna Engineering (DMAE) of the Faculty of Electronics, Telecommunications and Informatics (ETI).

The group conducts research on GPU acceleration of computational electromagnetics methods. Its scientific standing at the highest international level is evidenced by the publication of over 120 papers in ISI journals with high impact factors and by scientific collaboration with leading research groups around the world. DMAE/WiComm conducts world-leading research on GPU computing in the area of the finite-difference time-domain (FDTD) method and the finite element method (FEM) in electromagnetics. At present, several research projects aim at using GPU acceleration in science; they involve about 10 researchers and students.

The goal of WiComm's GPU Research Center is to further strengthen the research potential at Gdansk University of Technology (GUT) in the area of GPU computing, improving its depth and broadening its scope, and eventually leading to wider adoption of CUDA-based architecture and hardware in computational prototyping and design automation in high-impact areas of electrical engineering, such as high-frequency electromagnetics. Over the last few years, GPU computing in Prof. Mrozowski's group has led to significant advances in numerical simulation of electromagnetic phenomena using CUDA technology. WiComm's staff and researchers have developed a GPU computing infrastructure to support these research activities, and the Department of Microwave and Antenna Engineering and several other departments of the Faculty of Electronics, Telecommunications and Informatics are in the process of further extending research and educational activities in GPU computing.

The computational electromagnetics and photonics group has many years of experience in high-performance computing and for the last five years has been engaged in a number of research and development activities that leverage CUDA GPU computing. The number and scope of the group's CUDA projects, the number of researchers involved in GPGPU computing, and the group's focus on exploiting GPU architectures have been growing steadily, as has the number of its CUDA-related papers published in journals of the highest standing. The following facts should be highlighted:

  • The DMAE and the Wireless Communication Engineering Center of Excellence (WiComm) form part of the ETI Faculty. The WiComm Center is considered to be among the top 100 research groups in Poland.

  • Professor Mrozowski, who is the head of WiComm and DMAE, is a renowned expert in the field of computational electromagnetics, and received the title of Fellow of IEEE for his research in this area.

  • Prof. Mrozowski's group was one of the first groups in the world to recognize the potential of GPUs in computations [7]. The first government-funded research project in this area, entitled "Mathematical modeling of photonic crystals and photonic fibers using macromodels and a customized mini-clusters with hardware acceleration", was carried out within DMAE/WiComm in the period 2007-2009. It was probably the first project devoted to GPU computing in Poland, and one of the first in the world focused on computational electromagnetics. The project resulted in Adam Dziekonski's MSc thesis and several papers, including "How to Render FDTD Computations More Effective Using a Graphics Accelerator" [7]. This paper is often cited by other researchers and its results are considered to be a benchmark for FDTD computations.

  • The group's CUDA-related research was recognized by the IEEE MTT Society with the 2009 fellowship awarded to Mr Adam Dziekonski, a graduate student of DMAE, for research on "GPU Computing with CUDA for Matrix-Based Computational Electromagnetics."

  • Research results were also recognized by NVIDIA. Early results in this area were posted on NVIDIA's web site by NVIDIA's Sumit Gupta, Sr. Product Manager - Tesla GPU Computing [9].

  • In December 2011 NVIDIA reviewed the group's NVIDIA Academic Partnership application and decided to support Prof. Mrozowski's work with the donation of two Tesla C2075 GPUs.

  • Since February 2012 the group's research has been featured on NVIDIA's research web page: http://research.nvidia.com/content/gdanskunivtech-crc-summary

  • In many aspects of GPU computing the group is a world leader (examples include CUDA-accelerated parallel sparse matrix iterative solvers with multilevel preconditioners for FEM).

  • The group developed a new sparse matrix storage format called Sliced ELLR-T, which delivers about 70 Gflops in complex arithmetic on a Fermi GPU with only a minimal memory footprint overhead over the CSR sparse matrix storage format. This is the best performance achieved so far on the Fermi architecture. The code for the new SpMV (sparse matrix-vector product) has been made publicly available for research and educational purposes and can be downloaded from http://mwave.eti.pg.gda.pl/~adziekonski/downloads.html

  • In the course of its research the group has effectively developed a testbench of challenging cases which are relevant for designers and manufacturers of high-frequency electromagnetic equipment, who are in turn important consumers of electronic design automation (EDA) software and high performance computing hardware.

  • There are three ongoing projects related to GPGPU computing, with 10 researchers involved. The group has also been involved in projects with other European universities.

  • The group has worked with a variety of NVIDIA hardware, including the GeForce 8600 GT, Quadro FX 5600, GTX 285, GTX 480, GTX 580, GTX 590 and Tesla C2075.
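
The storage-format item in the list above can be made more concrete with a small sketch. The Python code below is not the group's Sliced ELLR-T implementation; it is a simplified illustration, under our own assumptions, of the general idea behind sliced ELL-type layouts: rows are grouped into slices, each slice is padded only to the length of its own longest row, and a per-row length array lets the product skip the padding. The function names, the slice size and the tiny test matrix are hypothetical.

```python
def csr_to_sliced_ell(row_ptr, col_idx, values, slice_size):
    """Convert a CSR matrix to a simplified sliced ELL layout (illustrative only)."""
    n_rows = len(row_ptr) - 1
    row_len = [row_ptr[r + 1] - row_ptr[r] for r in range(n_rows)]
    slices = []
    for start in range(0, n_rows, slice_size):
        rows = range(start, min(start + slice_size, n_rows))
        width = max(row_len[r] for r in rows)  # pad only to this slice's longest row
        cols = [[0] * width for _ in rows]
        vals = [[0.0] * width for _ in rows]
        for i, r in enumerate(rows):
            for j in range(row_len[r]):
                cols[i][j] = col_idx[row_ptr[r] + j]
                vals[i][j] = values[row_ptr[r] + j]
        slices.append((start, width, cols, vals))
    return slices, row_len


def spmv_sliced_ell(slices, row_len, x):
    """y = A x using the sliced layout; row_len avoids touching padded entries."""
    y = [0.0] * len(row_len)
    for start, _width, cols, vals in slices:
        for i in range(len(cols)):
            r = start + i
            acc = 0.0
            for j in range(row_len[r]):
                acc += vals[i][j] * x[cols[i][j]]
            y[r] = acc
    return y


def spmv_csr(row_ptr, col_idx, values, x):
    """Reference y = A x directly from CSR."""
    y = []
    for r in range(len(row_ptr) - 1):
        y.append(sum(values[k] * x[col_idx[k]]
                     for k in range(row_ptr[r], row_ptr[r + 1])))
    return y


def stored_entries(slices):
    """Number of value slots allocated, padding included."""
    return sum(width * len(cols) for _start, width, cols, _vals in slices)


# Hypothetical 4x4 matrix with row lengths 3, 1, 1, 1 (6 nonzeros).
row_ptr = [0, 3, 4, 5, 6]
col_idx = [0, 1, 3, 1, 2, 3]
values = [4.0, 1.0, 2.0, 3.0, 5.0, 6.0]
x = [1.0, 2.0, 3.0, 4.0]

sliced, lengths = csr_to_sliced_ell(row_ptr, col_idx, values, slice_size=2)
plain_ell, _ = csr_to_sliced_ell(row_ptr, col_idx, values, slice_size=4)

print(spmv_sliced_ell(sliced, lengths, x))
print(stored_entries(sliced), stored_entries(plain_ell))
```

In this toy case a single slice covering all rows (plain ELL) allocates 12 value slots for 6 nonzeros, while slices of two rows allocate 8. On a GPU the padded, regular layout is what allows coalesced memory access; that aspect is not modelled in this CPU-only sketch.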

A substantial part of the group's research activities involves GPU computing [1]-[8], the higher-order finite element method (FEM) with curvilinear elements [1]-[6], and multilevel preconditioners for the fast solution of sparse systems of equations [1]-[3]. Finite element techniques are being developed in parallel with sparse matrix iterative solvers with multilevel preconditioners and GPU acceleration [1]-[6]. In this respect WiComm/DMAE is again the world leader.

On-going projects related to GPGPU CUDA computing

(I) Leader - Fast design of microwave filters and multiplexers with use of the 3D electromagnetic solvers for novel systems of the wireless communication, 2010-2013, Principal Investigator: Dr. Adam Lamecki - The project deals with the development of fast numerical algorithms for solving the time-harmonic Maxwell's equations using various HPC architectures, including clusters, multicore processors and GPUs. The focus is on a multilevel FEM with adaptive mesh refinement. So far, with a single GPU we have obtained a 4X speedup over a six-core Xeon CPU. The current focus is on multi-GPU acceleration and domain decomposition techniques, as well as on speeding up finite element matrix generation and assembly.

(II) EUREKA PROJECT 5071 MWAVE_CAD - Fast computer aided synthesis and design of microwave filters and multiplexers, 2010-2014, Principal Investigator: Prof. Michal Mrozowski - The goal of this project is to develop, together with a commercial partner (Mician GmbH from Germany), more efficient software for microwave CAD. GPUs are used here for solving sparse and dense linear systems as well as for accelerating the solution of eigenvalue problems.

(III) Homing Plus - Advanced Simulation Methods for Electromagnetic Exposure Assessment, 2011-2013, Principal Investigator: Dr. Tomasz Stefanski - The purpose of this project is to develop new techniques for large-scale electromagnetic simulation for exposure assessment. FDTD simulation is explored in the context of heterogeneous parallel processing on multi-core CPUs and GPUs.

The CUDA and OpenCL technologies are deployed in these projects to bring a new cross-disciplinary approach to research on electromagnetism and HPC.

In the period 2009-2011 our research was co-funded by the European Science Foundation within the framework of the international project COST IC 0605 ASSIST. WiComm's/DMAE's role in this project was to investigate the potential of multicore and GPU computing in the context of antenna design using numerical techniques such as multilevel FEM and the method of moments (MoM). Within this project GPU implementations of iterative and direct solvers for systems of equations were considered. In this research we used a GTX 285 [1] and a GTX 480 [2] to accelerate the solution of large sparse systems of equations involving over 1 million unknowns in real and complex arithmetic. To the best of our knowledge, papers [1]-[2] are the first papers on multilevel FEM solvers with GPU acceleration. The goal is to increase the throughput while at the same time reducing the memory requirements, so that very large complex or real systems can be processed in single and double precision on GPUs. We achieved a 3X memory footprint reduction using a new format for storing sparse matrices.

Research conducted by Dr. Stefanski at the University of Glasgow in the period 2006-2009 aimed at the development of a parallel alternating direction implicit finite-difference time-domain (ADI-FDTD) full-wave electromagnetic solver [10]-[12]. During that time the world's first parallel implementation of the ADI-FDTD method on a GPU was developed [12]. Other researchers refer to this result in the context of GPU implementations of implicit FDTD schemes.

Research conducted by Dr. Stefanski at the Swiss Federal Institute of Technology (ETH Zurich) in the period 2009-2011 aimed at the development of parallel electromagnetic solvers using OpenCL [13]-[14]. Two projects were accomplished during that time: "Solver Portable between Modern HPC Architectures" and "Acceleration of the FDTD method on the GPU cluster". An FDTD solver portable between GPU and CPU architectures was developed and further extended with the Message Passing Interface for execution on distributed-memory computer clusters (including GPU-accelerated ones). These research results were commercialized by the industrial partner and have been or soon will be introduced to the market.

Current research at WiComm's GPU Research Center

With an established record of leadership in CUDA research related to computational electromagnetics over the past 5 years, 10 research staff involved in CUDA computing, and a portfolio of three ongoing research projects, the GPU Research Center at WiComm/DMAE intends to continue providing innovative solutions that address the fast-growing computational needs of the crucial electromagnetic segment of the Electronic Design Automation market. To this end the group focuses its research on:

  • multi-GPU computing and algorithms requiring large memory resources;

  • algorithms requiring double and multiple precision arithmetic.

In particular, the work related to multiple-precision computations is intended for fast computation of wave propagation using the Discrete Green's Function in the time domain. The numerical implementation of this technique entails severe problems related to numerical precision because large binomial coefficients have to be calculated. Calculation of binomial coefficients involves evaluation of factorials of large numbers. At the moment symbolic algebra packages such as Mathematica or Maple are used for this purpose and the computations take many hours. Using multiple-precision arithmetic on a GPU should drastically reduce this time.
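
The precision problem with large binomial coefficients can be illustrated with a short sketch. It does not reproduce the group's Discrete Green's Function computation; it only contrasts an exact evaluation, using Python's built-in arbitrary-precision integers, with the same coefficient evaluated in IEEE double precision. The helper names and the choice n = 2000, k = 1000 are assumptions made for illustration.

```python
import math


def binomial_exact(n, k):
    """Exact binomial coefficient using arbitrary-precision integers."""
    return math.comb(n, k)


def binomial_double(n, k):
    """The same coefficient via the multiplicative formula in double precision."""
    result = 1.0
    for i in range(1, k + 1):
        # After step i, result holds C(n - k + i, i) as a float.
        result = result * (n - k + i) / i
    return result


n, k = 2000, 1000                      # hypothetical sizes, chosen to overflow doubles
exact = binomial_exact(n, k)
approx = binomial_double(n, k)

print("decimal digits in the exact value:", len(str(exact)))
print("double-precision result:", approx)
```

The exact value has far more decimal digits than a double can represent (the largest finite double is about 1.8e308), so the floating-point evaluation overflows to infinity. This is the kind of failure that motivates multiple-precision arithmetic.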

As far as multi-GPU and large-memory computations are concerned, the work is related to higher-order finite element methods in computational electromagnetics and photonics. One line of investigation will be GPU acceleration of sparse eigenvalue solvers. So far no papers have been published in this area, so it is believed that WiComm's GPU Research Center is a world leader here. In this type of problem, high throughput in double precision and large memory are essential. In computational electromagnetics one is usually interested in the eigenvalues with the smallest magnitude. For this purpose one applies Krylov space methods (such as Lanczos or Jacobi-Davidson) in the shift-and-invert mode. The Krylov space is created using vectors obtained by successive solutions of large ill-conditioned systems. Each vector has to be stored and all vectors should be orthogonal. These features indicate that eigenvalue problems are more challenging for GPU computing than simple sparse systems of linear equations, which have received considerable attention from the GPGPU community. If a GPU is to be used for accelerating the solution of large eigenvalue problems, double precision is necessary to maintain the orthogonality of the vectors spanning the Krylov space, and fast memory has to be large enough to store all these vectors. In this context other aspects of finite element methods are also being investigated, such as multi-GPU acceleration of iterative solvers, the domain decomposition approach to FEM, and the setting up of large FEM matrices.
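
The shift-and-invert idea can be sketched on a toy problem. The code below is a minimal illustration under stated assumptions, not the group's solver: it uses plain inverse iteration, the simplest shift-and-invert scheme, in place of Lanczos or Jacobi-Davidson, and a small 1D Laplacian matrix tridiag(-1, 2, -1) stands in for a large FEM matrix. Every step requires the solution of a linear system, which is the costly operation referred to above. All function names are hypothetical.

```python
import math


def apply_laplacian(v):
    """Return A v for the 1D Laplacian A = tridiag(-1, 2, -1)."""
    n = len(v)
    out = []
    for i in range(n):
        left = v[i - 1] if i > 0 else 0.0
        right = v[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * v[i] - left - right)
    return out


def solve_shifted(sigma, rhs):
    """Solve (A - sigma*I) x = rhs with the Thomas algorithm (no pivoting).

    Adequate for this demonstration with sigma = 0, where the matrix is
    symmetric positive definite.
    """
    n = len(rhs)
    diag = [2.0 - sigma] * n
    b = list(rhs)
    for i in range(1, n):                 # forward elimination
        m = -1.0 / diag[i - 1]            # sub-diagonal entry divided by the pivot
        diag[i] += m                      # diag[i] - m * (super-diagonal entry -1)
        b[i] -= m * b[i - 1]
    x = [0.0] * n
    x[-1] = b[-1] / diag[-1]
    for i in range(n - 2, -1, -1):        # back substitution
        x[i] = (b[i] + x[i + 1]) / diag[i]
    return x


def smallest_eigenvalue(n, sigma=0.0, iterations=60):
    """Inverse iteration: repeatedly apply (A - sigma*I)^-1 and normalize."""
    v = [1.0] * n
    for _ in range(iterations):
        w = solve_shifted(sigma, v)       # one linear solve per iteration
        norm = math.sqrt(sum(t * t for t in w))
        v = [t / norm for t in w]
    av = apply_laplacian(v)
    return sum(a * b for a, b in zip(v, av))   # Rayleigh quotient of unit vector v


n = 20
estimate = smallest_eigenvalue(n)
# Known closed form for this model matrix: 2 - 2*cos(pi / (n + 1)).
reference = 2.0 - 2.0 * math.cos(math.pi / (n + 1))
print(estimate, reference)
```

A Krylov method would keep all the intermediate vectors and orthogonalize them against each other instead of discarding them. That is where the storage and double-precision requirements described above come from.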

The second focus is a GPU implementation of model order reduction (MOR) algorithms. MOR is a method used in the analysis of dynamical systems represented in state-space form (as a system of many time-domain ordinary differential equations, or differential algebraic equations). In the case of computational electromagnetics the system dynamics are represented by partial differential equations which are discretized by means of FEM. MOR allows one to compress this system significantly but requires the generation of an orthogonal projection basis. Each vector of this basis is found by solving a large system of equations. As a result, MOR leads to challenges similar to those of generalized eigenvalue problems, because the projection basis has to be orthogonal (which calls for double precision) and both the basis and the matrix have to be stored. With the number of unknowns exceeding several million, the matrices and the set of basis vectors are too big to fit into the on-board RAM of a commodity GPU.
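
A back-of-the-envelope estimate shows why storage becomes the bottleneck. All numbers in the sketch below are assumptions chosen for illustration and do not come from the group's projects: 5 million unknowns, 40 nonzeros per matrix row, 50 basis vectors, complex double precision values (16 bytes each), 32-bit CSR indices, and a GPU with 6 GiB of on-board memory.

```python
def csr_bytes(n_rows, nnz_per_row, value_bytes, index_bytes=4):
    """Bytes needed for a CSR matrix: values, column indices and row pointers."""
    nnz = n_rows * nnz_per_row
    return nnz * (value_bytes + index_bytes) + (n_rows + 1) * index_bytes


def basis_bytes(n_rows, n_vectors, value_bytes):
    """Bytes needed to keep a dense set of basis vectors."""
    return n_rows * n_vectors * value_bytes


N_UNKNOWNS = 5_000_000        # assumed problem size
NNZ_PER_ROW = 40              # assumed average number of nonzeros per row
N_BASIS = 50                  # assumed number of projection basis vectors
COMPLEX_DOUBLE = 16           # bytes per complex double precision value
GPU_MEMORY = 6 * 1024 ** 3    # assumed 6 GiB of on-board memory

matrix = csr_bytes(N_UNKNOWNS, NNZ_PER_ROW, COMPLEX_DOUBLE)
basis = basis_bytes(N_UNKNOWNS, N_BASIS, COMPLEX_DOUBLE)
total = matrix + basis
fits = total <= GPU_MEMORY

print(f"matrix: {matrix / 1e9:.2f} GB, basis: {basis / 1e9:.2f} GB, fits: {fits}")
```

Under these assumptions the matrix and the basis each need about 4 GB, so together they exceed the assumed card's memory. This is why multi-GPU and memory-efficient storage schemes are of interest.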

References

[1] A. Dziekonski, A. Lamecki, M. Mrozowski, "GPU Acceleration of Multilevel Solvers for Analysis of Microwave Components With Finite Element Method," IEEE MWC Lett., vol.21, no.1, pp.1-3, Jan. 2011

[2] A. Dziekonski, A. Lamecki, M. Mrozowski, "Tuning a Hybrid GPU-CPU V-Cycle Multilevel Preconditioner for Solving Large Real and Complex Systems of FEM Equations," IEEE Antennas and Wireless Propagation Letters, vol.10, pp.619-622, 2011

[3] A. Dziekonski, A. Lamecki, M. Mrozowski, "A Memory Efficient and Fast Sparse Matrix Vector Product on a GPU", Progress In Electromagnetics Research, vol. 116, pp. 49-63, 2011

[4] A. Dziekonski, A. Lamecki, M. Mrozowski, "Jacobi and Gauss-Seidel preconditioned complex conjugate gradient method with GPU acceleration for finite element method," 2010 European Microwave Conference (EuMC), pp.1305-1308, 28-30 Sept. 2010

[5] A. Dziekonski, M. Mrozowski, "Tuning matrix-vector multiplication on GPU," 2nd International Conference on Information Technology (ICIT), pp.167-170, 28-30 June 2010

[6] A. Dziekonski, M. Mrozowski, "Krylov space iterative solvers on graphics processing units," 18th International Conference on Microwave Radar and Wireless Communications (MIKON), pp.1-4, 14-16 June 2010

[7] P. Sypek, A. Dziekonski, M. Mrozowski, "How to Render FDTD Computations More Effective Using a Graphics Accelerator," IEEE Trans. Magnetics, vol.45, no.3, pp.1324-1327, March 2009

[8] A. Dziekonski, P. Sypek, L. Kulas, M. Mrozowski, "Implementation of matrix-type FDTD algorithm on a graphics accelerator," 17th International Conference on Microwaves, Radar and Wireless Communications, pp.1-4, 19-21 May 2008

[9] http://www.nvidia.com/object/cee.html

[10] T. P. Stefanski, T. D. Drysdale, "Parallel Implementation of the ADI-FDTD Method", Microwave and Optical Technology Letters, Wiley, vol. 51, pp. 1298-1304, May 2009

[11] T. P. Stefanski, T. D. Drysdale, "Parallel ADI-BOR-FDTD Algorithm", IEEE Microwave and Wireless Components Letters, vol. 18, pp. 722-724, November 2008

[12] T. P. Stefanski, T. D. Drysdale, "Acceleration of the ADI-FDTD Method Using Graphics Processor Units", IEEE International Microwave Symposium, Boston, MA, pp. 241-244, 7-12 June 2009

[13] J. I. Toivanen, T. P. Stefanski, N. Kuster, N. Chavannes, "Comparison of CPML Implementations for the GPU-Accelerated FDTD Solver", PIER M, vol. 19, pp. 61-75, 2011

[14] T. P. Stefanski, N. Chavannes, N. Kuster, "Multi-GPU Accelerated Finite-Difference Time-Domain Solver in Open Computing Language", PIERS Online, vol. 7, no. 1, pp. 71-74, 2011

[15] T. P. Stefanski, N. Chavannes, N. Kuster, "Multi-GPU Accelerated Finite-Difference Time-Domain Solver in Open Computing Language", Proceeding of PIERS, Marrakesh, 20-23 March 2011

Prof. Michal Mrozowski received the M.Sc. degree in Radiocommunication Engineering and the PhD in Electronic Engineering, both with first class honors, from the Gdansk University of Technology in 1983 and 1990, respectively. In 1986 he joined the Department of Electronics, Gdansk University of Technology, where he is now a Full Professor, Vice-Dean for Research, Head of the Department of Microwave and Antenna Engineering and the Director of the Center of Excellence for Wireless Communication Engineering. His research interests are concerned with computational electromagnetics and photonics. His current work is focused on the development of new fast numerical techniques for solving 2D and 3D boundary value problems in the time and frequency domains using multicore architectures and GPUs, automated microwave filter design, reduced-order models for grid-based numerical techniques (e.g. FDTD and FEM), surrogate model construction and SPICE model generation. Prof. M. Mrozowski is a Fellow of IEEE, a member of the MTT-1 (CAD) and MTT-15 (Field Theory) Technical Committees, and a Fellow of the Electromagnetics Academy. Prof. Mrozowski is a past chairman of the Polish AES/AP/MTT Chapter and in 2004-2005 he served as Associate Editor for IEEE Microwave and Wireless Components Letters. He has published one book and over 70 peer-reviewed papers in IEEE journals. He has developed several modules that were then integrated into commercial microwave CAD software used all over the world.

Copyright by Wireless Communication Engineering (WiComm) Center of Excellence.