High Performance Computation Of Compressible Flows On The Cray X1
Free (open access)
S. Tu, S. Aliabadi, A. Johnson & M. Watts
This paper reports our development of a fully parallelized and vectorized finite volume code for compressible flows on the Cray X1 parallel/vector/multi-streaming architecture. Our code is based on the Jacobian-free GMRES iterative solver with a matrix-free Lower-Upper Symmetric Gauss-Seidel (LU-SGS) preconditioner. Two main vectorization techniques, the face coloring algorithm and truncated Neumann expansions of the inverse of the preconditioning matrices, are used to vectorize the long face loops and the LU-SGS preconditioner, respectively. Several numerical examples are provided to demonstrate the performance of the new vectorized code on the Cray X1 compared with the old nonvectorized code on the Cray T3E-1200.

Keywords: compressible flows, finite volume method, GMRES, LU-SGS, preconditioning, high performance computing, Cray X1, parallel, vectorization.

1 Introduction

Parallel computing, which uses multiple processors simultaneously, enables computational scientists to solve larger problems in less time. If the parallel computer is also equipped with multi-streaming and vector processors, the simulation and analysis of large problems is further accelerated. The Cray X1 is such a supercomputer, built using a hybrid parallel, vector, and multi-streaming design. The basic component of the Cray X1 is the Multi-Streaming Processor (MSP), which is a multi-module chip composed of four Single-Streaming Processor (SSP) modules and four cache modules. The MSP is the user-addressable processing element, and the division of work (i.e. streaming) among the four SSP modules is directed by the compiler (however, the user does have
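As an illustration of the face coloring idea named in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes each interior face is stored as a hypothetical `(left_cell, right_cell)` index pair and uses a simple greedy coloring, so that no two faces of the same color touch the same cell. Under that assumption the flux accumulation within one color has no write conflicts, which is the property that makes a long face loop a candidate for vectorization. The function names and the sign convention (flux leaves the left cell and enters the right cell) are illustrative choices.

```python
def color_faces(faces):
    """Greedy face coloring: faces that share a cell get different colors.

    `faces` is a list of (left_cell, right_cell) index pairs (assumed layout).
    Returns a list of color groups, each a list of face indices.
    """
    groups = []      # groups[c] holds the face indices assigned color c
    cells_used = []  # cells_used[c] holds the cells touched by color c
    for f, (left, right) in enumerate(faces):
        for c, used in enumerate(cells_used):
            if left not in used and right not in used:
                break  # color c is free of conflicts for this face
        else:
            # No existing color fits, so open a new one.
            c = len(groups)
            groups.append([])
            cells_used.append(set())
        groups[c].append(f)
        cells_used[c].update((left, right))
    return groups


def accumulate_residual(faces, flux, n_cells):
    """Scatter face fluxes to cell residuals one color group at a time."""
    residual = [0.0] * n_cells
    for group in color_faces(faces):
        # Within one color no two faces touch the same cell, so the
        # updates below are independent of each other.
        for f in group:
            left, right = faces[f]
            residual[left] -= flux[f]   # assumed sign: flux leaves left cell
            residual[right] += flux[f]  # and enters the right cell
    return residual


# Example: a 1D chain of four cells joined by three faces.
faces = [(0, 1), (1, 2), (2, 3)]
groups = color_faces(faces)
residual = accumulate_residual(faces, [1.0, 2.0, 3.0], 4)
```

In this small example faces 0 and 2 share no cell and end up in the same color, while face 1 needs its own. A production code would typically also balance the group sizes so that each vectorized loop is long enough to be efficient.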