11th European Conference on Turbomachinery Fluid Dynamics & Thermodynamics
This paper presents the serial optimization and the parallelization of the TRAF code, a 3D multi-row, multi-block CFD solver for the RANS/URANS equations. The serial optimization was carried out through a critical review of the most time-consuming routines, in order to exploit the vectorization capability of modern CPUs while preserving code accuracy. The code was parallelized for both distributed and shared memory systems, following the current trend in computing clusters. Performance was assessed on several architectures, ranging from simple multi-core PCs to a small slow-network cluster and high performance computing (HPC) clusters. Code performance is presented and discussed for pure MPI, pure OpenMP, and hybrid OpenMP-MPI parallelism, considering two turbomachinery applications: a steady-state multi-row compressor analysis and an unsteady computation of a low pressure turbine (LPT) module. Notably, the paper provides code developers with relevant guidelines for selecting a parallelization strategy, without requiring a specific background in parallelization or HPC.