A closed-cycle gasoline compression ignition engine simulation near top dead center (TDC) was used to profile the performance of a parallel commercial engine computational fluid dynamics code, as it was scaled on up to 4096 cores of an IBM Blue Gene/Q supercomputer. The test case has 9 million cells near TDC, with a fixed mesh size of 0.15 mm, and was run on configurations ranging from 128 to 4096 cores. Profiling was done for a small duration of 0.11 crank angle degrees near TDC during ignition. Optimization of input/output performance resulted in a significant speedup in reading restart files, and in an over 100-times speedup in writing restart files and files for post-processing. Improvements to communication resulted in a 1400-times speedup in the mesh load balancing operation during initialization, on 4096 cores. An improved, “stiffness-based” algorithm for load balancing chemical kinetics calculations was developed, which results in an over 3-times faster run-time near ignition on 4096 cores relative to the original load balancing scheme. With this improvement to load balancing, the code achieves over 78% scaling efficiency on 2048 cores, and over 65% scaling efficiency on 4096 cores, relative to 256 cores.

This content is only available via PDF.
You do not currently have access to this content.