-The HECToR upgrade to phase 2b (world's first production Cray XE6) presented a unique opportunity to demonstrate the benefit of shared-memory programming in real applications. The 24-core nodes were introduced to HECToR several months before the arrival of new Gemini interconnect. During the transitional period, communication intensive applications often produced more network traffic than the old SeaStar interconnect could handle. Fig.2 shows the benchmark of 2DECOMP&FFT's FFT interface with a 2592^3 problem size<a href="#note2" id="note2ref"><sup>2</sup></a>. With the slow SeaStar interconnect, the scaling is very poor when using more than several thousands of cores. However, switching on the shared-memory code significantly improved the code performance (sometimes by more than 40%) and a parallel efficiency of more than 90% was observed through out the scale. The new Gemini interconnect offers significant improvement in terms of both network bandwidth and latency. As a result, significant performance gain is to be expected for communication intensive code: the FFT benchmark was almost twice as fast in some cases. However, the shared-memory code on Gemini (not shown in the figure) offered absolutely no benefit when the network was fast enough to handle all the messages efficiently.
0 commit comments