SystemC simulation on GP-GPUs: CUDA vs. OpenCL
BOMBIERI, Nicola; VINCO, Sara
2012-01-01
Abstract
SystemC is a widespread language for developing SoC designs. Unfortunately, most SystemC simulators are based on a strictly sequential scheduler that heavily limits their performance, impacting verification schedules and time-to-market of new designs. Parallelizing SystemC simulation entails a complete re-design of the simulator kernel for the specific target parallel architectures. This paper proposes an automatic methodology to generate a parallel SystemC simulator kernel, exploiting the massive parallelism of GP-GPU architectures. Our solution leverages static scheduling to reduce synchronization overheads. The generated simulator code targets both CUDA and OpenCL libraries, to boost scalability and provide support for multiple GP-GPU architectures. Finally, the paper compares the performance of our solution on CUDA vs. OpenCL platforms, with the goal of investigating advantages and drawbacks that the two thread management libraries offer to concurrent SystemC simulation.