Abstract:
Distributed memory auto-parallelization is essential for exploiting high-performance
computers. However, prior work on distributed memory auto-parallelization suffers from
excessive communication overhead, which degrades the performance of the parallelized
program. This overhead arises from unnecessary, large-volume, and frequent data
communication among processors.
This paper presents a new communication model that addresses the communication overhead
problem. The model combines several techniques to minimize communication overhead.
In particular, we designed a technique based on data dependence, data-flow, and symbolic
range analysis to precisely determine the set of data to be communicated. We implemented
a prototype of the proposed communication model in Cetus, a widely used parallelizing
compiler infrastructure.
We evaluated the prototype against a state-of-the-art approach using a set of PolyBench
benchmarks on a 5-node Raspberry Pi cluster. The experiments are organized into three
groups that evaluate communication overhead, the correlation between communication
overhead and problem size, and accuracy. The results show that the proposed communication
model reduces communication overhead by an average of 48% compared to the
state-of-the-art approach, with 100% accuracy. Analysis of the correlation between
communication overhead and problem size indicates that the overhead of the proposed
communication model exhibits a strong negative correlation with problem size relative to
the state-of-the-art approach. Overall, the experimental results indicate that the
proposed communication model mitigates the communication overhead problem in distributed
memory auto-parallelization.