3-D data partitioning for 3-level perfectly nested loops on heterogeneous distributed systems
Abstract
Nested loops are the largest source of parallelism in many data-parallel scientific applications. Heterogeneous distributed systems are popular computing platforms for data-parallel applications. Data partitioning is critical in exploiting the computational power of such systems, and existing data partitioning algorithms try to maximize performance of data-parallel applications by finding a data distribution that balances the workload between the processing nodes while minimizing communication costs. This paper addresses the problem of 3-dimensional data partitioning for 3-level perfectly nested loops on heterogeneous distributed systems. The primary aim is to minimize the execution time by improving the load balancing and minimizing the internode communications. We propose a new data partitioning algorithm using dynamic programming, build a theoretical model to estimate the execution time of each partition, and select a partition with minimum execution time as a near-optimal solution. We demonstrate the effectiveness of the new algorithm for 2 data-parallel scientific applications on heterogeneous distributed systems. The new algorithm reduces the execution time by between 7% and 17%, on average, compared with leading data partitioning methods on 3 heterogeneous distributed systems. Copyright © 2016 John Wiley & Sons, Ltd.