Similarities and Differences Between Parallel Systems and Distributed Systems Part 1
This post is part 1 of a series extracted from a technical report I wrote for a research study at Indiana University. The full paper can be found on the Digital Science Center publications page, or you can get the PDF link here. I thought of posting the content here so that more people can access it, as I think it might be helpful to some. I am breaking this down into several posts because of the length of the document. The content is based on my knowledge and the technologies current as of the day I wrote the report, and I may not have got everything correct. If you think something is wrong, please point it out in the comments; I will try to respond and update the document if needed. I will also include the list of references in each post for completeness.
This post gives the introduction and compares and contrasts the two domains with regard to the following:
- Fault tolerance
- Support of collectives
- Dynamic resources utilization
- Communication protocols
- Level of abstraction
Introduction
In order to identify similarities and differences between parallel systems and distributed systems, comparisons were made of the features and functionality of MPI, Charm++ and HPX (parallel) and Spark and Flink (distributed) frameworks. This report lists important features and functionality of such frameworks and discusses each of them. This is by no means a complete list; there may be many other important features that could be compared and discussed.
Fault tolerance
One of the main advantages that distributed frameworks such as Spark and Flink have over parallel frameworks such as MPI is inherent support for fault tolerance. Most distributed frameworks are designed with fault tolerance as one of their main objectives, which is why it is supported from the most basic level of abstraction. For example, RDDs [7], Spark's basic abstraction layer, support fault tolerance.
On the other hand, parallel frameworks like MPI do not provide inherent support for fault tolerance. Many projects have worked on adding fault tolerance support to parallel frameworks, but none of them seems as seamless as in distributed frameworks, where the algorithm developer does not have to consider anything regarding fault tolerance when implementing an algorithm. Most of the fault tolerance support created around MPI is based on the checkpoint/restart model; run-through stabilization [1] is another fault tolerance method, proposed by the MPI Fault Tolerance Working Group. Charm++ has double checkpoint-based fault tolerance support [2], which performs checkpointing in memory.
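The checkpoint/restart model mentioned above can be illustrated with a minimal, framework-agnostic sketch in plain Python. The file name and function names here are my own invention, not part of any real framework; a real MPI checkpointing library would coordinate this across all ranks.

```python
import json
import os

CHECKPOINT_FILE = "state.ckpt"  # hypothetical checkpoint location

def save_checkpoint(iteration, state):
    """Persist the iteration counter and solver state to stable storage."""
    with open(CHECKPOINT_FILE, "w") as f:
        json.dump({"iteration": iteration, "state": state}, f)

def load_checkpoint():
    """Resume from the last checkpoint, or start fresh if none exists."""
    if os.path.exists(CHECKPOINT_FILE):
        with open(CHECKPOINT_FILE) as f:
            data = json.load(f)
        return data["iteration"], data["state"]
    return 0, 0.0

def run(total_iterations=10, checkpoint_every=3):
    """Iterative computation that periodically checkpoints its progress."""
    iteration, state = load_checkpoint()
    while iteration < total_iterations:
        state += 1.0  # stand-in for one step of real computation
        iteration += 1
        if iteration % checkpoint_every == 0:
            save_checkpoint(iteration, state)
    return state
```

If the process dies mid-run, restarting it calls `load_checkpoint` and repeats only the work done since the last checkpoint, rather than starting from iteration zero.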
Support of collectives
Parallel frameworks provide a richer set of collective operations than distributed frameworks. Distributed frameworks mainly support broadcast, scatter, gather and reduce operations, while parallel frameworks provide a much richer set of collectives, such as allReduce and allGather, implemented in a highly efficient manner in frameworks such as MPI and HPX. Frameworks such as Spark and Flink have very limited communication between worker nodes other than broadcast variables and accumulators, although some frameworks, such as Harp, do provide such collectives on top of Hadoop. Because of the absence of such collectives, it is not possible to program algorithms in the BSP (Bulk Synchronous Parallel) model, which is a very efficient model for some algorithm implementations.
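To sketch what allReduce and allGather provide, the snippet below simulates their semantics in plain Python over a list of per-rank values (no real communication happens; in MPI these would be `MPI_Allreduce` and `MPI_Allgather` calls). The key point is that every rank, not just a root, ends up with the combined result.

```python
from functools import reduce as fold

def all_reduce(rank_values, op):
    """Simulated allReduce: every rank receives the fully reduced value.
    A plain reduce would leave the result only on the root rank."""
    combined = fold(op, rank_values)
    return [combined for _ in rank_values]

def all_gather(rank_values):
    """Simulated allGather: every rank receives the full list of values."""
    return [list(rank_values) for _ in rank_values]

# Four simulated ranks, each holding one partial sum
partials = [1, 2, 3, 4]
print(all_reduce(partials, lambda a, b: a + b))  # every rank sees 10
print(all_gather(partials))                      # every rank sees [1, 2, 3, 4]
```

This "everyone gets the result" property is what makes iterative BSP-style algorithms convenient: each superstep can end with an allReduce, after which all ranks hold the same state and proceed independently.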
Dynamic resources utilization
Another major area of concern for parallel and distributed frameworks is dynamic resource utilization. It is important that frameworks are able to scale the number of resources they use up and down with the amount of work to be done. Distributed systems are well suited to this and are able to scale dynamically with the workload and the amount of resources (number of nodes) available. Parallel frameworks such as HPX and Charm++ also support dynamic resource allocation, but available MPI implementations do not provide such functionality.
Communication protocols
Parallel systems such as MPI, HPX and Charm++ support high-end communication protocols such as InfiniBand and Gemini in addition to Ethernet, which gives parallel frameworks the ability to extract high performance from the network. Distributed systems such as Spark and Flink, on the other hand, only support Ethernet by default, although several research projects work on providing such support to distributed frameworks [3].
Level of abstraction
Another important aspect of distributed and parallel frameworks is the level of abstraction provided to the user (the algorithm developer). In this regard, distributed systems provide a higher level of abstraction: users can think of data structures such as arrays as distributed arrays. Parallel frameworks such as MPI provide a lower level of abstraction, although frameworks such as Charm++ and HPX do provide some higher-level constructs. It is also noteworthy that MPI-IO and MPI collectives can be considered higher-level abstractions.
Higher-level and lower-level abstractions both have pros and cons. Higher-level abstractions allow the user to develop algorithms without having to worry about low-level details, which makes algorithm development easier. With lower-level abstractions the user has more control over the system and can extract more performance from it.
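The contrast between the two levels can be sketched in plain Python (a single process standing in for a cluster; the slicing scheme and helper names are illustrative, not any framework's actual API). The high-level version describes *what* to compute; the low-level version makes the programmer decide *how* work is split across ranks.

```python
from functools import reduce as fold

data = list(range(8))

# High-level, distributed-framework style: declare the computation and let
# the framework handle partitioning. Spark's equivalent would be
# rdd.map(lambda x: x * x).reduce(lambda a, b: a + b).
high_level_sum = fold(lambda a, b: a + b, map(lambda x: x * x, data))

# Low-level, MPI style: the programmer explicitly partitions the data
# and combines the per-rank partial results.
def worker(rank, num_ranks, data):
    """Each rank squares and sums only its own strided slice of the data."""
    my_slice = data[rank::num_ranks]
    return sum(x * x for x in my_slice)

num_ranks = 4
partials = [worker(r, num_ranks, data) for r in range(num_ranks)]
low_level_sum = sum(partials)  # stand-in for a reduce across ranks

print(high_level_sum, low_level_sum)  # both compute the same sum of squares
```

The low-level version is more code, but the explicit control over data placement is exactly what lets an expert tune layout and communication for performance.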
References
[1] Hursey, Joshua, et al. "Run-through stabilization: An MPI proposal for process fault tolerance." European MPI Users' Group Meeting. Springer Berlin Heidelberg, 2011.
[2] Parallel Programming Laboratory: Fault Tolerance Support, http://charm.cs.uiuc.edu/research/ft, accessed Nov 20 2016.
[3] High-Performance Big Data (HiBD), http://hibd.cse.ohio-state.edu/, accessed Nov 14 2016.
[4] Fox, Geoffrey C., et al. "Towards an understanding of facets and exemplars of big data applications." Proceedings of the 20 Years of Beowulf Workshop on Honor of Thomas Sterling's 65th Birthday. ACM, 2014.
[5] Tikotekar, Anand, Chokchai Leangsuksun, and Stephen L. Scott. "On the survivability of standard MPI applications." LCI International Conference on Linux Clusters: The HPC Revolution. 2006.
[6] Bronevetsky, Greg, et al. "Automated application-level checkpointing of MPI programs." ACM Sigplan Notices. Vol. 38. No. 10. ACM, 2003.
[7] Zaharia, Matei, et al. "Resilient distributed datasets: A fault-tolerant abstraction for in-memory cluster computing." Proceedings of the 9th USENIX conference on Networked Systems Design and Implementation. USENIX Association, 2012.
[8] Elnozahy, Elmootazbellah Nabil, et al. "A survey of rollback-recovery protocols in message-passing systems." ACM Computing Surveys (CSUR)34.3 (2002): 375-408.
[9] Carbone, Paris, et al. "Apache flink: Stream and batch processing in a single engine." Data Engineering (2015): 28.
[10] Hursey, Joshua, et al. "Run-through stabilization: An MPI proposal for process fault tolerance." European MPI Users' Group Meeting. Springer Berlin Heidelberg, 2011.
[11] Reyes-Ortiz, Jorge L., Luca Oneto, and Davide Anguita. "Big data analytics in the cloud: Spark on hadoop vs mpi/openmp on beowulf." Procedia Computer Science 53 (2015): 121-130.
[12] Jha, Shantenu, et al. "Introducing distributed dynamic data-intensive (d3) science: Understanding applications and infrastructure." (2014).
[13] Jha, Shantenu, et al. "Distributed computing practice for large‐scale science and engineering applications." Concurrency and Computation: Practice and Experience 25.11 (2013): 1559-1585.