Parallel Sort


Parallel Sort
Range-Partitioning Sort
􀂄 Choose processors P0, ..., Pm, where m ≤ n -1 to do sorting.
􀂄 Create range-partition vector with m entries, on the sorting attributes
􀂄 Redistribute the relation using range partitioning
􀃌 all tuples that lie in the ith range are sent to processor Pi
􀃌 Pi stores the tuples it received temporarily on disk Di.
􀃌 This step requires I/O and communication overhead.
􀂄 Each processor Pi sorts its partition of the relation locally.
􀂄 Each processors executes same operation (sort) in parallel with other
processors, without any interaction with the others (data parallelism).
􀂄 Final merge operation is trivial: range-partitioning ensures that, for 1 j
m, the key values in processor Pi are all less than the key values in Pj.

Parallel Sort (Cont.)
Parallel External Sort-Merge
􀂄 Assume the relation has already been partitioned among disks
D0, ..., Dn-1 (in whatever manner).
􀂄 Each processor Pi locally sorts the data on disk Di.
􀂄 The sorted runs on each processor are then merged to get the
final sorted output.
􀂄 Parallelize the merging of sorted runs as follows:
􀃌 The sorted partitions at each processor Pi are range-partitioned
across the processors P0, ..., Pm-1.
􀃌 Each processor Pi performs a merge on the streams as they are
received, to get a single sorted run.
􀃌 The sorted runs on processors P0,..., Pm-1 are concatenated to get
the final result.

Comments

Popular posts from this blog

Handling of Skew

Fragment-and-Replicate Join

USER INTERFACE DESIGN FOR ANNA UNIVERSITY SYLLABUS