Query Optimization


• Query optimization in parallel databases is significantly more complex
than query optimization in sequential databases.
• Cost models are more complicated, since we must take into account
partitioning costs and issues such as skew and resource contention.
• When scheduling an execution tree in a parallel system, we must decide:
  ◦ How to parallelize each operation and how many processors to use for it.
  ◦ Which operations to pipeline, which to execute independently in
    parallel, and which to execute sequentially, one after the other.
• Determining the amount of resources to allocate to each operation is a
problem.
  ◦ E.g., allocating more processors than optimal can result in high
    communication overhead.
• Long pipelines should be avoided, as the final operation may wait a long
time for inputs while holding precious resources.
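The point about allocating too many processors can be illustrated with a toy cost model. This is only a sketch: the function names, the linear overhead term, and the constants are assumptions made for illustration, not a cost model taken from any real optimizer.

```python
def parallel_cost(work, n, startup_per_proc=1.0, comm_per_proc=0.5, skew=1.0):
    """Toy estimate of elapsed time for one operation run on n processors.

    work / n is the ideal per-processor share of the work.
    skew >= 1 inflates that share to model the slowest (largest) partition.
    The last term grows with n and stands in for startup and communication
    overhead. All constants are hypothetical.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return skew * work / n + (startup_per_proc + comm_per_proc) * n


def best_degree(work, max_procs, **kwargs):
    """Return the processor count in 1..max_procs with the lowest toy cost."""
    return min(range(1, max_procs + 1),
               key=lambda n: parallel_cost(work, n, **kwargs))


if __name__ == "__main__":
    work = 600
    n_opt = best_degree(work, max_procs=64)
    print("best degree of parallelism:", n_opt)
    print("cost at best degree:", parallel_cost(work, n_opt))
    print("cost with all 64 processors:", parallel_cost(work, 64))
```

Under these assumed constants the estimated time first falls as processors are added and then rises again once the overhead term dominates, so using all available processors is slower than using the optimal number.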

• The number of parallel evaluation plans from which to choose is
much larger than the number of sequential evaluation plans.
  ◦ Therefore, heuristics are needed during optimization.
• Two alternative heuristics for choosing parallel plans:
  ◦ No pipelining and no inter-operation parallelism; just parallelize every
    operation across all processors.
    ▪ Finding the best plan is now much easier: use a standard optimization
      technique, but with a new cost model.
    ▪ The Volcano parallel database popularized the exchange-operator model:
      – an exchange operator is introduced into query plans to partition and
        distribute tuples
      – each operation works independently on local data on each
        processor, in parallel with other copies of the operation
  ◦ First choose the most efficient sequential plan, and then choose how best
    to parallelize the operations in that plan.
    ▪ Can explore pipelined parallelism as an option.
• Choosing a good physical organization (partitioning technique) is
important to speed up queries.
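The exchange-operator idea described above can be sketched in a few lines. This is a single-process simulation under stated assumptions, not Volcano's actual implementation: the names `exchange`, `local_count`, and `parallel_group_count` are hypothetical, hash partitioning is just one possible partitioning technique, and the per-processor copies run one after another here, whereas a real system would run them concurrently.

```python
import zlib
from collections import Counter


def exchange(tuples, key, n_procs):
    """Hash-partition tuples on key(t) into n_procs per-processor lists."""
    partitions = [[] for _ in range(n_procs)]
    for t in tuples:
        # crc32 is used only because it is stable across runs;
        # any hash function on the partitioning key would do.
        h = zlib.crc32(repr(key(t)).encode("utf-8"))
        partitions[h % n_procs].append(t)
    return partitions


def local_count(partition, key):
    """The operator copy on one processor: count tuples per key, local data only."""
    return Counter(key(t) for t in partition)


def parallel_group_count(tuples, key, n_procs):
    """Exchange, run the same operator on every partition, then union the results."""
    result = {}
    for part in exchange(tuples, key, n_procs):
        # Partitions hold disjoint key sets, so update() never overwrites a count.
        result.update(local_count(part, key))
    return result


if __name__ == "__main__":
    orders = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 1), ("bob", 2)]
    print(parallel_group_count(orders, key=lambda t: t[0], n_procs=3))
```

Because the exchange step sends all tuples with the same key to the same partition, the grouping operator itself needs no knowledge of parallelism. The choice of partitioning key is what makes the local results correct without further merging.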
