[Lecture Notes] Query Execution & Processing

Devin Z
Feb 13, 2023

Lecture 6 of CMU 15-721 Advanced Databases (Spring 2023).

Milpitas, January 22, 2023

The Video Lecture¹

  • Andy Pavlo’s top 3 optimizations:
    - Data parallelization (vectorization)
    - Task parallelization (multi-threading)
    - Code specialization (compilation)
  • Optimization goals:
    - Reduce instruction counts.
    - Reduce cycles per instruction.
    - Parallelize execution.
  • The vectorization model is a middle ground between two extremes:
    - The iterator/volcano model, in which a child operator emits one tuple at a time.
    - The materialization model, in which a child operator emits all of its tuples at once.
  • The materialization model is better for OLTP workloads because intermediate results are typically small.
  • Most OLAP systems use the vectorization model because:
    - It reduces invocations per operator.
    - It allows for SIMD.
  • Two plan processing directions:
    - Top-down from root (the pull model).
    - Bottom-up from leaves (the push model).
  • Two approaches to parallel execution:
    - Inter-query parallelism
    - Intra-query parallelism
  • Two forms of intra-query parallelism:
    - Intra-operator parallelism (horizontal)
    - Inter-operator/pipelined parallelism (vertical)
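The contrast between the iterator and vectorization models can be sketched in a few lines (a minimal illustration; the class names and the `next()` convention here are assumptions for exposition, not any real engine's API):

```python
# Iterator/volcano model: each next() call produces one tuple,
# so interpretation overhead is paid per tuple.
class IterScan:
    def __init__(self, rows):
        self.rows = iter(rows)
    def next(self):
        return next(self.rows, None)

class IterFilter:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next(self):
        while (t := self.child.next()) is not None:
            if self.pred(t):
                return t
        return None

# Vectorization model: each next() call produces a batch of tuples,
# amortizing the per-call overhead and enabling tight inner loops.
class VecScan:
    def __init__(self, rows, batch=1024):
        self.rows, self.batch, self.pos = rows, batch, 0
    def next(self):
        chunk = self.rows[self.pos:self.pos + self.batch]
        self.pos += self.batch
        return chunk or None

class VecFilter:
    def __init__(self, child, pred):
        self.child, self.pred = child, pred
    def next(self):
        chunk = self.child.next()
        if chunk is None:
            return None
        return [t for t in chunk if self.pred(t)]

rows = list(range(10))

it = IterFilter(IterScan(rows), lambda x: x % 2 == 0)
out_iter = []
while (t := it.next()) is not None:
    out_iter.append(t)

vec = VecFilter(VecScan(rows, batch=4), lambda x: x % 2 == 0)
out_vec = []
while (chunk := vec.next()) is not None:
    out_vec.extend(chunk)

assert out_iter == out_vec == [0, 2, 4, 6, 8]
```

Both plans produce the same result; the difference is that the vectorized plan crosses the operator boundary once per batch instead of once per tuple.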

The Monet/X100 Paper²

  • Hindrances to CPU pipelining:
    - Dependency between instructions
    - If-branch mis-predictions
  • Super-scalar CPUs can execute multiple independent instructions in parallel.
  • Compiler optimizations, such as loop pipelining, are critical for achieving good CPU utilization.
  • The cache hit ratio of memory loads and stores also has a large influence on processor throughput.
  • Two ways that the Volcano model leads to low IPC (instructions-per-cycle) efficiency:
    - It inhibits compiler optimizations that exploit CPU parallelism.
    - The interpretation overhead cannot be amortized across multiple tuples.
  • Drawbacks of MonetDB’s full materialization:
    - Materializing results of expressions consumes extra memory bandwidth.
    - It introduces extra positional joins.
  • X100 strikes a balance between the Volcano and MonetDB execution models:
    - It uses a fairly standard relational algebra as its query language.
    - It uses a vertically fragmented data layout to enable lightweight data compression, while reducing update costs via append-only delta columns.
    - The use of vectorized primitives not only optimizes the memory layout in the cache, but also limits the degrees of freedom so that aggressive loop pipelining can be applied.
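The shape of a vectorized primitive can be sketched as follows (an illustrative stand-in, similar in spirit to the paper’s generated `map_*` primitives; the function name here is made up):

```python
# X100-style "primitive": a tight loop over a vector of values with no
# per-tuple interpretation. In X100 these loops are generated C code; the
# data-independent, branch-free body is exactly the shape a compiler can
# loop-pipeline and SIMD-ize.
def map_add_col_val(n, res, col, val):
    for i in range(n):          # fixed trip count, no branches in the body
        res[i] = col[i] + val

col = [1.0, 2.0, 3.0, 4.0]
out = [0.0] * len(col)
map_add_col_val(len(col), out, col, 10.0)
# out is now [11.0, 12.0, 13.0, 14.0]
```

The per-call overhead (function dispatch, expression interpretation) is paid once per vector of, say, 1024 values rather than once per tuple.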

The Scan-vs-Probe Paper³

  • Access path selection: when a query is predicated on a column with a secondary index, a secondary index scan may or may not outperform a full sequential scan.
  • Sequential scans are increasingly fast in modern optimized OLAP systems since:
    - Column storage avoids reading unneeded attributes.
    - Vectorized execution reduces tuple interpretation overhead.
    - Scans can be shared across queries.
    - Compressing individual columns independently is more efficient.
    - Holding each attribute contiguously allows for SIMD processing.
  • Secondary index scans are still useful when a query selects only a small fraction of tuples, but the break-even point decreases as more concurrent queries are run on the same data.
  • As the data size increases, the selectivity crossover point reaches a maximum and then drops gradually.
    - The cost of a sequential scan grows linearly.
    - For a secondary index scan, the tree height growth is sub-linear, but the final sort by row-id has a super-linear complexity.
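The trade-off above can be illustrated with a toy cost model (the functional forms follow the reasoning in the bullets, but the constants are illustrative assumptions, not the paper’s calibrated model):

```python
import math

def scan_cost(n):
    return 1.0 * n  # sequential scan: linear in table size

def index_cost(n, sel):
    k = sel * n                               # number of qualifying tuples
    traverse = k * math.log2(max(n, 2))       # k secondary-index probes (sub-linear height)
    sort = k * math.log2(max(k, 2))           # final sort of matching row-ids
    return 4.0 * (traverse + sort)            # 4.0: assumed per-operation constant

def crossover_selectivity(n, steps=10_000):
    # Smallest selectivity at which the index scan stops being cheaper.
    for i in range(1, steps + 1):
        sel = i / steps
        if index_cost(n, sel) >= scan_cost(n):
            return sel
    return 1.0

# In this simplified model the break-even selectivity shrinks as data size
# grows; the toy model only captures the large-N "drops gradually" regime,
# not the initial rise toward the maximum.
assert crossover_selectivity(10**6) < crossover_selectivity(10**4)
```

Shared scans and higher query concurrency would further shrink the index’s territory, since the scan cost is amortized across queries while each index probe is paid per query.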
