[Lecture Notes] Rasterization Pipeline

Devin Z
Apr 16, 2024

CMU 15–462: Introduction to Computer Graphics (Fall 2020)

(Cover photo: Coyote Hills Regional Park, March 24, 2024)

Here are my notes from the video lectures of CMU 15–462: Introduction to Computer Graphics (Fall 2020) ¹.

Drawing a Triangle

  • Rasterization: for each primitive (i.e. triangle), which pixels to light up?
    - Extremely fast, but hard to achieve photorealism.
  • Ray tracing: for each pixel, which primitives are seen?
    - Generally slow, but easier to get photorealism.
  • Rough sketch of a rasterization pipeline:
    - Position objects in the world (3D transformations).
    - Project objects onto the screen (perspective projection).
    - Sample triangle coverage (rasterization).
    - Interpolate triangle attributes at covered samples (barycentric coordinates).
    - Sample texture maps or evaluate shaders (mipmapping).
    - Combine samples into the final image (depth and transparency).
  • Computing triangle coverage:
    - Input: projected positions of the triangle vertices.
    - Output: a set of pixels covered by the triangle.
    - Real scenes are complicated due to occlusion and transparency.
    - Instead of doing exact computation, we resort to sampling and reconstruction.
  • Aliasing: high frequencies in the original signal masquerade as low frequencies after reconstruction due to undersampling.
  • Supersampling: take multiple samples per pixel and average their coverage (see the sketch after this list).
  • Coarse-to-fine work in real graphics pipelines:
    - Check whether large blocks lie entirely inside or outside the triangle, so all of their samples can be accepted or rejected at once (early-in / early-out).
    - Test individual samples in the intersected blocks in parallel.
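
A minimal Python sketch of the coverage test (my own helper names, not the course's code): each sample point is tested against the triangle's three edge functions, and per-pixel coverage is estimated from a small sub-pixel grid, assuming counter-clockwise vertex winding.

```python
def edge(a, b, p):
    """Signed area term: positive if p lies to the left of the edge a -> b."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def covers(tri, p):
    """True if point p is inside the triangle (counter-clockwise winding assumed)."""
    a, b, c = tri
    return edge(a, b, p) >= 0 and edge(b, c, p) >= 0 and edge(c, a, p) >= 0

def pixel_coverage(tri, x, y, n=2):
    """Fraction of an n-by-n sub-pixel sample grid of pixel (x, y) covered by the triangle."""
    hits = 0
    for i in range(n):
        for j in range(n):
            # Regular sub-pixel grid; real hardware uses fixed sample patterns.
            sample = (x + (i + 0.5) / n, y + (j + 0.5) / n)
            hits += covers(tri, sample)
    return hits / (n * n)

tri = [(0.5, 0.5), (6.0, 1.0), (2.0, 5.0)]  # projected vertex positions (illustrative)
print(pixel_coverage(tri, 2, 2))            # 1.0: the pixel is fully covered
print(pixel_coverage(tri, 0, 0))            # 0.25: partial coverage near a vertex
```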

Spatial Transformations

  • A transformation is characterized by the invariants it preserves.
  • Orthogonal transformations preserve the origin and distances.
    - The inverse of the transformation is the transpose.
    - It represents either a rotation, where orientation is preserved, or a reflection, where orientation is reversed.
  • The spectral theorem indicates that a symmetric matrix performs a (non-uniform) scaling along some set of orthogonal axes.
  • Polar decomposition decomposes a matrix into
    - an orthogonal matrix Q (i.e. rotation or reflection), and
    - a symmetric positive-semidefinite matrix P (i.e. scaling).
  • An affine transformation (e.g. translation) in 2D can be represented by a linear transformation (e.g. shear) in 3D; see the sketch after this list.
    - A point has a non-zero homogeneous coordinate.
    - A vector has a zero homogeneous coordinate.
  • A scene graph stores relative transformations in a directed graph.
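
A small NumPy/SciPy sketch (names and values are illustrative, not from the lecture) of two of the points above: polar decomposition via scipy.linalg.polar, and homogeneous coordinates, where a trailing 1 marks a point and a trailing 0 marks a vector, so only points are affected by translation.

```python
import numpy as np
from scipy.linalg import polar

# Polar decomposition: A = Q @ P with Q orthogonal and P symmetric positive-semidefinite.
A = np.array([[2.0, 1.0],
              [0.5, 1.5]])
Q, P = polar(A)                              # "right" polar decomposition by default
print(np.allclose(Q @ Q.T, np.eye(2)))       # True: Q is orthogonal (rotation or reflection)
print(np.allclose(P, P.T))                   # True: P is symmetric (scaling along some axes)

# Homogeneous coordinates: a 2D translation becomes a 3x3 linear map (a shear in 3D).
def translate(tx, ty):
    return np.array([[1.0, 0.0, tx],
                     [0.0, 1.0, ty],
                     [0.0, 0.0, 1.0]])

point  = np.array([1.0, 2.0, 1.0])           # homogeneous coordinate 1: translated
vector = np.array([1.0, 2.0, 0.0])           # homogeneous coordinate 0: unchanged
T = translate(3.0, -1.0)
print(T @ point)                             # -> [4. 1. 1.]
print(T @ vector)                            # -> [1. 2. 0.]
```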

3D Rotations and Complex Representations

  • 2D rotations commute, but 3D rotations don’t.
  • The imaginary unit acts as a quarter-turn in the counter-clockwise direction.
  • Complex multiplication amounts to angle addition and magnitude multiplication.
  • A quaternion is a pair of a scalar and a vector.
    - It easily represents a rotation around an axis by some angle.
  • Quaternions enable spherical linear interpolation (SLERP); see the sketch after this list.
(Figure: spherical linear interpolation)
  • Complex numbers are a natural language for conformal (angle-preserving) maps.
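
A hedged NumPy sketch (my own helper functions, not the lecture's notation) of quaternions as (scalar, vector) pairs: build a rotation about an axis by an angle, rotate a vector by conjugation q v q*, and interpolate two rotations with SLERP.

```python
import numpy as np

def quat_from_axis_angle(axis, angle):
    """Unit quaternion (cos(a/2), sin(a/2) * axis) for a rotation by `angle` about `axis`."""
    axis = np.asarray(axis, dtype=float)
    axis /= np.linalg.norm(axis)
    return np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * axis))

def quat_mul(q, r):
    """Hamilton product of two quaternions stored as (w, x, y, z)."""
    w1, v1 = q[0], q[1:]
    w2, v2 = r[0], r[1:]
    return np.concatenate(([w1 * w2 - v1 @ v2],
                           w1 * v2 + w2 * v1 + np.cross(v1, v2)))

def rotate(q, v):
    """Rotate vector v by unit quaternion q via q * (0, v) * conj(q)."""
    q_conj = np.concatenate(([q[0]], -q[1:]))
    return quat_mul(quat_mul(q, np.concatenate(([0.0], v))), q_conj)[1:]

def slerp(q0, q1, t):
    """Spherical linear interpolation between unit quaternions q0 and q1."""
    dot = np.clip(q0 @ q1, -1.0, 1.0)
    if dot < 0.0:               # take the shorter arc
        q1, dot = -q1, -dot
    theta = np.arccos(dot)
    if theta < 1e-6:            # nearly identical rotations: q0 is a fine answer
        return q0
    return (np.sin((1 - t) * theta) * q0 + np.sin(t * theta) * q1) / np.sin(theta)

q0 = quat_from_axis_angle([0, 0, 1], 0.0)
q1 = quat_from_axis_angle([0, 0, 1], np.pi / 2)
print(rotate(q1, np.array([1.0, 0.0, 0.0])))   # ~[0, 1, 0]: x-axis rotated 90 degrees about z
print(slerp(q0, q1, 0.5))                      # ~[0.924, 0, 0, 0.383]: a 45-degree rotation about z
```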

Perspective Projection and Texture Mapping

  • The view frustum is the region of space the camera can see.
    - Clipping eliminates primitives outside the view frustum.
    - Near/far clipping planes also keep finite-precision depth values accurate (avoiding z-fighting).
    - The frustum is then mapped to the unit cube (linear in homogeneous coordinates, followed by the perspective divide).
  • 3D linear interpolation:
    - Find the affine function passing through the vertices of the triangle.
    - Alternatively, calculate the barycentric coordinates.
  • Texture coordinates define a mapping from surface coordinates to points in texture domain.
  • Texture aliasing happens when a single pixel on the screen covers many pixels of the texture.
    - For magnification, we just interpolate the texture value at the pixel center (bilinear interpolation).
    - For minification, we need to average over the many texels that the pixel covers.
  • MIP map: store pre-filtered copies of the texture at power-of-two scales, and look up each screen pixel from the appropriate level (see the sketch after this list).
    - The nearest integer level is chosen from the screen-space derivatives du/dx, dv/dx, du/dy, dv/dy.
    - We further interpolate the value from the two neighboring integer levels (trilinear interpolation).
    - Anisotropic filtering takes additional interpolation along the direction in which the texture footprint is stretched.
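
A rough NumPy sketch (hypothetical helper names; it ignores perspective-correct interpolation) tying two of the items above together: barycentric coordinates interpolate per-vertex texture coordinates at a screen sample, and the screen-space derivatives of (u, v) pick a fractional MIP level, to be blended between the two neighboring integer levels (trilinear filtering).

```python
import numpy as np

def cross2(u, v):
    """z-component of the 2D cross product."""
    return u[0] * v[1] - u[1] * v[0]

def barycentric(tri, p):
    """Barycentric coordinates of 2D point p with respect to triangle tri."""
    a, b, c = (np.asarray(v, dtype=float) for v in tri)
    area = cross2(b - a, c - a)              # twice the signed triangle area
    wa = cross2(b - p, c - p) / area
    wb = cross2(c - p, a - p) / area
    return np.array([wa, wb, 1.0 - wa - wb])

def interpolate_uv(tri, uvs, p):
    """Affine interpolation of per-vertex (u, v) at p (no perspective correction here)."""
    return barycentric(tri, p) @ np.asarray(uvs, dtype=float)

def mip_level(du_dx, dv_dx, du_dy, dv_dy, tex_size):
    """Fractional MIP level from the larger of the two screen-space footprints."""
    fx = np.hypot(du_dx, dv_dx) * tex_size
    fy = np.hypot(du_dy, dv_dy) * tex_size
    return max(0.0, float(np.log2(max(fx, fy))))

tri = [(10.0, 10.0), (60.0, 15.0), (25.0, 50.0)]   # projected screen positions
uvs = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]         # per-vertex texture coordinates
p = np.array([30.0, 25.0])                         # a covered sample position
uv = interpolate_uv(tri, uvs, p)

# Approximate du/dx, dv/dx, du/dy, dv/dy by finite differences to neighboring pixels.
du_dx, dv_dx = interpolate_uv(tri, uvs, p + [1.0, 0.0]) - uv
du_dy, dv_dy = interpolate_uv(tri, uvs, p + [0.0, 1.0]) - uv
level = mip_level(du_dx, dv_dx, du_dy, dv_dy, tex_size=256)
lo, hi, t = int(np.floor(level)), int(np.ceil(level)), level % 1.0
print(uv, level)   # trilinear filtering would bilinearly sample levels lo and hi, then blend by t
```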

Depth and Transparency

  • For each sample, the depth-buffer (a.k.a. z-buffer) stores the depth of the closest triangle seen so far.
    - Depth is interpolated using barycentric coordinates.
    - This doesn’t depend on the processing order of the primitives.
  • Opacity is represented by the alpha value.
  • Pre-multiplied alpha colors are closed under the “over” compositing operation; see the sketch after this list.
(Figure: pre-multiplied alpha)
  • Combine depth and transparency:
    - First process fully opaque primitives (with depth tests) in any order.
    - Then process the semi-transparent primitives that are not hidden by opaque geometry, from back to front.
  • GPUs are heterogeneous multi-core processors with fixed-function hardware for rasterization.
    - Recent GPUs (e.g. Nvidia RTX) also have ray-tracing functions baked into hardware.
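
A toy NumPy sketch (buffer sizes and names are my own) of the two mechanisms above: a per-sample depth test against the z-buffer, and the “over” operation on pre-multiplied colors, whose result is again in pre-multiplied form.

```python
import numpy as np

W, H = 4, 4
depth_buffer = np.full((H, W), np.inf)      # depth of the closest surface seen so far
color_buffer = np.zeros((H, W, 4))          # pre-multiplied RGBA

def shade_opaque(x, y, depth, rgba):
    """Depth-test an opaque fragment and keep it only if it is the closest so far."""
    if depth < depth_buffer[y, x]:
        depth_buffer[y, x] = depth
        color_buffer[y, x] = rgba

def over(src, dst):
    """Composite pre-multiplied src over pre-multiplied dst; the result stays pre-multiplied."""
    return src + (1.0 - src[3]) * dst

# Opaque primitives first, in any order (the z-buffer makes the order irrelevant).
shade_opaque(1, 1, depth=5.0, rgba=np.array([0.0, 0.0, 1.0, 1.0]))   # blue surface
shade_opaque(1, 1, depth=9.0, rgba=np.array([1.0, 0.0, 0.0, 1.0]))   # red surface, fails the test

# Then semi-transparent fragments from back to front.
glass = np.array([0.5, 0.0, 0.0, 0.5])      # 50%-opaque red, already pre-multiplied
color_buffer[1, 1] = over(glass, color_buffer[1, 1])
print(color_buffer[1, 1])                   # -> [0.5 0.  0.5 1. ]
```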
