[Paper Notes] F1: A Distributed SQL Database That Scales

3 min readSep 26, 2022

--

A pioneer in HTAP (hybrid transactional/analytical processing).

  • Goals of F1’s design:
    - Better scalability than AdWords’ old MySQL system.
    - High availability for mission-critical business.
    - Consistency, including ACID transaction support.
    - Usability, including full SQL query support.
  • To mitigate increased latency due to remote data:
    - Make data clustering explicit to improve locality.
    - Make heavy use of batching, parallelism, and asynchronous reads (see the sketch after this list).
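
A minimal sketch of the batching-plus-parallelism idea in Python with asyncio; `fetch_shard` and the modulo shard layout are hypothetical stand-ins, not F1's actual client API:

```python
import asyncio

# Hypothetical stand-in for one remote read against a single Spanner shard.
async def fetch_shard(shard_id: int, keys: list[int]) -> dict[int, str]:
    await asyncio.sleep(0.001)  # simulate a network round trip
    return {k: f"row-{k}" for k in keys}

async def batched_read(keys: list[int], num_shards: int = 4) -> dict[int, str]:
    # Batch keys by shard so each shard receives one request instead of
    # one request per key...
    batches: dict[int, list[int]] = {}
    for k in keys:
        batches.setdefault(k % num_shards, []).append(k)
    # ...then issue all shard requests in parallel instead of serially,
    # so total latency is roughly one round trip rather than one per shard.
    results = await asyncio.gather(
        *(fetch_shard(s, ks) for s, ks in batches.items())
    )
    merged: dict[int, str] = {}
    for partial in results:
        merged.update(partial)
    return merged

rows = asyncio.run(batched_read(list(range(20))))
```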
  • Basic components:
    - F1 servers, which receive requests and coordinate query execution.
    - A shared slave pool of F1 processes that execute parts of distributed query plans.
    - The F1 master, which monitors slave process health and distributes the list of available slaves to F1 servers.
    - Spanner servers, which retrieve data from the Colossus File System (CFS) in the same datacenter.
Figure: Basic Architecture of F1 (Source: [1])
  • F1 servers are typically co-located in the same set of datacenters as the Spanner servers storing the data, but can communicate with Spanner servers outside their own datacenters when necessary.
  • F1 servers are mostly stateless except when holding locks for a pessimistic transaction.
  • Data model:
    - Physically, F1 stores each child table clustered with and interleaved within its parent table's rows, which reduces the number of Spanner reads required to join related data (see the key-layout sketch after this list).
    - The hierarchically clustered physical schema also reduces the number of Paxos groups involved in an update, thereby avoiding the performance penalty of two-phase commit (2PC).
    - Supports Protocol Buffer-typed columns to remove the impedance mismatch between the database and applications.
    - Allows repeated fields, avoiding the performance impact and complexity of storing and joining many small child records.
    - Encourages local indexes, which are co-located with the root rows they index, over global indexes, which are sharded across directories.
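
A rough sketch of the interleaved key layout, using the paper's Customer → Campaign → AdGroup hierarchy; the tuple encoding is only an illustration of the idea, not Spanner's actual storage format:

```python
# A child row's primary key is prefixed by its parent's key, so sorting
# by key interleaves children directly under their parents.
rows = {
    ("Customer", 1): {"name": "acme"},
    ("Customer", 1, "Campaign", 3): {"budget": 100},
    ("Customer", 1, "Campaign", 3, "AdGroup", 5): {"bid": 2},
    ("Customer", 1, "Campaign", 3, "AdGroup", 6): {"bid": 4},
    ("Customer", 1, "Campaign", 4): {"budget": 50},
    ("Customer", 2): {"name": "globex"},
}

# Fetching Customer 1 together with all of its campaigns and ad groups
# becomes one contiguous range scan over the prefix ("Customer", 1),
# i.e. a single Spanner read instead of one read per table.
prefix = ("Customer", 1)
subtree = [k for k in sorted(rows) if k[: len(prefix)] == prefix]
```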
  • F1 supports fully non-blocking schema changes: changes are applied asynchronously, at most two schema versions are active at any time, and each change is broken into a sequence of mutually compatible intermediate states (sketched below).
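
A rough illustration of those intermediate states for one common change, adding an index; the state names paraphrase the companion schema-change work and are not an API:

```python
# Consecutive states are mutually compatible, so servers running two
# adjacent schema versions at once never corrupt the index.
INDEX_ADD_STATES = [
    "absent",       # the index does not exist yet
    "delete-only",  # servers delete index entries but never add them
    "write-only",   # servers maintain index entries for all new writes
    "backfill",     # a background job adds entries for preexisting rows
    "public",       # the index is complete and visible to queries
]
```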
  • Three types of F1 transactions:
    - Read-only snapshot transactions, which are the default mode for SQL queries and MapReduces.
    - Pessimistic transactions, which map directly onto Spanner transactions and must abort if the F1 server holding their locks dies.
    - Optimistic transactions, each of which consists of (1) a long-running, lock-free read phase followed by (2) a short-lived Spanner pessimistic transaction that checks for conflicts and commits the writes (see the sketch after this list).
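
A minimal sketch of the optimistic protocol in Python; the `storage` dict stands in for Spanner, and the function names are hypothetical:

```python
import itertools

# Hypothetical store: key -> (value, last_modified_timestamp).
storage = {"k1": ("v1", 100), "k2": ("v2", 100)}
clock = itertools.count(101)  # monotonically increasing commit timestamps

def optimistic_read(keys):
    # Read phase: takes no locks, but remembers each row's
    # last-modified timestamp alongside its value.
    return {k: storage[k] for k in keys}

def optimistic_commit(snapshot, writes):
    # Commit phase: a short-lived pessimistic transaction re-checks the
    # timestamps and aborts if any row changed since the read phase.
    for key, (_, ts_seen) in snapshot.items():
        if storage[key][1] != ts_seen:
            return False  # conflict detected: abort
    commit_ts = next(clock)
    for key, value in writes.items():
        storage[key] = (value, commit_ts)
    return True

snap = optimistic_read(["k1", "k2"])
assert optimistic_commit(snap, {"k1": "v1-new"})    # commits
assert not optimistic_commit(snap, {"k2": "oops"})  # aborts: k1 changed
```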
  • Optimistic transactions avoid conflicts between reads and writes, but have two drawbacks:
    - Insertion phantoms: newly inserted rows carry no timestamps to validate against, which may require taking parent-table locks.
    - Low throughput under high contention, which may require batching updates at the application level.
  • Beyond the default row-level locking, F1 supports finer-grained (column-level) locking to avoid conflicts between transactions that update independent columns of the same row.
  • F1's change history can be used for change data capture and incremental processing.
  • For each transaction, one ChangeBatch is written per distinct root row touched; change history lives in child tables, close to the data being tracked (sketched below).
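
A hypothetical sketch of incremental processing on top of change history; the `ChangeBatch` shape and the `history` list below are illustrative stand-ins:

```python
from dataclasses import dataclass

@dataclass
class ChangeBatch:
    root_row: str          # key of the root row this batch belongs to
    commit_timestamp: int  # commit time of the writing transaction
    changes: list          # (table, key, before, after) tuples

# Stand-in for the ChangeBatch child tables stored next to their roots.
history = [
    ChangeBatch("Customer/1", 101, [("Campaign", "1/3", None, {"budget": 100})]),
    ChangeBatch("Customer/2", 105, [("Customer", "2", {"n": "x"}, {"n": "y"})]),
]

checkpoint = 0  # commit timestamp of the last batch already processed

def process_incrementally(handle):
    # Read every batch committed after the checkpoint, hand it to the
    # consumer, and advance the checkpoint so a restart resumes cleanly.
    global checkpoint
    for batch in sorted(history, key=lambda b: b.commit_timestamp):
        if batch.commit_timestamp > checkpoint:
            handle(batch)
            checkpoint = batch.commit_timestamp

process_incrementally(lambda b: print(b.root_row, b.commit_timestamp))
```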
  • F1 SQL supports both:
    - Centralized execution of low-latency OLTP-style queries.
    - Distributed execution of long-running OLAP-style queries.
  • In contrast with disk latency, network latency:
    - Can be mitigated by batching or pipelining data accesses.
    - Benefits more from parallel data accesses, since requests do not contend for a scarce resource the way disk accesses contend for disk heads.
  • A distributed query plan is organized as a DAG of plan parts whose results flow up to a single root, the query coordinator (or, for MapReduce consumption, to multiple partitioned consumers).
  • Because Spanner reshards data dynamically, F1 cannot take advantage of explicit co-partitioning of data, and it applies only hash partitioning for repartitioning (sketched below).
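
A small sketch of hash repartitioning for a distributed join; the worker layout and helper names are hypothetical:

```python
import hashlib

def shard_for(key: str, num_workers: int) -> int:
    # Hash on the join key so equal keys from both inputs always land
    # on the same downstream worker.
    digest = hashlib.sha1(key.encode()).digest()
    return int.from_bytes(digest[:4], "big") % num_workers

def repartition(rows, key_of, num_workers):
    buckets = [[] for _ in range(num_workers)]  # one per plan part
    for row in rows:
        buckets[shard_for(key_of(row), num_workers)].append(row)
    return buckets

# Both join inputs are repartitioned the same way, so worker i can join
# left[i] against right[i] entirely locally.
left = repartition([("c1", "x"), ("c2", "y")], lambda r: r[0], 4)
right = repartition([("c1", 10), ("c3", 7)], lambda r: r[0], 4)
```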
  • For protocol buffers, F1 queries support:
    - Using path expressions to extract individual fields.
    - Querying and passing around entire protocol buffers.
    - Accessing repeated fields with PROTO JOIN (examples below).
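
Two illustrative queries in the spirit of the paper's AdWords examples, shown here as SQL strings in Python; the table and field names are paraphrased, not an exact schema:

```python
# Path expression: read an individual field out of a protobuf-valued
# column (c.Info is a protocol buffer; country_code is a field in it).
q_path = """
    SELECT c.CustomerId, c.Info.country_code
    FROM Customer AS c
"""

# PROTO JOIN: join a row against a repeated field inside one of its own
# protobuf columns, yielding one output row per repeated element.
q_proto_join = """
    SELECT c.CustomerId, f.feature_id
    FROM Customer AS c
    PROTO JOIN c.Whitelist.feature AS f
    WHERE f.status = 'ENABLED'
"""
```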
