Lesson 01
How does BigQuery separate storage and compute, and why does this matter for scaling?
BigQuery splits the database into two independent layers connected by a fast network.

Storage layer (Colossus)
Data lives in Google's distributed file system, stored as compressed columnar files (Capacitor format). Each table is sharded across thousands of disks and replicated for durability. Storage is billed per GB-month, independent of whether anyone is querying.

Compute layer (Dremel)
Queries run on a shared pool of workers called slots. When you submit a query, BigQuery dynamically allocates slots, reads the needed columns from Colossus over the Jupiter network (petabit-scale), executes in a tree of aggregators, and releases the slots. Compute is billed per TB scanned (on-demand) or per slot-hour (reservations).
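To make the two billing axes concrete, here is a minimal Python sketch of a toy cost model. It is not BigQuery's API or its real price list: the `Table` class, the function names, and both rate constants are hypothetical placeholders chosen for illustration, and the sketch assumes 1 TB = 1000 GB for simplicity. It shows that storage cost depends only on how much data is stored, while on-demand query cost depends only on the columns a query actually reads.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Hypothetical placeholder rates for illustration only; NOT actual BigQuery prices.
STORAGE_RATE_PER_GB_MONTH = 0.02
ON_DEMAND_RATE_PER_TB = 5.00
GB_PER_TB = 1000  # simplifying assumption for this sketch


@dataclass
class Table:
    """Toy columnar table: maps each column name to its stored size in GB."""
    column_sizes_gb: Dict[str, float]

    def stored_gb(self) -> float:
        # Storage footprint is the sum of all columns, queried or not.
        return sum(self.column_sizes_gb.values())

    def scanned_gb(self, columns: Iterable[str]) -> float:
        # A columnar scan reads only the columns the query references.
        return sum(self.column_sizes_gb[name] for name in columns)


def storage_cost(table: Table, months: float = 1.0) -> float:
    """Storage layer: billed per GB-month, independent of query activity."""
    return table.stored_gb() * STORAGE_RATE_PER_GB_MONTH * months


def on_demand_query_cost(table: Table, columns: Iterable[str]) -> float:
    """Compute layer (on-demand): billed per TB scanned by the query."""
    return table.scanned_gb(columns) / GB_PER_TB * ON_DEMAND_RATE_PER_TB


if __name__ == "__main__":
    events = Table({"user_id": 500.0, "event": 1500.0, "payload": 2000.0})
    print("monthly storage cost:", storage_cost(events))
    print("query on user_id only:", on_demand_query_cost(events, ["user_id"]))
    print("query on all columns:", on_demand_query_cost(events, events.column_sizes_gb))
```

Running more queries in this model never changes `storage_cost`, and storing more columns never changes the cost of a query that does not read them. That independence is the point of the split.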