Designing Data-Intensive Applications

Updated 2025-07-03

Reliability, Scalability, Maintainability

Reliability

adversity (misfortune)

critical bugs due to poor error handling

faults (one part) vs failures (whole system)

fault-tolerant, resilient system

deliberately inducing faults -> Chaos Monkey

weak correlation of failure

Scalability

Scalability -> how to cope when the system grows in a particular way

load parameters

throughput, response time

latency vs response time

occasional outliers

percentiles: p50 (median), p95, p99
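A minimal sketch of why percentiles are reported instead of the mean. The sample values and the `percentile` helper are made up for illustration, and nearest-rank is only one of several percentile definitions.

```python
import math

def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of samples <= it."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer arithmetic first (p * n), then divide, to avoid float rounding surprises.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times in milliseconds: 98 fast requests and 2 slow outliers.
response_times_ms = [20] * 50 + [30] * 45 + [40] * 3 + [900, 1500]

mean_ms = sum(response_times_ms) / len(response_times_ms)
p50 = percentile(response_times_ms, 50)
p95 = percentile(response_times_ms, 95)
p99 = percentile(response_times_ms, 99)
```

With this data the two outliers pull the mean above p95, while p50 and p95 still describe what most requests experienced and p99 exposes the tail.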

skews the measurement

coping with load

rethink the architecture on every order-of-magnitude increase in load

Maintainability

Operability, Simplicity, Evolvability

Data Models and Query Languages

relational, document, graph

schema: enforced on write (schema-on-write) vs handled on read (schema-on-read)

SQL: declarative language vs imperative language
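A small sketch of the declarative vs imperative contrast, using Python's built-in `sqlite3` module. The `animals` data and both function names are invented for this example.

```python
import sqlite3

# Hypothetical data: (name, family) pairs.
animals = [("lion", "Felidae"), ("shark", "Lamnidae"), ("tiger", "Felidae")]

def cats_imperative(rows):
    """Imperative: spell out how to walk the data, step by step."""
    result = []
    for name, family in rows:
        if family == "Felidae":
            result.append(name)
    return result

def cats_declarative(rows):
    """Declarative: state what result is wanted; the engine decides how."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE animals (name TEXT, family TEXT)")
    conn.executemany("INSERT INTO animals VALUES (?, ?)", rows)
    found = conn.execute(
        "SELECT name FROM animals WHERE family = ? ORDER BY name", ("Felidae",)
    ).fetchall()
    conn.close()
    return [name for (name,) in found]
```

Both functions return the same names. The SQL version leaves the engine free to add an index or change the scan order without the query changing.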

Storage and Retrieval

log: append-only sequence of records

index: affects the performance of queries, updated on writes
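A toy sketch of an append-only log with an in-memory hash index. `LogStore` is a hypothetical name, an `io.BytesIO` buffer stands in for a file on disk, and keys are assumed to contain no commas or newlines.

```python
import io

class LogStore:
    """Append-only log of "key,value" lines plus an in-memory hash index."""

    def __init__(self):
        self.log = io.BytesIO()   # stands in for a file on disk
        self.index = {}           # key -> byte offset of the latest record

    def set(self, key, value):
        # Writes only ever append; old records are never overwritten.
        self.log.seek(0, io.SEEK_END)
        offset = self.log.tell()
        self.log.write(f"{key},{value}\n".encode("utf-8"))
        self.index[key] = offset  # extra work on every write to keep the index current

    def get(self, key):
        offset = self.index.get(key)
        if offset is None:
            return None
        # One seek instead of scanning the whole log.
        self.log.seek(offset)
        line = self.log.readline().decode("utf-8").rstrip("\n")
        _, value = line.split(",", 1)
        return value
```

The trade-off from the note is visible here: `get` is a single seek thanks to the index, but every `set` pays for an index update on top of the append.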

SSTable: Sorted String Table

LSM-tree: Log-Structured Merge-tree, faster for writes

B-tree: overwrites pages in place + WAL (write-ahead log, for recovery after a DB crash), faster for reads
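A rough sketch of the merge step that LSM-tree compaction relies on: several sorted segments are combined in one pass and the newest value wins for each key. `merge_segments` is an illustrative name, and real engines also handle tombstones and on-disk blocks, which are left out here.

```python
import heapq

def merge_segments(segments):
    """Merge sorted (key, value) segments; segments[0] is the newest."""
    # Tag each record with its segment age so that, for equal keys,
    # the record from the newest segment sorts first.
    tagged = (
        [(key, age, value) for key, value in segment]
        for age, segment in enumerate(segments)
    )
    merged = []
    last_key = None
    for key, _age, value in heapq.merge(*tagged):
        if key != last_key:   # first occurrence of a key is its newest value
            merged.append((key, value))
            last_key = key
    return merged

# Hypothetical segments, each already sorted by key.
newer = [("apple", "3"), ("cherry", "9")]
older = [("apple", "1"), ("banana", "2")]
compacted = merge_segments([newer, older])
```

Because every input is already sorted, the merge reads each segment sequentially, which is what keeps writes cheap in this design.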

write amplification

SSD wear-out

performance cost, performance penalty

throttle the rate of incoming writes

fuzzy query

in-memory database, Redis

Encoding and Evolution

Backward and forward compatibility

server side -> staged rollout

client side -> users may not install the update

Encoding vs decoding: in-memory representation -> byte sequence (e.g. a file), and back

JSON, XML, CSV -> binary encoding

Protocol Buffers: field tag numbers instead of field names, schema evolution
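A toy illustration of why tag numbers help schema evolution. This is not the real Protocol Buffers wire format: it is a made-up layout of one tag byte, one length byte, then the UTF-8 value, and `encode`, `decode`, and both schemas are hypothetical.

```python
def encode(record, schema):
    """schema maps field name -> tag number (0-255); values are short strings."""
    out = bytearray()
    for name, value in record.items():
        data = value.encode("utf-8")
        if len(data) > 255:
            raise ValueError("value too long for this toy format")
        # The field name never appears in the bytes, only its tag.
        out += bytes([schema[name], len(data)]) + data
    return bytes(out)

def decode(payload, schema):
    names = {tag: name for name, tag in schema.items()}
    record, pos = {}, 0
    while pos < len(payload):
        tag, length = payload[pos], payload[pos + 1]
        data = payload[pos + 2 : pos + 2 + length]
        if tag in names:      # unknown tags are skipped, not treated as errors
            record[names[tag]] = data.decode("utf-8")
        pos += 2 + length
    return record

schema_v1 = {"user_name": 1}
schema_v2 = {"user_name": 1, "email": 2}   # v2 adds a field under a new tag

new_bytes = encode({"user_name": "ada", "email": "ada@example.com"}, schema_v2)
old_bytes = encode({"user_name": "ada"}, schema_v1)
```

Decoding `new_bytes` with `schema_v1` shows forward compatibility (old code skips the unknown tag), and decoding `old_bytes` with `schema_v2` shows backward compatibility (the new field is simply absent).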

Dataflow through databases

Dataflow through service calls: REST and RPC (location transparency, but don’t treat it as a local call)

Dataflow through message passing: asynchronous

Distributed Data

replication and partitioning: usually go hand in hand

Replication

Assumption: the dataset is small enough that a single machine can hold all of it

leader and followers

synchronous vs asynchronous replication; a mix of the two is semi-synchronous

position in the replication log: MySQL calls it binlog coordinates, PostgreSQL the log sequence number

caught up

node outages:

follower failure: catch-up recovery

leader failure: failover
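A simplified sketch of catch-up recovery for a follower. The `Leader` and `Follower` classes are hypothetical, and a list position stands in for the log sequence number; real systems persist the log and ship it over the network.

```python
class Leader:
    def __init__(self):
        self.log = []   # ordered change log; list length plays the role of the LSN

    def write(self, key, value):
        self.log.append((key, value))

    def changes_since(self, lsn):
        # Everything the follower has not applied yet.
        return self.log[lsn:]

class Follower:
    def __init__(self):
        self.data = {}
        self.applied_lsn = 0   # last log position this follower has applied

    def catch_up(self, leader):
        # After an outage, request only the changes since the last known position.
        for key, value in leader.changes_since(self.applied_lsn):
            self.data[key] = value
            self.applied_lsn += 1

leader = Leader()
follower = Follower()
leader.write("a", "1")
follower.catch_up(leader)   # follower is up to date
leader.write("b", "2")      # follower is offline during these writes
leader.write("a", "3")
follower.catch_up(leader)   # follower reconnects and replays the backlog
```

Failover on the leader side is harder because a new leader has to be chosen and clients redirected, which this sketch does not attempt.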

split brain

replication types:

  • statement based (error-prone)
  • WAL (physical or logical)
  • row-based (logical)
  • trigger-based (flexible)
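A small sketch of why statement-based replication is error-prone when a statement is nondeterministic. The dictionaries stand in for database tables, the function names are invented, and different random seeds stand in for each node evaluating something like `RAND()` on its own.

```python
import random

def apply_statement(db, rng):
    # Stands in for a statement such as: UPDATE t SET token = RAND() WHERE id = 1
    db[1] = rng.random()

def apply_row_change(db, change):
    # A row-based log entry carries the resulting value itself.
    key, new_value = change
    db[key] = new_value

leader, follower_stmt, follower_row = {}, {}, {}

# The leader executes the statement with its own source of randomness.
apply_statement(leader, random.Random(1))

# Statement-based: the follower re-executes the statement and gets its own result.
apply_statement(follower_stmt, random.Random(2))

# Row-based: the leader ships the resulting row value, so the follower just copies it.
apply_row_change(follower_row, (1, leader[1]))
```

The statement-based follower ends up with a different value from the leader, while the row-based follower matches it exactly.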