The CAP theorem is one of the most famous (and most misunderstood) concepts in distributed systems. It was proposed by Eric Brewer in 2000 and formally proved in 2002.
In simple language: in a distributed system, when a network partition happens, we have to choose between consistency and availability. We can’t have both.
The Three Letters
C — Consistency: Every read receives the most recent write. All nodes see the same data at the same time. If we write “balance = 500” and immediately read, we get 500 — not some stale value.
A — Availability: Every request receives a response (even if it’s not the latest data). The system never refuses to answer.
P — Partition Tolerance: The system continues to work even when network messages between nodes are lost or delayed. In a distributed system, network partitions will happen — it’s not a matter of “if” but “when.”
The Triangle
Why “Pick 2 out of 3”?
Here’s the key insight: partition tolerance isn’t really optional in a distributed system. Networks fail. Cables get cut. Data centers lose connectivity. It happens.
So the real choice is: when a partition happens, do we sacrifice Consistency or Availability?
- CP (Consistency + Partition Tolerance): During a partition, the system refuses to respond rather than return stale data. We get correct data or no data.
- AP (Availability + Partition Tolerance): During a partition, the system always responds, even if the data might be stale.
- CA (Consistency + Availability): This only works on a single node (no partitions possible). The moment we go distributed, we lose this option.
Where Do Popular Databases Fall?
CP — Consistency + Partition Tolerance
These databases will reject requests or become read-only during a partition rather than serve stale data.
- MongoDB (with majority write concern)
- HBase
- Redis Cluster (when a partition is detected, minority partition stops serving)
- etcd / ZooKeeper (consensus-based)
AP — Availability + Partition Tolerance
These databases always respond, even if the data might be temporarily inconsistent.
- Cassandra (tunable, but AP by default)
- DynamoDB
- CouchDB
- Riak
CA — Consistency + Availability (Single Node Only)
Not truly distributed. Works only when there’s no possibility of a network partition.
- Single-node PostgreSQL
- Single-node MySQL
- SQLite
A Practical Example
Imagine we have a distributed database with two nodes, Node A and Node B. A network cable between them gets cut (partition!).
A user writes balance = 500 to Node A. Now another user reads from Node B.
CP choice: Node B says “Sorry, I can’t serve this read because I haven’t synced with Node A. I might give you wrong data.” The system is consistent but unavailable.
AP choice: Node B says “Here’s the last value I have: balance = 300 (stale).” The system is available but inconsistent.
Neither is “wrong.” It depends on what we’re building.
-- CP behavior (MongoDB with majority read concern):
-- If the primary is unreachable, reads FAIL
-- db.accounts.find({id: "A"}).readConcern("majority")
-- Result during partition: Error - primary unavailable
-- AP behavior (Cassandra with consistency level ONE):
-- Reads succeed from any available node, even with stale data
-- SELECT balance FROM accounts WHERE id = 'A'; -- consistency ONE
-- Result during partition: Returns stale value (but doesn't fail!)
Common Misconceptions
”We have to give up one of the three permanently”
Nope. The trade-off only kicks in during a network partition. When everything is running smoothly (no partition), we can have all three. CAP is about what happens when things go wrong.
”CA databases exist in distributed systems”
Not really. Any multi-node system can experience a network partition. CA only works for single-node setups.
”CAP means NoSQL is better”
CAP doesn’t say one is better. It says distributed systems have trade-offs. Plenty of CP systems use SQL (CockroachDB, Google Spanner).
PACELC: The Extended Version
CAP only tells us what happens during a partition. But what about normal operation? The PACELC theorem extends CAP:
- Partition: choose A (availability) or C (consistency)
- Else (no partition): choose L (latency) or C (consistency)
For example, DynamoDB is PA/EL (available during partition, low latency normally). MongoDB is PC/EC (consistent during partition, consistent normally but with higher latency).
Interview Tip
CAP comes up in almost every system design interview. Don’t just recite the definition — give a concrete example like the two-node partition scenario above. Bonus points for mentioning that the trade-off only applies during a partition, and that PACELC gives us a more complete picture.