One of the most common interview questions (and real-world decisions) is: “Which database should we use?” The answer is always “it depends.” But we can break it down into a practical framework.
There’s no single best database. Each type was built to solve a specific problem. Our job is to match the problem to the tool.
The Decision Framework
Ask these four questions:
1. What Does Our Data Look Like?
2. What Are Our Access Patterns?
- OLTP (lots of small reads/writes) — relational or document databases
- OLAP (complex analytics, aggregations) — columnar databases (ClickHouse, BigQuery, Redshift)
- Full-text search — Elasticsearch, Meilisearch, or PostgreSQL with GIN indexes
- Graph traversal — graph databases
- Simple lookups by key — key-value stores
3. What Consistency Do We Need?
- Strong consistency (every read sees the latest write) — PostgreSQL, MySQL, MongoDB (with write concern)
- Eventual consistency (reads might be slightly stale) — Cassandra, DynamoDB (default), Redis replicas
If we’re building a banking system, we need strong consistency. If we’re building a social media feed, eventual consistency is fine — nobody notices if a like count is off by 1 for a second.
4. How Big Will We Get?
- Vertical scaling (bigger server) — PostgreSQL, MySQL. Works well up to several TB and thousands of connections.
- Horizontal scaling (more servers) — Cassandra, DynamoDB, MongoDB (with sharding). Needed for truly massive scale.
Most applications never need horizontal scaling. PostgreSQL on a decent server can handle millions of users. Don’t over-engineer.
Real-World Examples
E-commerce Platform
Primary database: PostgreSQL
- Products, orders, users, payments
- Need ACID transactions for orders and payments
- Complex queries (search by category, price range, filters)
Cache layer: Redis
- Product catalog cache
- Session storage
- Cart data
- Rate limiting
Search: Elasticsearch
- Product search with typo tolerance
- Faceted filtering (brand, price, rating)
Social Media App
Primary database: PostgreSQL or MySQL
- User profiles, posts, follows
- Need strong consistency for auth and profiles
Activity feed: Redis (sorted sets)
- Pre-computed feed per user
- Fast reads for infinite scroll
Media metadata: MongoDB
- Variable metadata per media type (photo vs video vs story)
Friend suggestions: Neo4j or graph queries
- "People you may know" (friends of friends)
IoT / Monitoring System
Time-series data: Cassandra or TimescaleDB
- Millions of sensor readings per second
- Need massive write throughput
Recent data / alerts: Redis
- Current sensor values
- Alert thresholds and triggers
Configuration: PostgreSQL
- Device management
- User accounts
- Alert rules
Real-time Chat (like Discord)
Messages: Cassandra
- Billions of messages
- Partition by channel, cluster by timestamp
- Fast writes, fast range reads
Presence (online/offline): Redis
- Sub-millisecond lookups
- Pub/sub for real-time updates
User/server data: PostgreSQL
- User accounts, server settings, permissions
- Need ACID for role changes and permissions
The “Just Use PostgreSQL” Rule
Here’s a pragmatic take: start with PostgreSQL for almost everything. It handles JSON (like MongoDB), has full-text search (like Elasticsearch for simpler cases), supports pub/sub, and can handle more scale than most apps will ever need.
Add specialized databases only when PostgreSQL demonstrably can’t handle a specific requirement:
- Need sub-millisecond caching? Add Redis.
- Need to search millions of documents with typo tolerance? Add Elasticsearch.
- Need to write millions of events per second? Add Cassandra or Kafka.
- Need complex graph traversals at scale? Add Neo4j.
Don’t start with 5 databases on day one. Start with one and add others when the need is proven.
Quick Comparison
| Database | Type | Best For | Consistency | Scale |
|---|---|---|---|---|
| PostgreSQL | Relational | General purpose, OLTP | Strong | Vertical |
| MySQL | Relational | Web apps, read-heavy | Strong | Vertical |
| MongoDB | Document | Variable schemas, rapid dev | Tunable | Horizontal |
| Redis | Key-Value | Caching, sessions, real-time | Eventual* | Horizontal |
| Cassandra | Wide-Column | Time-series, high write throughput | Tunable | Horizontal |
| Neo4j | Graph | Relationship-heavy queries | Strong | Vertical |
| Elasticsearch | Search engine | Full-text search, analytics | Eventual | Horizontal |
| DynamoDB | Key-Value/Doc | Serverless, auto-scaling | Tunable | Horizontal |
| ClickHouse | Columnar | Analytics, OLAP | Eventual | Horizontal |
*Redis can be configured for strong consistency with Redis Sentinel/Cluster
In simple language, the right database depends on what we’re building. Start with PostgreSQL (it can do 80% of things well), add Redis for caching, and only bring in specialized databases when we have a proven need. The worst mistake is using 5 different databases on day one because they sound cool. Keep it simple until complexity is required.