RDB (Redis Database) is Redis’s snapshot-based persistence. At configured intervals, Redis dumps the entire dataset to a single binary file, named dump.rdb by default. When Redis restarts, it loads that file back into memory.
In simple language — think of it like saving a video game. You play for a while, hit “save”, and the game writes everything to disk. If your computer crashes, you reload the save file and continue. But any progress between the last save and the crash is lost.
Why use RDB?
- Compact: a single binary file that is easy to back up and easy to ship to another machine.
- Fast restarts: loading an RDB file is faster than replaying an append-only file (AOF) log.
- Minimal runtime cost: the only work the main process does for a snapshot is the fork(); the child handles all the disk I/O.
The catch — if Redis crashes between snapshots, you lose everything written after the last snapshot.
How it works — BGSAVE and fork()
When Redis decides to snapshot, it calls BGSAVE. Here’s the trick — Redis uses fork() to create a child process. The child writes the snapshot. The parent keeps serving commands.
fork() uses copy-on-write (COW). The child gets a “copy” of memory, but the OS doesn’t actually duplicate the RAM. It only copies the pages that get modified while the snapshot is running. So if your dataset is 10 GB and few writes arrive during the snapshot, it barely uses any extra memory.
[Diagram: after fork(), the parent process keeps serving GET / SET / DEL, so clients never block; the child process walks shared memory, writes dump.rdb, and exits when done.]
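The fork-and-snapshot flow described above can be sketched in a few lines of Python. This is a toy illustration, not how Redis is implemented: the function names and the JSON file format are made up for the demo, and `os.fork()` is only available on Unix-like systems. The point it shows is that the child sees memory as it was at the moment of the fork, even though the parent keeps writing.

```python
import json
import os
import tempfile


def snapshot_with_fork(data: dict, path: str) -> None:
    """Write `data` to `path` from a forked child while the parent keeps mutating it."""
    pid = os.fork()
    if pid == 0:
        # Child: sees memory exactly as it was when fork() was called.
        status = 1
        try:
            tmp_path = path + ".tmp"
            with open(tmp_path, "w") as f:
                json.dump(data, f)
            # Rename into place so a reader never sees a half-written file.
            os.replace(tmp_path, path)
            status = 0
        finally:
            os._exit(status)  # exit the child without running the parent's cleanup

    # Parent: keeps "serving writes" while the child works on its own view.
    data["counter"] = 999
    data["new_key"] = "written after fork"
    # A real server would not block here; waiting keeps the demo deterministic.
    os.waitpid(pid, 0)


def demo() -> tuple:
    data = {"counter": 1, "name": "redis"}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "dump.json")  # hypothetical file name for the demo
        snapshot_with_fork(data, path)
        with open(path) as f:
            on_disk = json.load(f)
    return data, on_disk


if __name__ == "__main__":
    live, on_disk = demo()
    print("live data:", live)
    print("snapshot: ", on_disk)
```

The snapshot on disk holds the values from before the fork, while the parent's dictionary holds the newer writes. That gap is exactly the data you would lose if the process crashed right after the snapshot.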
Triggering snapshots
Three ways — automatic (config), manual (commands), or on shutdown.
# redis.conf — save if condition met
# after 3600s (1h) if at least 1 change was made
save 3600 1
# after 300s if at least 100 changes were made
save 300 100
# after 60s if at least 10000 changes were made
save 60 10000
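To make the rule semantics concrete, here is a minimal Python sketch of how a list of save points could be evaluated. It is a simplified model, not Redis's source: `should_snapshot` and `DEFAULT_SAVE_POINTS` are hypothetical names, and treating the boundaries as inclusive (`>=`) is an assumption. The idea is that a snapshot is due as soon as any single rule has both its time and its change threshold met.

```python
from typing import Iterable, Tuple

# Each save point is (seconds, changes), mirroring "save <seconds> <changes>".
SavePoint = Tuple[int, int]

DEFAULT_SAVE_POINTS = [(3600, 1), (300, 100), (60, 10000)]


def should_snapshot(
    save_points: Iterable[SavePoint],
    seconds_since_last_save: int,
    changes_since_last_save: int,
) -> bool:
    """Return True if any rule has both its time and its change threshold met."""
    return any(
        seconds_since_last_save >= seconds and changes_since_last_save >= changes
        for seconds, changes in save_points
    )


if __name__ == "__main__":
    # Busy server: 10000 changes within about a minute triggers the 60s rule.
    print(should_snapshot(DEFAULT_SAVE_POINTS, 61, 10000))
    # Moderate traffic: 50 changes are not enough for the 300s rule yet.
    print(should_snapshot(DEFAULT_SAVE_POINTS, 3599, 50))
    # Idle server: no changes means no snapshot, however long it has been.
    print(should_snapshot(DEFAULT_SAVE_POINTS, 10_000, 0))
```

The rules are alternatives, not a conjunction: a quiet instance is covered by the hourly rule, while a busy one is snapshotted far more often.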
Manual commands:
SAVE # blocking — main process does the snapshot. Avoid in prod.
BGSAVE # non-blocking — fork() child writes. Use this.
LASTSAVE # unix timestamp of last successful save
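Because BGSAVE returns before the snapshot is finished, a script that needs a completed snapshot has to poll LASTSAVE. The sketch below shows one way to do that. `bgsave_and_wait` and `SnapshotClient` are hypothetical names, and the client is assumed to expose `bgsave()` and `lastsave()` methods; redis-py's `redis.Redis` offers methods with those names, but check your client version before relying on this.

```python
import time
from datetime import datetime
from typing import Protocol


class SnapshotClient(Protocol):
    """Assumed client shape: anything with bgsave() and lastsave()."""

    def bgsave(self) -> object: ...

    def lastsave(self) -> datetime: ...


def bgsave_and_wait(
    client: SnapshotClient,
    timeout: float = 30.0,
    poll_interval: float = 0.1,
) -> datetime:
    """Start a background save and block until LASTSAVE reports a new value."""
    before = client.lastsave()
    client.bgsave()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        current = client.lastsave()
        # LASTSAVE has one-second resolution, so a snapshot that completes in
        # the same second as the previous one would not be detected here.
        if current != before:
            return current
        time.sleep(poll_interval)
    raise TimeoutError(f"snapshot did not finish within {timeout} seconds")
```

A caller would pass in a connected client, for example `bgsave_and_wait(redis.Redis())` with redis-py, and handle the error the server returns if another background save is already running.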
Tradeoffs
| Pro | Con |
|---|---|
| Compact single file | Data loss between snapshots |
| Fast restart from snapshot | fork() can stall on huge datasets |
| Great for backups / replication | No fine-grained recovery point |
If your dataset is 50 GB on a memory-pressured box, fork() itself can stall the main process, because the OS has to copy the page tables for all of that memory before the call returns. That’s a real concern at scale.
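A rough back-of-the-envelope estimate shows why. The sketch below assumes 4 KiB pages and 8-byte page table entries, which is typical for x86-64 Linux without huge pages. It counts only the lowest-level entries, so treat the result as an order of magnitude rather than a measurement. The function name is made up for this example.

```python
PAGE_SIZE = 4 * 1024  # assumed: 4 KiB pages
PTE_SIZE = 8          # assumed: 8 bytes per page table entry
GIB = 1024 ** 3
MIB = 1024 ** 2


def page_table_bytes(dataset_bytes: int,
                     page_size: int = PAGE_SIZE,
                     pte_size: int = PTE_SIZE) -> int:
    """Estimate the size of the page table entries that map `dataset_bytes`."""
    pages = -(-dataset_bytes // page_size)  # ceiling division
    return pages * pte_size


if __name__ == "__main__":
    estimate = page_table_bytes(50 * GIB)
    print(f"{estimate / MIB:.0f} MiB of page table entries to copy")  # 100 MiB
```

Under those assumptions, a 50 GiB dataset means roughly 100 MiB of page table entries that fork() must duplicate while the main process is paused.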
When RDB alone is fine
- You use Redis as a cache — losing the last few minutes is no big deal, you’ll repopulate from the source of truth.
- You take periodic backups and ship dump.rdb to S3.
- You want fast restarts and don’t need durability for every write.
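For the backup case, a small script can copy the snapshot to a timestamped file before it is shipped elsewhere. The sketch below is one possible approach: `backup_rdb`, the directory layout, and the file naming are all assumptions, and the upload step (to S3 or anywhere else) is left out on purpose.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional


def backup_rdb(rdb_path: Path, backup_dir: Path,
               now: Optional[datetime] = None) -> Path:
    """Copy the RDB file into `backup_dir` under a UTC-timestamped name."""
    now = now or datetime.now(timezone.utc)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"dump-{now:%Y%m%dT%H%M%SZ}.rdb"
    shutil.copy2(rdb_path, target)  # copy2 also preserves the modification time
    return target
```

For example, `backup_rdb(Path("/var/lib/redis/dump.rdb"), Path("/backups/redis"))` would produce a file such as `/backups/redis/dump-20240102T030405Z.rdb`; both paths are placeholders, so adjust them to wherever your instance writes its snapshot.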
If you do need stronger durability, pair RDB with AOF — that’s hybrid persistence, the recommended setup.