Cloud Storage and Databases - DevOps

The cloud offers a dozen different ways to store data, and picking the wrong one is a fast way to burn money or build something painfully slow. Let’s break down the main categories and when to use each.

Object Storage

Think of object storage as a giant key-value store for files. We upload a file, get a URL, and that’s it. No folders (the “folders” in S3 are just prefixes in the key name), no file system, no mounting.
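
To make the "folders are just prefixes" point concrete, here is a minimal Python sketch that emulates a folder-style listing over a flat set of keys. The key names and the `list_level` helper are made up for illustration. The idea mirrors how S3 listings group keys by a prefix and a delimiter, but this is plain Python, not the S3 API.

```python
# Hypothetical object keys in a bucket. The store itself is flat:
# there are no directories, only strings that happen to contain "/".
keys = [
    "backups/backup-2024-03-15.tar.gz",
    "backups/backup-2024-03-16.tar.gz",
    "public/logo.png",
    "public/css/site.css",
    "index.html",
]


def list_level(keys, prefix="", delimiter="/"):
    """Return (objects, common_prefixes) that sit directly under `prefix`."""
    objects, common_prefixes = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter in rest:
            # Deeper key: report only its next "folder" segment.
            common_prefixes.add(prefix + rest.split(delimiter, 1)[0] + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(common_prefixes)


root_objects, root_folders = list_level(keys)
public_objects, public_folders = list_level(keys, prefix="public/")

print(root_objects, root_folders)
print(public_objects, public_folders)
```

The "folders" only appear at listing time, computed from the key strings. Nothing has to be created before a key like `public/css/site.css` can be written.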

Use for: static assets (images, videos), backups, log archives, data lake files, static website hosting.

Examples: AWS S3, Google Cloud Storage, Azure Blob Storage.

# upload a file to S3
aws s3 cp backup.tar.gz s3://my-bucket/backups/backup-2024-03-15.tar.gz

# make a file publicly readable (careful! only works if the bucket allows ACLs and public access; new buckets block both by default)
aws s3api put-object-acl --bucket my-bucket \
  --key public/logo.png --acl public-read

# sync a directory (like rsync for S3)
aws s3 sync ./dist s3://my-website-bucket --delete

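Object storage is insanely cheap for large amounts of data. S3 Standard costs about $0.023 per GB per month (US regions, first pricing tier). For rarely accessed data, S3 Glacier drops to roughly $0.004 per GB per month, in exchange for retrieval fees and, depending on the Glacier class, slower retrieval.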
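
As a rough worked example of what those per-GB prices mean, the sketch below compares the monthly storage bill for a hypothetical 5 TB archive. The prices are the approximate figures quoted above. The archive size and the `monthly_cost` helper are assumptions for illustration, and request, retrieval, and transfer charges are deliberately left out.

```python
# Approximate per-GB monthly prices from the text; real prices vary by
# region and change over time, so treat these as placeholders.
PRICE_PER_GB_MONTH = {
    "s3_standard": 0.023,
    "s3_glacier": 0.004,
}


def monthly_cost(size_gb, storage_class):
    """Storage-only cost per month (no request or retrieval fees)."""
    return size_gb * PRICE_PER_GB_MONTH[storage_class]


size_gb = 5000  # hypothetical 5 TB archive, counting 1 TB as 1000 GB

standard = monthly_cost(size_gb, "s3_standard")
glacier = monthly_cost(size_gb, "s3_glacier")

print(f"Standard: ${standard:.2f}/month")
print(f"Glacier:  ${glacier:.2f}/month")
```

Under these assumptions the archive costs about $115 a month in Standard and about $20 in Glacier, which is why cold data usually gets moved to a cheaper class.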

Block Storage

Block storage is a virtual hard drive that we attach to a VM. It shows up as a disk device and we can format it with any filesystem.

Use for: OS disks, database storage, anything that needs a traditional filesystem.

Examples: AWS EBS, Google Persistent Disk, Azure Managed Disks.

The key difference from object storage: block storage is normally attached to one instance at a time and provides low-latency, high-IOPS access. Object storage is accessed over HTTP and is meant for bulk data.

File Storage (Shared)

Sometimes multiple servers need to read/write the same files. That’s where network file storage comes in — it’s like NFS in the cloud.

Use for: shared configuration, CMS media directories, legacy apps that expect a filesystem.

Examples: AWS EFS, Google Filestore, Azure Files.

Storage Comparison

Picking the Right Storage:

Object (S3): files via HTTP, unlimited scale, cheapest
Block (EBS): virtual disk, one VM, fastest IOPS
File (EFS): shared NFS, multiple VMs, more expensive
Database: structured data, queries, transactions

Managed Databases

Instead of installing PostgreSQL on an EC2 instance and managing backups, patches, and replication ourselves, we can use a managed database service. The provider handles the infra; we handle the data.

Relational (SQL):

AWS RDS — MySQL, PostgreSQL, MariaDB, Oracle, SQL Server
Google Cloud SQL — MySQL, PostgreSQL, SQL Server
Azure Database for MySQL, Azure Database for PostgreSQL, and Azure SQL Database (SQL Server engine)

NoSQL:

DynamoDB (AWS) — key-value / document, single-digit ms latency
Firestore (GCP) — document database for apps
Cosmos DB (Azure) — multi-model, globally distributed

Why managed? Automatic backups, point-in-time recovery, read replicas, automatic failover, and patching. All the ops stuff that keeps DBAs up at night — handled for us.

Caching

For data that’s read far more often than it’s written, putting a cache in front of the database can cut response times sharply, for example from around 50 ms to around 1 ms.

# common pattern:
# 1. Check Redis cache
# 2. If hit → return cached data (fast!)
# 3. If miss → query database → store in cache → return

AWS ElastiCache — managed Redis or Memcached
Google Memorystore — managed Redis
Azure Cache for Redis — managed Redis
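
The check-cache, fall-back-to-database pattern described in this section (often called cache-aside) can be sketched in Python as follows. To keep it runnable without any services, a small in-memory `TTLCache` class stands in for Redis and `query_database` stands in for a real query. Both are hypothetical, as are the key format and the 60-second TTL.

```python
import time


class TTLCache:
    """In-memory stand-in for Redis: get/set with an expiry time."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._data[key]  # expired, treat as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._data[key] = (value, time.monotonic() + ttl_seconds)


stats = {"db_queries": 0}


def query_database(user_id):
    """Hypothetical slow database lookup."""
    stats["db_queries"] += 1
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache()


def get_user(user_id, ttl_seconds=60):
    key = f"user:{user_id}"
    cached = cache.get(key)            # 1. check the cache
    if cached is not None:
        return cached                  # 2. hit: return cached data
    user = query_database(user_id)     # 3. miss: query the database,
    cache.set(key, user, ttl_seconds)  #    store the result,
    return user                        #    and return it


first = get_user(42)   # miss, goes to the database
second = get_user(42)  # hit, served from the cache
print(first, second, stats["db_queries"])
```

The TTL is the main design knob here: a longer TTL means fewer database queries but staler data, so writes that must be visible immediately usually also delete or update the cached key.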

Choosing the Right Storage

# Quick decision guide:
# Storing user uploads, images, backups?    → Object storage (S3)
# Need a disk for a VM or database?         → Block storage (EBS)
# Multiple servers need shared files?       → File storage (EFS)
# Structured data with queries?             → Managed database (RDS)
# Need sub-millisecond reads?               → Cache (Redis)
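
The same guide can be written as a small lookup table, sketched here in Python. The need names and the `recommend` helper are invented for illustration. The point is only that a real system usually ticks several boxes at once and ends up combining storage types.

```python
# The decision guide as data: each need maps to one storage type.
GUIDE = {
    "uploads_or_backups": "Object storage (S3)",
    "vm_or_database_disk": "Block storage (EBS)",
    "shared_files": "File storage (EFS)",
    "structured_queries": "Managed database (RDS)",
    "sub_millisecond_reads": "Cache (Redis)",
}


def recommend(needs):
    """Return the storage types matching a set of needs, in guide order."""
    unknown = sorted(set(needs) - set(GUIDE))
    if unknown:
        raise ValueError(f"unknown needs: {unknown}")
    return [GUIDE[need] for need in GUIDE if need in needs]


# Hypothetical web app: user uploads, relational data, and a hot read path.
web_app = recommend({"uploads_or_backups", "structured_queries", "sub_millisecond_reads"})
print(web_app)
```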

In short, cloud storage is about matching the tool to the job: S3 for files, EBS for disks, EFS for shared files, RDS for structured data, Redis for speed. Using the wrong type usually still works, but it costs more and performs worse.