Message Queues - High-Level Design

A message queue is a component that sits between two services and holds messages until the receiver is ready to process them. Think of it like a mailbox — the sender drops a letter in, and the receiver picks it up when they’re available. The sender doesn’t have to wait for the receiver to be home.

This is the foundation of asynchronous processing — doing work later instead of right now.

Why We Need Message Queues

Without a queue, services talk to each other synchronously. Service A calls Service B and waits. If B is slow or down, A is stuck.

With a queue:

A drops a message in the queue and moves on immediately
B picks it up whenever it’s ready
If B crashes before it finishes, the unacknowledged message stays in the queue (assuming the queue itself is durable), so the work is not lost

This gives us decoupling, resilience, and scalability.

The Core Pattern

Producer → Queue → Consumer

Producer → [ msg3 | msg2 | msg1 ] → Consumer

Producer sends messages at its own pace
Consumer processes messages at its own pace
Queue holds messages in between

Producer — The service that creates and sends messages
Queue — The buffer that holds messages
Consumer — The service that reads and processes messages
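The three roles above can be sketched with Python's standard library. This is a minimal in-process illustration, not a real broker: `queue.Queue` stands in for the message queue, and the names `producer`, `consumer`, and `run_demo` are made up for this example.

```python
import queue
import threading


def producer(q: queue.Queue, count: int) -> None:
    """Send `count` messages without waiting for the consumer."""
    for i in range(1, count + 1):
        q.put(f"msg{i}")
    q.put(None)  # sentinel: tells the consumer there is nothing more to read


def consumer(q: queue.Queue, processed: list) -> None:
    """Read messages in arrival order until the sentinel shows up."""
    while True:
        msg = q.get()  # blocks until a message is available
        if msg is None:
            break
        processed.append(msg.upper())  # stand-in for real work


def run_demo(count: int = 3) -> list:
    q = queue.Queue()
    processed = []
    worker = threading.Thread(target=consumer, args=(q, processed))
    worker.start()
    producer(q, count)  # the producer never calls the consumer directly
    worker.join()
    return processed


if __name__ == "__main__":
    print(run_demo())  # ['MSG1', 'MSG2', 'MSG3']
```

The producer and consumer only share the queue, which is what lets each side run at its own pace.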

Point-to-Point vs Pub/Sub

Point-to-Point (Queue) — Each message is consumed by exactly one consumer. Once processed, the message is removed. Like a task queue where each task is done once.

Pub/Sub (Topic) — Each message can be consumed by multiple subscribers. The message stays available for all subscribers. Like a broadcast — everyone who’s listening gets the message.

Pattern	Delivery	Use Case
Point-to-Point	One consumer per message	Task queues, job processing
Pub/Sub	All subscribers get every message	Notifications, event streaming, analytics
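The difference between the two patterns can be shown with a small in-memory sketch. The classes `PointToPointQueue` and `Topic` are hypothetical names for this illustration and do not correspond to any particular broker's API.

```python
from collections import deque


class PointToPointQueue:
    """Each message is handed to exactly one receiver."""

    def __init__(self):
        self._messages = deque()

    def send(self, msg):
        self._messages.append(msg)

    def receive(self):
        # Taking a message removes it, so no other consumer can get it.
        return self._messages.popleft() if self._messages else None


class Topic:
    """Each subscriber gets its own copy of every published message."""

    def __init__(self):
        self._inboxes = {}  # subscriber name -> that subscriber's pending messages

    def subscribe(self, name):
        self._inboxes[name] = deque()

    def publish(self, msg):
        for inbox in self._inboxes.values():
            inbox.append(msg)

    def receive(self, name):
        inbox = self._inboxes[name]
        return inbox.popleft() if inbox else None


def demo():
    tasks = PointToPointQueue()
    tasks.send("resize-image-1")
    first_worker = tasks.receive()
    second_worker = tasks.receive()  # nothing left: the task was already taken

    events = Topic()
    events.subscribe("email")
    events.subscribe("analytics")
    events.publish("order placed")
    return first_worker, second_worker, events.receive("email"), events.receive("analytics")


if __name__ == "__main__":
    print(demo())  # ('resize-image-1', None, 'order placed', 'order placed')
```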

When to Use Message Queues

Decoupling services — The order service doesn’t need to know about the email service. It just publishes “order placed” and moves on. The email service subscribes and sends the confirmation.

Handling traffic spikes — During a flash sale, we get 100x the normal orders. The queue absorbs the spike. Workers process orders at a steady rate.
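A rough way to see the queue absorbing a spike is to track the backlog tick by tick. The function `simulate_backlog` and the numbers below are invented for illustration only; real systems would measure queue depth from the broker.

```python
def simulate_backlog(arrivals_per_tick, capacity_per_tick):
    """Return the queue depth at the end of each tick.

    arrivals_per_tick: messages produced during each tick
    capacity_per_tick: messages the workers can process per tick (steady rate)
    """
    backlog = 0
    history = []
    for arrivals in arrivals_per_tick:
        backlog += arrivals                       # producers enqueue at their own pace
        backlog -= min(backlog, capacity_per_tick)  # workers drain at a fixed rate
        history.append(backlog)
    return history


if __name__ == "__main__":
    # Normal load is 10 per tick; the second tick is a 100x spike.
    print(simulate_backlog([10, 1000, 10, 10, 10, 10], capacity_per_tick=300))
    # [0, 700, 410, 120, 0, 0]
```

The workers never see more than 300 messages per tick; the spike shows up as temporary queue depth (and added latency) rather than as overload.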

Retry and error handling — If processing fails, the message goes back to the queue. It’ll be retried instead of lost. We can even have a dead letter queue (DLQ) for messages that fail repeatedly.
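The retry-then-DLQ flow can be sketched as below. This is a simplified in-memory model under assumed rules: the limit `MAX_ATTEMPTS`, the function `drain`, and the sample handler are all hypothetical, and real brokers usually add delays or backoff between attempts.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed limit; real systems make this configurable


def drain(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Process every message, retrying failures and dead-lettering repeat offenders."""
    main_queue = deque((msg, 0) for msg in messages)  # (message, failed attempts so far)
    done, dead_letters = [], []
    while main_queue:
        msg, attempts = main_queue.popleft()
        try:
            handler(msg)
            done.append(msg)
        except Exception:
            attempts += 1
            if attempts >= max_attempts:
                dead_letters.append(msg)           # give up: park it for inspection
            else:
                main_queue.append((msg, attempts))  # back to the queue for a retry
    return done, dead_letters


def make_handler():
    """Sample handler: 'bad' always fails, 'flaky' fails only the first time."""
    failed_once = set()

    def handler(msg):
        if msg == "bad":
            raise ValueError("cannot parse message")
        if msg == "flaky" and msg not in failed_once:
            failed_once.add(msg)
            raise TimeoutError("temporary failure")

    return handler


if __name__ == "__main__":
    print(drain(["ok", "flaky", "bad"], make_handler()))
    # (['ok', 'flaky'], ['bad'])
```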

Heavy async work — Sending emails, generating reports, processing images, encoding videos — none of these need to happen during the user’s request. Drop a message in the queue and respond to the user immediately.

Popular Tools

Kafka

Distributed event streaming platform
Extremely high throughput (a well-provisioned cluster can handle millions of messages/sec)
Messages are persisted to disk and retained for days/weeks
Consumers can replay messages from any offset (or timestamp) that is still within the retention window
Great for: event sourcing, log aggregation, real-time analytics
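Replay works because a log-based system keeps the records and only tracks a position (offset) per consumer. The sketch below is a toy in-memory model of that idea; `EventLog` and its methods are hypothetical and are not the Kafka client API.

```python
class EventLog:
    """Append-only log where each consumer group tracks its own read position."""

    def __init__(self):
        self._records = []
        self._next_offset = {}  # consumer group -> offset of the next record to read

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1  # offset assigned to this record

    def poll(self, group, max_records=10):
        start = self._next_offset.get(group, 0)
        batch = self._records[start:start + max_records]
        self._next_offset[group] = start + len(batch)  # reading does not delete anything
        return batch

    def seek(self, group, offset):
        """Rewind (or fast-forward) a group to an arbitrary retained offset."""
        self._next_offset[group] = offset


def demo():
    log = EventLog()
    for event in ["signup", "login", "purchase"]:
        log.append(event)
    first_read = log.poll("analytics")
    nothing_new = log.poll("analytics")
    log.seek("analytics", 1)  # replay from the second record
    replayed = log.poll("analytics")
    return first_read, nothing_new, replayed


if __name__ == "__main__":
    print(demo())
    # (['signup', 'login', 'purchase'], [], ['login', 'purchase'])
```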

RabbitMQ

Traditional message broker
Supports complex routing (exchanges, bindings)
Messages are removed once a consumer acknowledges them (classic queues)
Lower throughput than Kafka but more flexible routing
Great for: task queues, RPC patterns, complex routing

Amazon SQS

Fully managed queue service from AWS
No infrastructure to manage
Two flavors: Standard (at-least-once delivery, best-effort ordering) and FIFO (ordered within a message group, exactly-once processing via deduplication)
Great for: AWS-native apps, simple queuing needs
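With at-least-once delivery, the same message can occasionally arrive twice, so consumers are usually written to be idempotent. A minimal sketch of that idea, assuming each message carries a unique `id` field (the class `IdempotentConsumer` and the message shape are made up for this example, and a real service would keep the seen IDs in durable storage):

```python
class IdempotentConsumer:
    """Applies each message at most once, even if it is delivered several times."""

    def __init__(self):
        self._seen_ids = set()  # in-memory for the sketch; would be a database in practice
        self.total_charged = 0

    def handle(self, message):
        if message["id"] in self._seen_ids:
            return False  # duplicate delivery: skip the side effect
        self._seen_ids.add(message["id"])
        self.total_charged += message["amount"]
        return True


def demo():
    consumer = IdempotentConsumer()
    deliveries = [
        {"id": "m1", "amount": 30},
        {"id": "m2", "amount": 50},
        {"id": "m1", "amount": 30},  # redelivered copy of m1
    ]
    results = [consumer.handle(m) for m in deliveries]
    return results, consumer.total_charged


if __name__ == "__main__":
    print(demo())  # ([True, True, False], 80)
```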

Quick Comparison

Feature	Kafka	RabbitMQ	SQS
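Throughput	Very high	Medium	Very high (Standard); lower (FIFO)
Message retention	Configurable (days/weeks typical)	Until acknowledged	Up to 14 days (default 4)
Ordering	Per partition	Per queue	FIFO queues only (per message group)
Replay	Yes (within retention)	No (classic queues)	No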
Managed option	Confluent Cloud	CloudAMQP	AWS native

Message Queues in System Design

In interviews, bring up message queues whenever we have:

Work that doesn’t need to happen immediately
Services that should be independent of each other
Traffic that’s bursty or unpredictable
Operations that might fail and need retries

A common pattern in system design interviews:

User uploads video → API Server → Queue → Video Processing Workers → Store in S3
                         ↓
                   Return "processing..." to user immediately
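The flow in the diagram can be sketched in a single process. This is an illustration only: `queue.Queue` stands in for the real queue, a dict stands in for S3, and `handle_upload`, `encode_worker`, and `run_demo` are hypothetical names.

```python
import queue
import threading


def handle_upload(jobs: queue.Queue, video_id: str) -> dict:
    """API handler: enqueue the job and answer right away."""
    jobs.put(video_id)
    return {"video_id": video_id, "status": "processing"}


def encode_worker(jobs: queue.Queue, storage: dict) -> None:
    """Background worker: pull jobs and store the result."""
    while True:
        video_id = jobs.get()
        if video_id is None:  # sentinel used to stop the worker in this demo
            jobs.task_done()
            break
        storage[video_id] = f"encoded/{video_id}.mp4"  # stand-in for encoding + upload
        jobs.task_done()


def run_demo():
    jobs = queue.Queue()
    storage = {}  # stand-in for S3
    worker = threading.Thread(target=encode_worker, args=(jobs, storage))
    worker.start()

    response = handle_upload(jobs, "vid-42")  # returns before encoding finishes

    jobs.join()     # demo only: wait so the result can be inspected
    jobs.put(None)
    worker.join()
    return response, storage


if __name__ == "__main__":
    print(run_demo())
    # ({'video_id': 'vid-42', 'status': 'processing'}, {'vid-42': 'encoded/vid-42.mp4'})
```

The handler's response does not depend on the worker at all, which is the point of putting the queue between them.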

In simple language, a message queue lets us say “I’ll deal with this later” instead of doing everything right now. It makes our systems more resilient, more scalable, and better at handling the unpredictable real world.