Message Queues


A message queue is a component that sits between two services and holds messages until the receiver is ready to process them. Think of it like a mailbox — the sender drops a letter in, and the receiver picks it up when they’re available. The sender doesn’t have to wait for the receiver to be home.

This is the foundation of asynchronous processing — doing work later instead of right now.

Why We Need Message Queues

Without a queue, services talk to each other synchronously. Service A calls Service B and waits. If B is slow or down, A is stuck.

With a queue:

  • A drops a message in the queue and moves on immediately
  • B picks it up whenever it’s ready
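  • If B crashes, the message stays in the queue (or is redelivered), provided the queue is durable and B acknowledges a message only after it has finished processing it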

This gives us decoupling, resilience, and scalability.
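The "B crashes and nothing is lost" guarantee usually rests on acknowledgements: the queue hands a message out but only deletes it once the consumer confirms it is done. Here is a minimal in-memory sketch of that idea. The `AckQueue` class and its method names are made up for illustration and are not any real broker's API.

```python
from collections import deque


class AckQueue:
    """Toy in-memory queue: a message is removed only after it is acknowledged."""

    def __init__(self):
        self._ready = deque()    # messages waiting for a consumer
        self._in_flight = {}     # delivery_id -> message handed out but not yet acked
        self._next_id = 0

    def send(self, message):
        self._ready.append(message)

    def receive(self):
        """Hand out the next message, or None if nothing is ready."""
        if not self._ready:
            return None
        self._next_id += 1
        message = self._ready.popleft()
        self._in_flight[self._next_id] = message
        return self._next_id, message

    def ack(self, delivery_id):
        """The consumer finished the work, so the message can really go away."""
        del self._in_flight[delivery_id]

    def requeue_unacked(self):
        """Roughly what a broker does when a consumer disconnects without acking."""
        for message in reversed(list(self._in_flight.values())):
            self._ready.appendleft(message)   # put back at the front, original order kept
        self._in_flight.clear()


q = AckQueue()
q.send("order-1")                  # A sends and moves on

delivery_id, msg = q.receive()     # B picks the message up...
q.requeue_unacked()                # ...and crashes before calling q.ack(delivery_id)

redelivered_id, redelivered = q.receive()   # B restarts and gets the same message again
q.ack(redelivered_id)
```

If B acknowledged the message before doing the work, a crash in between would lose it, which is why the ack usually comes last.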

The Core Pattern

Producer → Queue → Consumer

Producer → [ msg3 | msg2 | msg1 ] → Consumer

The producer sends messages at its own pace, the consumer processes them at its own pace, and the queue holds whatever is in between. The three roles:
  • Producer — The service that creates and sends messages
  • Queue — The buffer that holds messages
  • Consumer — The service that reads and processes messages
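To make the three roles concrete, here is a small sketch using Python's standard `queue.Queue` and a thread as the consumer. It only models the pattern inside one process. The `STOP` sentinel and the `run` helper are assumptions for this example, and real systems would put a broker between two separate services.

```python
import queue
import threading

STOP = object()  # sentinel telling the consumer there is no more work


def producer(q, count):
    for i in range(1, count + 1):
        q.put(f"msg{i}")          # returns as soon as the message is buffered
    q.put(STOP)


def consumer(q, processed):
    while True:
        message = q.get()         # blocks until a message is available
        if message is STOP:
            break
        processed.append(message.upper())   # stand-in for real work


def run(count=3):
    q = queue.Queue()
    processed = []
    worker = threading.Thread(target=consumer, args=(q, processed))
    worker.start()
    producer(q, count)            # the producer never waits for the consumer
    worker.join()
    return processed
```

Calling `run(3)` pushes three messages through the queue and returns them in the order they were sent, since a single consumer reads a FIFO queue.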

Point-to-Point vs Pub/Sub

Point-to-Point (Queue) — Each message is consumed by exactly one consumer. Once processed, the message is removed. Like a task queue where each task is done once.

Pub/Sub (Topic) — Each message can be consumed by multiple subscribers. The message stays available for all subscribers. Like a broadcast — everyone who’s listening gets the message.

Pattern        | Delivery                          | Use Case
Point-to-Point | One consumer per message          | Task queues, job processing
Pub/Sub        | All subscribers get every message | Notifications, event streaming, analytics
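The difference between the two patterns comes down to who gets a copy. A rough in-memory sketch follows. `WorkQueue` and `Topic` are illustrative names, not classes from any messaging library.

```python
from collections import deque


class WorkQueue:
    """Point-to-point: each message is taken by exactly one consumer."""

    def __init__(self):
        self._messages = deque()

    def send(self, message):
        self._messages.append(message)

    def receive(self):
        return self._messages.popleft() if self._messages else None


class Topic:
    """Pub/sub: every subscriber gets its own copy of each message."""

    def __init__(self):
        self._subscribers = {}

    def subscribe(self, name):
        self._subscribers[name] = deque()
        return self._subscribers[name]

    def publish(self, message):
        for inbox in self._subscribers.values():
            inbox.append(message)


jobs = WorkQueue()
jobs.send("resize-image-1")
worker_a_got = jobs.receive()     # worker A takes the task
worker_b_got = jobs.receive()     # worker B finds nothing left

events = Topic()
email_inbox = events.subscribe("email")
analytics_inbox = events.subscribe("analytics")
events.publish("order placed")    # both inboxes receive a copy
```

Note that in this sketch a subscriber only sees messages published after it subscribed. Whether late subscribers can catch up depends on the broker.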

When to Use Message Queues

Decoupling services — The order service doesn’t need to know about the email service. It just publishes “order placed” and moves on. The email service subscribes and sends the confirmation.

Handling traffic spikes — During a flash sale, we get 100x the normal orders. The queue absorbs the spike. Workers process orders at a steady rate.
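A tiny simulation shows what "absorbing the spike" means in numbers. The arrival counts and the worker rate below are invented for illustration. The point is only that the backlog grows during the burst and drains afterwards, which works as long as average traffic stays below worker capacity.

```python
def simulate(arrivals_per_tick, worker_rate):
    """Return the queue depth after each tick.

    arrivals_per_tick: messages enqueued in each time step
    worker_rate: most messages the workers can finish per time step
    """
    depth = 0
    history = []
    for arrived in arrivals_per_tick:
        depth += arrived
        depth -= min(depth, worker_rate)   # workers never process more than is queued
        history.append(depth)
    return history


# Normal load is 10 orders per tick; one tick spikes to 1000 (100x).
arrivals = [10, 1000, 10, 10, 10]
depths = simulate(arrivals, worker_rate=300)
# depths == [0, 700, 410, 120, 0]
```

The workers keep running at 300 per tick the whole time. The spike shows up as queue depth (and therefore extra latency) rather than as failed requests.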

Retry and error handling — If processing fails, the message goes back to the queue. It’ll be retried instead of lost. We can even have a dead letter queue (DLQ) for messages that fail repeatedly.
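Here is one way the retry-then-DLQ flow could look, as a single-process sketch. The limit of three attempts, the `drain` function, and the `flaky_handler` are all assumptions for this example. Real brokers expose this as configuration (for instance a maximum receive count) rather than application code.

```python
from collections import deque

MAX_ATTEMPTS = 3  # assumed limit; real brokers make this configurable


def drain(messages, handler, max_attempts=MAX_ATTEMPTS):
    """Process messages, retrying failures and parking repeat offenders in a DLQ."""
    main_queue = deque((message, 1) for message in messages)   # (message, attempt number)
    done, dead_letters = [], []
    while main_queue:
        message, attempt = main_queue.popleft()
        try:
            handler(message)
            done.append(message)
        except Exception:
            if attempt < max_attempts:
                main_queue.append((message, attempt + 1))   # back on the queue for a retry
            else:
                dead_letters.append(message)                # give up, keep it for inspection
    return done, dead_letters


calls = {}


def flaky_handler(message):
    """Hypothetical handler: 'b' fails once, 'poison' always fails."""
    calls[message] = calls.get(message, 0) + 1
    if message == "poison" or (message == "b" and calls[message] == 1):
        raise ValueError(f"cannot process {message}")


done, dead_letters = drain(["a", "b", "poison"], flaky_handler)
# done == ["a", "b"], dead_letters == ["poison"]
```

The DLQ keeps a poison message from being retried forever and blocking everything behind it, while still leaving it around for someone to inspect.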

Heavy async work — Sending emails, generating reports, processing images, encoding videos — none of these need to happen during the user’s request. Drop a message in the queue and respond to the user immediately.

Kafka

  • Distributed event streaming platform
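  • Very high throughput (a well-sized cluster can handle millions of messages/sec)
  • Messages are persisted to disk and kept for a configurable retention period (typically days or weeks), even after they are read
  • Consumers track their own position (offset) and can replay from any offset that is still within retention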
  • Great for: event sourcing, log aggregation, real-time analytics
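Replay is possible because reading from a log does not delete anything: each consumer just remembers an offset. The sketch below is a toy model of a single partition, not the Kafka client API. `PartitionLog` and `LogConsumer` are hypothetical names.

```python
class PartitionLog:
    """Toy model of one partition: an append-only list addressed by offset."""

    def __init__(self):
        self._records = []

    def append(self, record):
        self._records.append(record)
        return len(self._records) - 1       # offset of the new record

    def read(self, offset, max_records=10):
        """Reading does not delete anything; it just slices the log."""
        return self._records[offset:offset + max_records]


class LogConsumer:
    def __init__(self, log):
        self._log = log
        self.offset = 0                     # position of the next record to read

    def poll(self):
        batch = self._log.read(self.offset)
        self.offset += len(batch)
        return batch

    def seek(self, offset):
        self.offset = offset                # rewind (or skip ahead) to replay


log = PartitionLog()
for event in ["created", "paid", "shipped"]:
    log.append(event)

consumer = LogConsumer(log)
first_pass = consumer.poll()                # all three events
consumer.seek(1)                            # replay from the second record
replayed = consumer.poll()                  # ["paid", "shipped"]
```

Because the offset belongs to the consumer, two different consumers can read the same log at different speeds without affecting each other.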

RabbitMQ

  • Traditional message broker
  • Supports complex routing (exchanges, bindings)
  • Messages are removed from a queue once a consumer acknowledges them
  • Lower throughput than Kafka but more flexible routing
  • Great for: task queues, RPC patterns, complex routing
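Routing through exchanges and bindings can be pictured as a lookup table from routing keys to queues. The sketch below models only the simplest case, an exact-match ("direct") exchange, in plain Python. It is not the RabbitMQ client API, and the queue names and routing keys are invented.

```python
from collections import defaultdict


class DirectExchange:
    """Toy model of a direct exchange: route by exact routing-key match."""

    def __init__(self):
        self._bindings = defaultdict(list)    # routing key -> bound queues

    def bind(self, queue, routing_key):
        self._bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        """Copy the message into every queue bound with this key."""
        queues = self._bindings.get(routing_key, [])
        for queue in queues:
            queue.append(message)
        return len(queues)                    # 0 means the message was unroutable


email_queue, audit_queue = [], []
exchange = DirectExchange()
exchange.bind(email_queue, "order.placed")
exchange.bind(audit_queue, "order.placed")
exchange.bind(audit_queue, "order.cancelled")

exchange.publish("order.placed", "order 42 placed")
exchange.publish("order.cancelled", "order 7 cancelled")
unrouted = exchange.publish("order.refunded", "order 9 refunded")
```

The producer only knows the exchange and a routing key. Which queues receive the message is decided by the bindings, so consumers can be added or removed without touching the producer.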

Amazon SQS

  • Fully managed queue service from AWS
  • No infrastructure to manage
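  • Two flavors: Standard (at-least-once delivery, best-effort ordering, so duplicates are possible) and FIFO (ordered within a message group, exactly-once processing via deduplication)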
  • Great for: AWS-native apps, simple queuing needs
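At-least-once delivery means a consumer may see the same message twice, so handlers are usually written to be idempotent. One common approach is to remember which message IDs have been processed. This is a sketch under that assumption. The message shape and the in-memory `set` are illustrative, and a real system would keep the seen IDs in a durable store.

```python
class IdempotentConsumer:
    """Skips messages whose ID has already been processed."""

    def __init__(self):
        self._seen_ids = set()    # in production this would live in a durable store
        self.balance = 0

    def handle(self, message):
        """Apply a payment once, even if the queue delivers it twice."""
        if message["id"] in self._seen_ids:
            return False          # duplicate delivery, ignore
        self.balance += message["amount"]
        self._seen_ids.add(message["id"])
        return True


consumer = IdempotentConsumer()
deliveries = [
    {"id": "m1", "amount": 50},
    {"id": "m2", "amount": 20},
    {"id": "m1", "amount": 50},   # same message delivered again
]
results = [consumer.handle(m) for m in deliveries]
# results == [True, True, False], consumer.balance == 70
```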

Quick Comparison

Feature           | Kafka           | RabbitMQ                | SQS
Throughput        | Very high       | Medium                  | High (Standard), lower (FIFO)
Message retention | Days/weeks      | Until consumed          | Up to 14 days
Ordering          | Per partition   | Per queue               | FIFO variant only
Replay            | Yes             | Not with classic queues | No
Managed option    | Confluent Cloud | CloudAMQP               | AWS native

Message Queues in System Design

In interviews, bring up message queues whenever we have:

  • Work that doesn’t need to happen immediately
  • Services that should be independent of each other
  • Traffic that’s bursty or unpredictable
  • Operations that might fail and need retries

A common pattern in system design interviews:

User uploads video → API Server → Queue → Video Processing Workers → Store in S3
                     ↓
                     Return "processing..." to user immediately
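The flow above can be sketched in a few lines. Everything here is a stand-in: `queue.Queue` plays the message queue, a dict plays S3, and `handle_upload`, `encode`, and `worker_step` are hypothetical names. The shape is what matters: the API handler enqueues and returns, and the worker does the slow part later.

```python
import queue
import uuid

video_jobs = queue.Queue()
object_store = {}    # stand-in for S3: key -> processed bytes
job_status = {}      # job_id -> "processing" | "done"


def handle_upload(raw_video: bytes) -> dict:
    """API server: enqueue the job and answer right away."""
    job_id = str(uuid.uuid4())
    job_status[job_id] = "processing"
    video_jobs.put((job_id, raw_video))
    return {"job_id": job_id, "status": "processing"}


def encode(raw_video: bytes) -> bytes:
    """Placeholder for real transcoding work (here it just reverses the bytes)."""
    return raw_video[::-1]


def worker_step() -> bool:
    """Video worker: process one job if any is waiting."""
    try:
        job_id, raw_video = video_jobs.get_nowait()
    except queue.Empty:
        return False
    object_store[f"videos/{job_id}"] = encode(raw_video)
    job_status[job_id] = "done"
    return True


response = handle_upload(b"raw-bytes")
status_before = job_status[response["job_id"]]   # still "processing"
worker_step()
status_after = job_status[response["job_id"]]    # now "done"
```

Returning a job ID gives the client something to poll (or be notified about) later, which is usually the follow-up question in an interview.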

In simple language, a message queue lets us say “I’ll deal with this later” instead of doing everything right now. It makes our systems more resilient, more scalable, and better at handling the unpredictable real world.