Bitmaps & HyperLogLog

advanced redis bitmaps hyperloglog analytics

These two types are how Redis handles “count things” at scale. Both are dramatically more memory-efficient than the naive solutions, and both are interview favorites because they show off Redis’s range beyond plain key-value caching.

Bitmaps — a giant array of bits

A bitmap isn’t really its own type — it’s a string treated as an array of bits. Each bit can be set, cleared, or read by offset. If we have 10 million users and want to track “did user X do action Y today?”, a bitmap uses 1 bit per user = 1.25 MB total. A set of user IDs would be ~80 MB. That’s a 64x win.

# Mark user 42 as logged in today
SETBIT logins:2026-05-26 42 1

# Was user 42 logged in?
GETBIT logins:2026-05-26 42
# (integer) 1

# How many users logged in today?
BITCOUNT logins:2026-05-26
# (integer) 1

Daily Active Users — the killer pattern

SETBIT dau:2026-05-26 42 1
SETBIT dau:2026-05-26 17 1
SETBIT dau:2026-05-26 99 1

BITCOUNT dau:2026-05-26
# (integer) 3

Bit operations across days

BITOP runs AND/OR/XOR across multiple bitmaps:

SETBIT dau:2026-05-24 42 1
SETBIT dau:2026-05-25 42 1
SETBIT dau:2026-05-26 42 1

BITOP AND active:3days dau:2026-05-24 dau:2026-05-25 dau:2026-05-26
BITCOUNT active:3days
# Users active on ALL 3 days — built-in retention analysis

In simple language: AND across daily bitmaps gives us a retention cohort with zero application code.

Bitmap: dau:2026-05-26
0
0
1
0
1
0
0
… up to user N …
1
bit offset = user ID, BITCOUNT = active users

Caveats

  • The offset is a uint32 — max ~4 billion bits.
  • Bitmaps work best when IDs are dense and sequential. Sparse IDs (UUIDs as strings → hash to int) waste bits.
  • SETBIT key 1000000 1 on an empty key allocates 125 KB at once. Plan for that.

HyperLogLog — counting uniques in 12KB

HyperLogLog (HLL) is a probabilistic data structure. It estimates the cardinality (number of unique elements) of a stream with ~0.81% standard error, using a fixed 12 KB regardless of input size.

That’s the magic: count uniques across 1 billion items in 12 KB. A set storing those would need gigabytes.

PFADD visitors:2026-05-26 "user:42" "user:17" "user:99"
PFADD visitors:2026-05-26 "user:42"        # already counted

PFCOUNT visitors:2026-05-26
# (integer) 3   — approximate

Combining multiple HLLs

PFADD visitors:2026-05-24 "user:42" "user:50"
PFADD visitors:2026-05-25 "user:42" "user:99"
PFADD visitors:2026-05-26 "user:17"

# Unique visitors across all 3 days
PFCOUNT visitors:2026-05-24 visitors:2026-05-25 visitors:2026-05-26

# Store a merged HLL
PFMERGE visitors:week visitors:2026-05-24 visitors:2026-05-25 visitors:2026-05-26
PFCOUNT visitors:week
HyperLogLog
stream:
1B events
input
→ hash + bucket →
12 KB sketch
(2^14 registers)
constant size
→ PFCOUNT →
~10M uniques
±0.81%
estimate

When to use which

  • Need exact unique count and have known dense IDs (user IDs 1..N)? Use a bitmap.
  • Need approximate unique count over arbitrary strings (IPs, sessions, queries)? Use HyperLogLog.
  • Need exact uniques over sparse data and you have plenty of RAM? Use a set.
  • Need set algebra (union/intersection) with exact precision? Sets or bitmaps (via BITOP).

Both bitmaps and HLLs are the kind of tools that quietly save thousands in infrastructure cost once a product hits real scale.