Hash Maps & Hash Sets - Data Structures & Algorithms

A hash table is a data structure that maps keys to values using a hash function. Think of it like a magical dictionary — we give it a word (key), and it instantly tells us the definition (value). No searching, no scanning. Just boom, here’s the answer.

This “instant lookup” is what makes hash tables so powerful. Insert, lookup, and delete are all O(1) on average. That’s why they show up in almost every coding interview.

How Hash Tables Work

Here’s the basic idea:

We have an array of buckets (slots).
When we insert a key-value pair, a hash function converts the key into a number.
That number (modulo the array size) tells us which bucket to put it in.
To look up a key, we hash it again and go straight to that bucket.

Hash Table — Insert "apple" → 5

"apple" → hash("apple") = 2

[0] empty

[1] empty

[2] "apple" → 5

[3] empty

[4] empty

What Happens When Two Keys Hash to the Same Bucket? (Collisions)

This is called a collision. Two different keys produce the same hash value. It’s inevitable — we’re mapping infinite possible keys to a finite number of buckets.

Collision at Bucket [2] — Chaining

[0] empty

[1] "grape" → 8

[2] "apple" → 5 → "mango" → 3

[3] empty

Both "apple" and "mango" hash to bucket [2] — stored as a linked list

There are two main ways to handle collisions:

Chaining — Each bucket holds a linked list. If two keys hash to the same bucket, we just add to the list. To find a key, we hash it to the bucket and then walk the list. This is the most common approach.

Open addressing — If a bucket is taken, we probe (check) the next bucket, and the next, until we find an empty one. Linear probing checks the very next slot. Quadratic probing jumps by increasing amounts.

In practice, we rarely implement this ourselves. The language’s built-in hash map handles collisions for us.

Hash Map vs Hash Set

They’re siblings, not twins.

	Hash Map	Hash Set
Stores	Key-value pairs	Keys only
Use for	Looking up values by key	Checking if something exists
Example	`{name: "Alice", age: 30}`	`{"Alice", "Bob", "Charlie"}`
JS	`Map` / `{}`	`Set`
Python	`dict`	`set`
Java	`HashMap`	`HashSet`

In simple language: a hash map is a dictionary (word → definition). A hash set is a guest list (just names, no extra info).

Time Complexity

Operation	Average	Worst Case
Insert	O(1)	O(n)
Lookup	O(1)	O(n)
Delete	O(1)	O(n)

The worst case happens when everything hashes to the same bucket (all collisions). In practice, with a good hash function and proper resizing, we always get O(1).

Two Sum with Hash Map

This is THE classic hash map problem. Given an array and a target sum, find two numbers that add up to it. Return their indices.

The brute force approach checks every pair — O(n²). With a hash map, we do it in one pass.

The idea: for each number, we check if target - number already exists in our map. If yes, we found the pair. If no, we store the current number and move on.

function twoSum(nums, target) {
  const map = new Map(); // value → index

  for (let i = 0; i < nums.length; i++) {
    const complement = target - nums[i];
    if (map.has(complement)) {
      return [map.get(complement), i]; // found it!
    }
    map.set(nums[i], i); // store for later
  }
  return []; // no solution
}

console.log(twoSum([2, 7, 11, 15], 9)); // [0, 1]

def two_sum(nums, target):
    seen = {}  # value → index

    for i, num in enumerate(nums):
        complement = target - num
        if complement in seen:
            return [seen[complement], i]  # found it!
        seen[num] = i  # store for later
    return []  # no solution

print(two_sum([2, 7, 11, 15], 9))  # [0, 1]

public static int[] twoSum(int[] nums, int target) {
    Map<Integer, Integer> map = new HashMap<>(); // value → index

    for (int i = 0; i < nums.length; i++) {
        int complement = target - nums[i];
        if (map.containsKey(complement)) {
            return new int[]{map.get(complement), i}; // found it!
        }
        map.put(nums[i], i); // store for later
    }
    return new int[]{}; // no solution
}
// [2, 7, 11, 15] with target 9 → [0, 1]

Time: O(n), Space: O(n) — one pass through the array, hash map stores up to n elements.

Frequency Counter

Counting how often things appear is one of the most common hash map patterns. We can use it to find the most frequent element, check for duplicates, group items, and more.

function frequencyCount(arr) {
  const freq = new Map();
  for (const item of arr) {
    freq.set(item, (freq.get(item) || 0) + 1);
  }
  return freq;
}

const counts = frequencyCount([1, 2, 2, 3, 3, 3]);
console.log(counts); // Map { 1 → 1, 2 → 2, 3 → 3 }

// Find the most frequent element
let maxCount = 0, mostFrequent = null;
for (const [key, count] of counts) {
  if (count > maxCount) { maxCount = count; mostFrequent = key; }
}
console.log(mostFrequent); // 3

def frequency_count(arr):
    freq = {}
    for item in arr:
        freq[item] = freq.get(item, 0) + 1
    return freq

counts = frequency_count([1, 2, 2, 3, 3, 3])
print(counts)  # {1: 1, 2: 2, 3: 3}

# Find the most frequent element
most_frequent = max(counts, key=counts.get)
print(most_frequent)  # 3

public static Map<Integer, Integer> frequencyCount(int[] arr) {
    Map<Integer, Integer> freq = new HashMap<>();
    for (int item : arr) {
        freq.merge(item, 1, Integer::sum);
    }
    return freq;
}
// [1, 2, 2, 3, 3, 3] → {1=1, 2=2, 3=3}

// Find the most frequent element
int mostFrequent = Collections.max(freq.entrySet(),
    Map.Entry.comparingByValue()).getKey(); // 3

Find Duplicates with Hash Set

Need to check if something exists? That’s a hash set’s entire job. O(1) lookups, no values to worry about.

function containsDuplicate(nums) {
  const seen = new Set();
  for (const num of nums) {
    if (seen.has(num)) return true; // already seen this one
    seen.add(num);
  }
  return false;
}

console.log(containsDuplicate([1, 2, 3, 1]));    // true
console.log(containsDuplicate([1, 2, 3, 4]));    // false

def contains_duplicate(nums):
    seen = set()
    for num in nums:
        if num in seen:
            return True  # already seen this one
        seen.add(num)
    return False

print(contains_duplicate([1, 2, 3, 1]))  # True
print(contains_duplicate([1, 2, 3, 4]))  # False

public static boolean containsDuplicate(int[] nums) {
    Set<Integer> seen = new HashSet<>();
    for (int num : nums) {
        if (!seen.add(num)) return true; // add returns false if already exists
    }
    return false;
}
// [1, 2, 3, 1] → true, [1, 2, 3, 4] → false

Time: O(n), Space: O(n) — much better than the O(n²) brute force of checking every pair.

When to Use Hash Map vs Hash Set

Reach for a hash map when we need to:

Associate values with keys (two sum: value → index)
Count frequencies (character counting, word counting)
Cache computed results (memoization)

Reach for a hash set when we need to:

Check existence quickly (have we seen this before?)
Remove duplicates from a collection
Find intersection or union of two collections

The Big Picture

Hash tables are the Swiss Army knife of data structures. When we’re stuck on a brute-force O(n²) solution that checks every pair, the first question we should ask ourselves is: “Can a hash map bring this down to O(n)?” The answer is surprisingly often yes.

In simple language, a hash map gives us instant lookups by trading space for speed. We store extra data (the map itself) so we can find things in O(1) instead of scanning through the whole collection. It’s the most useful data structure in coding interviews, period.