Hash Table Study Guide

About This Project

This project is a C++ implementation of hash tables, including both open addressing and closed addressing (chaining) strategies. It is built with a focus on correctness, clarity, and testability. This was made for service learning project to educate lower divison students.

Included are:

A fully templated, reusable HashTable base class.
Specialized implementations for OpenAddressingHash and ClosedAddressingHash.
Comprehensive unit tests written using Google Test (GTest) framework.

The unit tests verify:

Correctness of insert, search, and removal operations
Handling of collisions
Proper resizing behavior
Prevention of duplicate insertions
Rehashing functionality

What Are Hash Tables?

Hash tables are data structures used to perform insert, search, and delete operations in constant average time — O(1). They are also known as:

Associative arrays
Hash maps
Dictionaries
Unordered maps

But before we implement them, we need to understand a few core concepts.

What Are Hashes?

Imagine you're trying to determine if something is unique. A computer can't "look" at something visually like we can — instead, it uses hashing, which maps input values to a unique numerical representation.

There are many ways to hash data. In this guide, we focus on modulo hashing, and later in C++ we'll use std::hash combined with modulo.

Modulo Hashing

The modulo operator (%) gives us the remainder after division.

Example:

If we have:

X = 4172025 (some student ID)
Table size = 25

Then:

Hash(X) = X % 25 = 4172025 % 25 = 1

Practice Questions:

Table size = 26, value = 1023 → Index?
Table size = 67, value = 643 → Index?
Table size = 21, value = 276 → Index?

What's the Fastest Data Structure?

Hash tables can outperform:

Linked lists: O(N) search
Balanced BSTs: O(log N) search, insert, delete

With hashing, we can reach O(1) performance in average cases for insert, search, and delete. However, we must handle collisions carefully.

Collisions

A collision happens when two different inputs hash to the same index.

Example:

If our table size is 25:

Hash(4172025) = 1
Hash(1234526) = 1

This is a collision.

There are two primary ways to handle collisions:

Open Addressing
Closed Addressing (Chaining)

Open Addressing

In open addressing, all elements are stored directly in the array. On collision, we probe to find the next available slot.

Formula (Linear Probing):

Index = (Hash(key) + i) % Table Size

Where:

i starts at 0 and increments with each collision

We use tombstones to handle deletions, marking removed slots as REMOVED.

Open Addressing – C++ Implementation

We use an abstract base class HashTable, and then implement OpenAddressingHash that handles:

OPEN, FILLED, REMOVED states
std::hash as our hash function
Collision resolution with linear probing

Open Addressing: Search Pseudocode

Function Search(data):
    Compute hash index = Hash(data) % table size
    For i from 0 to table size:
        index = (hash + i) % table size
        If slot is OPEN:
            Break (data not in table)
        If slot is FILLED and value == data:
            Return value
    Return null

Open Addressing: Removal Pseudocode

Function Removal(data):
    Compute hash index = Hash(data) % table size
    For i from 0 to table size:
        index = (hash + i) % table size
        If slot is OPEN:
            Break (data not in table)
        If slot is FILLED and value == data:
            Mark as REMOVED
            Decrement count
            Return value
    Return null

Open Addressing: Resize Pseudocode

Function resize():
    new_capacity = table.size * 2
    Create new_table of size new_capacity, filled with (default, OPEN)
    For each (value, status) in old table:
        If status is FILLED:
            Compute new index and insert into new_table with linear probing
    Replace table with new_table

Open Addressing: Insert Pseudocode

Function Insert(data):
    If load factor > LOAD_SIZE:
        resize()
    Compute hash index = Hash(data) % table size
    For i from 0 to table size:
        index = (hash + i) % table size
        If FILLED and value == data:
            Return false (duplicate)
        If not FILLED:
            Insert (data, FILLED)
            Increment count
            Return true
    Return false

Closed Addressing (Chaining)

This is what most people think of when they hear "hash table with buckets."

Each slot (bucket) in the table holds a data structure (commonly a linked list).
This structure can be anything: list, set, even a BST.

Tradeoffs:

Structure in Bucket	Worst Case	Notes
Linked List	O(K)	Simple, may degrade to linear
Balanced Set (e.g. std::set)	O(log K)	Better search but slightly slower inserts

Where K = number of items in a single bucket

Closed Addressing – C++ Implementation

We again extend HashTable<T>, using:

std::vector<std::list<T>> as the table
std::find to simplify operations

Closed Addressing: Search Pseudocode

Function Search(data):
    hash index = Hash(data) % table size
    bucket = table[hash index]
    If data in bucket:
        Return data
    Else:
        Return null

Closed Addressing: Removal Pseudocode

Function Removal(data):
    hash index = Hash(data) % table size
    bucket = table[hash index]
    If data in bucket:
        Store value
        Remove from bucket
        Decrement count
        Return value
    Else:
        Return null

Closed Addressing: Resize Pseudocode

Function resize():
    newCap = table size * 2
    Create new table of size newCap (vector of lists)
    For each bucket in old table:
        For each item in bucket:
            Compute new index
            Insert into new table
    Replace old table

Closed Addressing: Insert Pseudocode

Function Insert(data):
    If (count / table size) > LOAD_SIZE:
        resize()
    hash index = Hash(data) % table size
    bucket = table[hash index]
    If data not in bucket:
        Add to bucket
        Increment count
        Return true
    Else:
        Return false

Final Thoughts

Open Addressing gives tighter memory usage but can suffer from clustering and performance degradation in edge cases.
Closed Addressing offers consistent performance, with more flexibility in how buckets store data.
Both methods target O(1) average-case time complexity, but worst cases differ depending on the implementation.
This is one of the first composite data structures you’ll encounter — combining hashing with internal storage like lists or trees.

Name		Name	Last commit message	Last commit date
Latest commit History 5 Commits
.vscode		.vscode
include		include
src		src
test		test
.gitignore		.gitignore
CMakeLists.txt		CMakeLists.txt
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hash Table Study Guide

About This Project

What Are Hash Tables?

What Are Hashes?

Modulo Hashing

Example:

Practice Questions:

What's the Fastest Data Structure?

Collisions

Example:

Open Addressing

Formula (Linear Probing):

Open Addressing – C++ Implementation

Open Addressing: Search Pseudocode

Open Addressing: Removal Pseudocode

Open Addressing: Resize Pseudocode

Open Addressing: Insert Pseudocode

Closed Addressing (Chaining)

Tradeoffs:

Closed Addressing – C++ Implementation

Closed Addressing: Search Pseudocode

Closed Addressing: Removal Pseudocode

Closed Addressing: Resize Pseudocode

Closed Addressing: Insert Pseudocode

Final Thoughts

About

Releases

Packages

Languages

XOR-SABER/HashTableStudyGuide

Folders and files

Latest commit

History

Repository files navigation

Hash Table Study Guide

About This Project

What Are Hash Tables?

What Are Hashes?

Modulo Hashing

Example:

Practice Questions:

What's the Fastest Data Structure?

Collisions

Example:

Open Addressing

Formula (Linear Probing):

Open Addressing – C++ Implementation

Open Addressing: Search Pseudocode

Open Addressing: Removal Pseudocode

Open Addressing: Resize Pseudocode

Open Addressing: Insert Pseudocode

Closed Addressing (Chaining)

Tradeoffs:

Closed Addressing – C++ Implementation

Closed Addressing: Search Pseudocode

Closed Addressing: Removal Pseudocode

Closed Addressing: Resize Pseudocode

Closed Addressing: Insert Pseudocode

Final Thoughts

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages