Skip to main content

ConcurrentHashMap in Java

ConcurrentHashMap: Deep Dive & Backend Working

If you want to build high-performance, multithreaded applications in Java, ConcurrentHashMap (found in java.util.concurrent) is undeniably one of the most critical classes you must master. It is an absolute masterpiece of backend engineering designed for one singular purpose: High-speed thread-safe operations.

Unlike Hashtable (which locks the entire map) or a fully synchronized HashMap, ConcurrentHashMap allows multiple threads to read and write simultaneously without locking the entire data structure.

ConcurrentHashMap<Integer, String> map = new ConcurrentHashMap<>();

map.put(1, "Java");
map.put(2, "Spring");

System.out.println(map.get(1)); // Output: Java

1. Internal Architecture & The volatile Keyword

In Java 8 and beyond, the internal structure closely mirrors a standard HashMap (an array of buckets containing nodes). But the Node implementation holds a massive concurrent secret:

static class Node<K,V> { final int hash; final K key; volatile V val; volatile Node<K,V> next; }

Notice the volatile keyword. This is heavily responsible for thread visibility. In a multithreaded environment, CPUs cache variables. If Thread-1 updates a value, Thread-2 might still see the old, cached value. volatile forces the CPU to write and read directly from Main Memory. If Thread-1 updates val = "Updated", Thread-2 sees it instantaneously—preventing disastrous caching bugs.


2. The Architecture Evolution: Java 7 vs Java 8

To truly appreciate the current architecture, you must understand how it evolved.

Before Java 8 (Segmented Locking)

Java 7 used an array of Segments. The map was divided into 16 default segments, and each segment had its own lock. This meant a maximum of 16 threads could write simultaneously. It was an improvement over Hashtable, but the Segment architecture wasted memory and still carried locking overhead.

Java 8+ (CAS + Bucket-Level Locking)

Java 8 completely ripped out the Segments. It now relies heavily on atomic CPU instructions and locks only at the exact bucket (node) level.

[ BUCKET LEVEL LOCKING VISUALIZED ] Bucket 1 → Locked by Thread A (Writing) Bucket 2 → Unlocked (Available) Bucket 3 → Unlocked (Available) Bucket 4 → Locked by Thread B (Writing) Bucket 5 → Thread C (Reading completely unblocked!)

3. Backend Working of put(): The Magic of CAS

When you execute map.put(10, "Java");, a sophisticated multi-step algorithm executes to guarantee speed and safety.

  • STEP 1 — Hash Calculation: Java calculates spread(key.hashCode()) to ensure bits are evenly distributed.
  • STEP 2 — Bucket Index: It calculates the bucket index: (n - 1) & hash. Assume this resolves to bucket 4.
  • STEP 3 — The CAS Operation: If the bucket is completely empty, Java does not use a lock. Instead, it attempts a lock-free insertion using CAS (Compare-And-Swap).

What is CAS (Compare-And-Swap)?

CAS is a lightning-fast, atomic CPU-level instruction. The logic is beautifully simple:
IF (current bucket state == empty) THEN (insert new node) ELSE (fail).
Because it runs at the hardware level, no threads are put to sleep, and no locks are created. If it succeeds, the insertion is extremely fast.

What if CAS Fails? (Collisions)

If two threads try to insert into the exact same bucket simultaneously, CAS will succeed for the first and fail for the second. When CAS fails, Java falls back to a synchronized(bucket) block. It strictly locks only that specific bucket chain. All other millions of buckets remain 100% accessible to other threads.


4. Backend Working of get(): Non-Blocking Reads

The absolute best part of ConcurrentHashMap is its read performance. get() operations are almost entirely NON-BLOCKING.

When you fetch a value, Java calculates the hash, jumps to the bucket, and traverses the chain. Because the val and next pointers are marked volatile, the reading thread is guaranteed to see the most up-to-date data. It doesn't need to acquire any locks, allowing thousands of threads to read simultaneously without waiting.


5. Advanced Mechanics: Treeification & Parallel Resizing

Treeification

Just like Java 8's HashMap, if multiple keys collide and form a massive linked list inside a single bucket (size $\ge 8$), Java upgrades the linked list into a Red-Black Tree. This drops the worst-case search complexity from $O(n)$ down to a highly optimized $O(\log n)$.

Parallel Resizing & ForwardingNode

When a standard HashMap gets full, it resizes—a brutal process where one thread painstakingly moves every node to a new array.

ConcurrentHashMap turns resizing into a team effort. If a thread attempts to put() data and notices the map is currently resizing, it pauses its insertion and actually helps move buckets to the new array! Once a bucket is successfully moved, Java leaves behind a special marker called a ForwardingNode, which simply tells other threads: "This data has migrated. Go look in the new table."


6. Why are Nulls Banned?

Unlike HashMap, ConcurrentHashMap absolutely DOES NOT allow null keys or null values.

map.put(null, "Value"); // Throws NullPointerException map.put(1, null); // Throws NullPointerException

The Concurrency Ambiguity Problem: In a single-threaded environment, if map.get(key) returns null, you can simply call map.containsKey(key) to check if the key doesn't exist, or if the stored value is literally null.
In a multithreaded environment, between your get() call and your containsKey() call, another thread could have modified the map! Because you can never be mathematically certain what a null return actually implies, Java completely forbids it to prevent fatal logic bugs.


7. The Ultimate Comparison

Feature HashMap Hashtable (Legacy) ConcurrentHashMap
Thread-safe No Yes Yes
Locking Mechanism None Locks the WHOLE map Bucket-Level + CAS
Performance Fast (Single thread) Extremely Slow Extremely Fast
Null Keys / Values Yes / Yes No / No No / No
Reads Blocking No Yes No (Non-blocking)

The Real-Life Analogy

  • Hashtable is a massive bank with only one cashier. Everyone waits in a single, agonizing line, regardless of what they want to do.
  • ConcurrentHashMap is a modern bank with thousands of automated tellers. Dozens of people are depositing, withdrawing, and checking balances simultaneously without ever bumping into one another.

Where is it used?

You will find ConcurrentHashMap at the very foundation of modern architecture. It heavily powers the Spring Framework (managing bean registries), Caching Systems, High-performance Web Servers, Apache Kafka, and almost every major distributed system written in Java over the last decade.

Comments

Popular posts from this blog

How I Got Selected in MNC

Virtusa Sometimes success does not come from having the best coding skills or the perfect roadmap. Sometimes it comes from simply refusing to quit. This is the honest story of how I transitioned from a confused, rejected fresher to getting selected as an Associate Engineer at Virtusa. The Beginning: Confused About My Future After completing my graduation, I stared blankly at my career options. Like many freshers, I lacked a clear direction. Should I join a Java course? Should I prepare on my own? Should I just wait for campus placement opportunities? One day, I called my friend Chetan. He suggested I join Naresh i Technologies and start learning Java seriously. Still unsure of my path, I told him I needed time to think about it. A couple of days later, my phone buzzed with a WhatsApp message offering a job opportunity. They asked me to come for the next round of the recruitment process. Excitement completely took over. I packed my bags, traveled to th...

Spring Boot Introduction

Spring Boot Introduction: Architecture, Dependencies, and Embedded Servers Modern enterprise applications demand rapid development, frictionless deployment, and absolute minimal configuration. Before Spring Boot arrived, developers utilizing the Spring Framework wasted immense amounts of time configuring XML files, managing clashing dependencies, setting up clunky application servers, and stitching various Spring modules together manually. To eliminate these bottlenecks, Pivotal introduced Spring Boot . Built entirely on top of the traditional Spring Framework, Spring Boot is an "opinionated" framework. It aggressively simplifies application development by injecting auto-configuration, packaging starter dependencies, and embedding web servers directly into your application. This allows backend developers to focus entirely on building business logic rather than wrestling with infrastructure setup. What is Spring Boot? Spring Boot is a powerful extens...

Data Types in C

C Programming: Understanding Data Types Think of your kitchen. You store a large bag of flour in a big bin, a pinch of saffron in a tiny jar, and milk in a liquid measuring jug. You do not put liquids into paper bags, and you do not use a massive bucket for a single teaspoon of sugar. C programming works the exact same way. When you create a variable, you must tell the computer exactly what kind of "container" to build in its memory. We call these containers Data Types . They dictate what kind of data the container holds, how much space it takes up, and what operations you can perform on it. 1. Primitive Data Types C offers several built-in, "primitive" data types. Think of these as the fundamental storage containers. int (Integer): You use this to store whole numbers without decimals. Real-life example: Counting the number of people in a room or tracking a player's score in a video game. char (Character): You use this ...