Skip to main content

ArrayList in Java

ArrayList Deep Dive: Internal Backend Working

If you write Java, you use ArrayList. It is arguably the single most frequently instantiated class in the entire Java Collections Framework (java.util.ArrayList). But what actually happens when you create one, add items to it, or delete from it?

To master Java backend engineering, you must move beyond simply using an ArrayList to deeply understanding its internal Dynamic Resizable Array architecture.

[ The Hierarchy ] Iterable ↓ Collection ↓ List ↓ ArrayList

1. Why was ArrayList Created?

Standard Java arrays have a severe limitation: they are strictly fixed in size.

int[] arr = new int[5];

If you create an array of size 5, you can never add a 6th element. You would be forced to manually allocate a new, larger array, write a loop to copy the old elements over, and reassign references. ArrayList was engineered to handle this dynamic memory management entirely automatically.


2. Internal Data Structure: Size vs. Capacity

If you open the raw source code of the JDK and look inside the ArrayList class, you will find this crucial array:

transient Object[] elementData; private int size;

This array is the actual backend. But you must firmly separate two concepts in your mind:

  • Capacity: The physical length of the internal elementData array.
  • Size: The actual number of elements physically inserted by the user.

Visualizing Default Capacity

When you create an ArrayList and add your first element, Java secretly initializes the internal capacity to 10.

ArrayList<Integer> list = new ArrayList<>(); list.add(10); list.add(20);

Backend Memory Map: Size = 2 | Capacity = 10

10
20
_
_
_
_
_
_
_
_

3. The Magic Formula: Capacity Expansion

What happens when you insert your 11th element into an array that only holds 10? The ArrayList initiates its Growth Strategy. This is the most important backend concept to master.

Java determines the new capacity using a bitwise shift operation:

$$ \text{newCapacity} = \text{oldCapacity} + (\text{oldCapacity} \gg 1) $$

Because shifting right by 1 (>> 1) is equivalent to dividing by 2, this formula means the array grows by exactly 1.5x (50%) every time it gets full.

  • If capacity is 10 → new capacity is $10 + 5 = $ 15
  • If capacity is 15 → new capacity is $15 + 7 = $ 22
  • If capacity is 22 → new capacity is $22 + 11 = $ 33

Why exactly 1.5x?

It is a highly calculated balance. If the array grew by only 1 item at a time, performance would be destroyed by constant resizing. If it grew by 3x, memory would be massively wasted. 1.5x is the mathematical sweet spot between performance and memory conservation.

The Actual Resize Execution: Arrays.copyOf()

When the resize triggers, Java does not stretch the existing array (arrays cannot be stretched). Instead, it calls Arrays.copyOf():

  1. Creates a brand new array of the new 1.5x size.
  2. Copies all old elements into the new array.
  3. Points the elementData reference to the new array.
  4. Abandons the old array so the Garbage Collector can destroy it.

4. Why is get() Extremely Fast?

Because an ArrayList is backed by a primitive array, it uses Contiguous Memory Allocation.

When you call list.get(5000);, Java does not write a loop to count to 5000. It performs a single, instantaneous math calculation:

Address = BaseMemoryAddress + (Index × SizeOfElement)

This allows the CPU to jump directly to the exact memory block in hardware. Therefore, the time complexity for get() and set() is an unshakeable $O(1)$.


5. The Cost of Middle Insertions and Deletions

While reading is fast, inserting or deleting elements from the middle of an ArrayList is its greatest weakness.

list.add(2, 50); // Insert 50 at index 2

If you insert at index 2, every single element located at index 2 or higher must be physically pushed one slot to the right. Java accomplishes this massive shift internally using a native C++ method called System.arraycopy().

Before Insertion:

10
20
30
40
_

Step 1: Shift right to make space

10
20
_
30
40

Step 2: Insert 50

10
20
50
30
40

Because shifting $n$ elements takes $O(n)$ time, if you need to constantly insert/delete from the beginning of a collection, you should use a LinkedList instead.


6. Time Complexity Mastery

Operation Complexity Backend Reason
add() (at end) Amortized $O(1)$ Normally instant. Rarely, an $O(n)$ resize occurs, but the cost averages out to $O(1)$.
get(index) $O(1)$ Direct memory address calculation.
set(index, val) $O(1)$ Direct array reassignment.
add(index, val) $O(n)$ Requires shifting all subsequent elements to the right.
remove(index) $O(n)$ Requires shifting all subsequent elements to the left.

7. Advanced Backend Concepts

1. CPU Cache Friendliness

In real-world benchmarking, an ArrayList will almost always iterate faster than a LinkedList. Because an ArrayList stores data in contiguous (side-by-side) memory blocks, modern CPUs load the entire block into their ultra-fast L1/L2 caches. A LinkedList creates random node objects scattered all over the heap, resulting in constant Cache Misses.

2. The Fail-Fast Iterator

Inside the class is a variable called modCount. Every time you structurally modify the list (add/remove), modCount++ triggers. If you create an Iterator, it memorizes the modCount. If you try to modify the list directly while iterating, the Iterator detects the modCount mismatch and instantly throws a ConcurrentModificationException.

3. trimToSize() & ensureCapacity()

  • If you have a massive ArrayList of 1,000,000 capacity, but you remove 900,000 items, your capacity is still hogging RAM. Calling list.trimToSize() shrinks the underlying array to match the actual size.
  • If you know you are about to insert 10,000 items, calling list.ensureCapacity(10000) pre-allocates the memory, preventing the list from suffering through multiple expensive 1.5x resizing calculations.

ArrayList vs. Vector & Thread Safety

Never use ArrayList across multiple threads simultaneously—data corruption will occur because it is not thread-safe. If you need thread safety, you might be tempted to use the legacy Vector class. Do not use Vector. It locks the entire collection on every operation, destroying performance.

Instead, use the modern wrapper:

List<Integer> safeList = Collections.synchronizedList(new ArrayList<>());

Interview Checklist: Best & Worst Use Cases

Use ArrayList for: Instagram feeds, student records, caching, or any system requiring extremely fast sequential iteration, index-based reading, and mostly "append-only" insertions.

Avoid ArrayList for: Real-time undo systems, queues, or logic that requires constantly deleting elements from the absolute beginning of the collection.

Comments

Popular posts from this blog

How I Got Selected in MNC

Virtusa Sometimes success does not come from having the best coding skills or the perfect roadmap. Sometimes it comes from simply refusing to quit. This is the honest story of how I transitioned from a confused, rejected fresher to getting selected as an Associate Engineer at Virtusa. The Beginning: Confused About My Future After completing my graduation, I stared blankly at my career options. Like many freshers, I lacked a clear direction. Should I join a Java course? Should I prepare on my own? Should I just wait for campus placement opportunities? One day, I called my friend Chetan. He suggested I join Naresh i Technologies and start learning Java seriously. Still unsure of my path, I told him I needed time to think about it. A couple of days later, my phone buzzed with a WhatsApp message offering a job opportunity. They asked me to come for the next round of the recruitment process. Excitement completely took over. I packed my bags, traveled to th...

Spring Boot Introduction

Spring Boot Introduction: Architecture, Dependencies, and Embedded Servers Modern enterprise applications demand rapid development, frictionless deployment, and absolute minimal configuration. Before Spring Boot arrived, developers utilizing the Spring Framework wasted immense amounts of time configuring XML files, managing clashing dependencies, setting up clunky application servers, and stitching various Spring modules together manually. To eliminate these bottlenecks, Pivotal introduced Spring Boot . Built entirely on top of the traditional Spring Framework, Spring Boot is an "opinionated" framework. It aggressively simplifies application development by injecting auto-configuration, packaging starter dependencies, and embedding web servers directly into your application. This allows backend developers to focus entirely on building business logic rather than wrestling with infrastructure setup. What is Spring Boot? Spring Boot is a powerful extens...

Strings in C

C Programming: Working with Strings Unlike modern programming languages like Python or Java, C does not possess a dedicated "String" data type. Instead, C treats a string as a simple 1D array of characters. Real-life example: Think of a freight train. The train does not exist as one solid object; it consists of individual boxcars linked together. Similarly, a C string links individual characters side-by-side in memory. To let the computer know the train has ended, C attaches a special "caboose" called the Null Terminator ( \0 ). 1. Essential String Functions Handling strings manually requires complex loops. To save time, C provides a built-in library called <string.h> that contains powerful functions to manipulate text. strlen(): You use this to find the exact length of a string. The compiler counts the characters until it hits the \0 terminator. It does not count the terminator itself. strcpy(): You ...