Location>code7788 >text

The most commonly used collections - arraylist in detail

Popularity:588 ℃/2024-08-30 22:36:28

Introduction to ArrayList

ArrayListRealizedListinterface, which is a sequential container, i.e., the elements hold the data in the same order as they were put in, null elements are allowed to be put in, and the bottom layer is passed through theArray Implementation. Except for this class, which does not implement synchronization, the rest of the followers of theVectorroughly the same. EachArrayListThere is a capacity, which represents the actual size of the underlying array, and the number of elements stored in the container cannot be more than the current capacity. When adding elements to the container, if the capacity is insufficient, the container will automatically increase the size of the underlying array.

ArrayList in JDK1.8 before and after the implementation of the difference:

  • JDK1.7: Like Hungry Man style, directly create an array with an initial capacity of 10
  • JDK1.8: like lazy, create an array of length 0 at the beginning, and then create an array with an initial capacity of 10 when the first element of add is added

The size(), isEmpty(), get(), and set() methods all complete in constant time, the time overhead of the add() method is related to the insertion position, and the time overhead of the addAll() method is proportional to the number of elements added. The remaining methods are mostly linear time.

For the pursuit of efficiency, ArrayList does not implement synchronization (synchronized), if you need to access multiple threads concurrently, the user can manually synchronize, you can also use the Vector instead!

Introduction to the underlying principles

underlying data structure

// Collection default capacity 10;
private static final int DEFAULT_CAPACITY = 10;

// empty array
private static final Object[] EMPTY_ELEMENTDATA = {};

// Empty array with default capacity
private static final Object[] DEFAULTCAPACITY_EMPTY_ELEMENTDATA = {};

// Array of the actual data stored in the collection
transient Object[] elementData; // non-private to simplify nested class access

 // The number of elements in the collection, not the length of the array.
private int size; // private object[] elementData; // private object[] elementData; // private class access

constructor method

public ArrayList() {
    // Assign the default empty array in the properties to the variable that stores the data.
     = DEFAULTCAPACITY_EMPTY_ELEMENTDATA.

    // Equivalent to = {}
}

// Parametric construction
public ArrayList(int initialCapacity) {
    // Given an initial capacity, create an array of that size.
   if (initialCapacity > 0) {
         = new Object[initialCapacity];
   } else if (initialCapacity == 0) {
        // if 0 is passed assign {} to elementData
         = EMPTY_ELEMENTDATA.
        // equals = {}
   } else {
        //If a negative number is passed, an exception is thrown
        //: Illegal Capacity: -20
        throw new IllegalArgumentException("Illegal Capacity: " + initialCapacity);
   }
}

automatic capacity expansion

Whenever an element is added to an array, it is necessary to check whether the number of elements added will exceed the length of the current array, and if it does, the array will be expanded to meet the needs of the added data.

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = ;
    
    //Dynamic expansion,Expansion to the original1.5times (multiplier),Shift one place to the right, i.e. half of the original
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
        
    //Determine if the new capacity will exceed the maximum limit
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:    
    elementData = (elementData, newCapacity);//Copy operations on arrays
}

Expansion methodology flow:

  1. First get the length of the array

  2. Expand the new capacity of the array to 1.5 times the capacity of the original array rounded to the nearest whole number.

  3. Compare the new capacity with the current required minimum capacity, (the minimum capacity is obtained in the add method, minCapacity=size+1, i.e., the number of elements in the original array plus 1), and newCapacity=*1.5, which must be 1.5 times larger than +1 in general. But here we have to consider the situation when the array is empty. Array is empty is divided into two cases: ① specified the array capacity of 0 ② did not explicitly specify the size of the array.

    • When the array is empty when the insertion operation, because the number of elements size for 0, the array capacity is also 0, then the expansion operation will be carried out for the empty array, expansion 1.5 times your capacity is still 0, then this time will be less than the minimum capacity I need (that is, 1), at this time it will make the newCapacity = minCapacity.

    • For ①, the minCapacity passed to the grow method = 1, so its expanded capacity is 1

    • For ②, in the ensureCapacityInternal method, make minCapacity = DEFAULT_CAPACITY (10), so the length of the expanded array is DEFAULT_CAPACITY, which is 10.

      • The reason is that in the parametric constructor = EMPTY_ELEMENTDATA; (in the nonparametric constructor = DEFAULTCAPACITY_EMPTY_ELEMENTDATA;), then in the ensureCapacityInternal method will be judged, so for ①, the minCapacity = 1; and for ②, minCapacity = (DEFAULT_CAPACITY, minCapacity), i.e., minCapacity = 10, and for ②, minCapacity = (DEFAULT_CAPACITY, minCapacity). minCapacity = 1; and for ②, minCapacity = (DEFAULT_CAPACITY, minCapacity), that is, minCapacity = 10
if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
    //compare the sizes, at this point minCapacity = 10
    minCapacity = (DEFAULT_CAPACITY, minCapacity);
}
  1. Finally, determine whether the new capacity size is greater than the maximum value of the default array (Integer.MAX_VALUE-8), then give it the maximum value of the integer type
  2. After expansion, the () method is called to make a copy of the array.

In fact, copying an array requires creating a new array and copying the original array, which can cause resource consumption. Therefore, before adding a large number of elements, it is recommended to increase the capacity of the ArrayList instance by using the ensureCapacity operation, and then copy a small amount of the array data before adding the elements.

add(), addAll()

The add operation may result in a lack of capacity, so before adding an element, it is necessary to check the remaining space and automatically expand it if necessary. This is done through the grow() method.

Assuming an empty-parameter construct, the first time an element is added add(1)

public boolean add(E e) {
    ensureCapacityInternal(size + 1); // Increments modCount!
    ensureCapacityInternal(size + 1); // Increments modCount!!!!
    // Add the element to be added to the next position in the array that has data in it
    elementData[size++] = e; }; return true; // Add the element to be added to the next position in the array that has data.
    return true; }
}

private void ensureCapacityInternal(int minCapacity) {//First addition: minCapacity = 1
    // Case of parametric construction: new Object[10] ! = {}, will not execute the statement inside the if. Even if the parametric construct gives 0, it will not be executed because elementData = EMPTY_ELEMENTDATA, which is not equal to DEFAULTCAPACITY_EMPTY_ELEMENTDATA at this time
    // In the case of a parameterless construct: {} = {} will execute the statement
    if (elementData == DEFAULTCAPACITY_EMPTY_ELEMENTDATA) {
    // compare sizes, in this case minCapacity = 10
        minCapacity = (DEFAULT_CAPACITY, minCapacity);
    }
    // Explicit capacity of the array
    ensureExplicitCapacity(minCapacity);
}

private void ensureExplicitCapacity(int minCapacity) {
    modCount++;// Record the number of times the collection has been operated on.
    // overflow-conscious code
    if (minCapacity - > 0)
        grow(minCapacity);// grow operation
}

The addAll() method is able to add multiple elements at once, and there are two versions of it depending on the location.

  • The addedAll(Collection<? extends E> c) method.

  • The addAll(int index, Collection<? extends E> c) method

Similar to the add() method, space checking is also required before insertion, and if needed, it is automatically expanded; if inserting from a specified position, there is also the case of moving elements. The time complexity of addAll() is not only related to the number of elements inserted, but also to the location of insertion.

set()

Since the underlying layer is an array, the set() method is a direct assignment to the specified position of the array.

public E set(int index, E element) {
    rangeCheck(index);// subscript out-of-bounds checking
    E oldValue = elementData(index);
    elementData[index] = element;//assign the value to the specified location, copying just the reference
    return oldValue;
}

get()

Because the underlying is an array, get() method is also directly from the array index to get the value, the only thing to note is that because the underlying array is Object [], you need to type conversion after getting the element.

public E get(int index) {
    return (E) elementData[index];//note the type conversion.
    return (E) elementData[index];// note the type conversion
}

remove method

The remove() method also has two

  • remove(int index) removes the element at the specified position.

  • remove(Object o) removes the first element that satisfies (elementData[index]).

The delete operation is the inverse of the add() operation and will require the element after the delete point to be moved forward one position

public E remove(int index) {
    rangeCheck(index);

    rangeCheck(index); modCount++;
    E oldValue = elementData(index);

    E oldValue = elementData(index); int numMoved = size - index - 1; if (numMoved > 0)
    if (numMoved > 0)
    // Determine whether the index to be deleted is the last one, if it is not the last one, then we need to copy the array
        (elementData, index+1, elementData, index.
                             numMoved); //then place the last element in the array.
        // Then set the last element to null to allow GC to work
    elementData[--size] = null; // clear to let GC do its work

    return oldValue; }
}

trimToSize()

Function to adjust the capacity of the underlying array to the size of the actual elements saved in the current list

 /**
     * Trims the capacity of this <tt>ArrayList</tt> instance to be the
     * list's current size.  An application can use this operation to minimize
     * the storage of an <tt>ArrayList</tt> instance.
     */
    public void trimToSize() {
        modCount++;
        if (size < ) {
            elementData = (size == 0)
              ? EMPTY_ELEMENTDATA
              : (elementData, size);
        }
    }

indexOf(), lastIndexOf()

Gets the index of the first occurrence of the element.

public int indexOf(Object o) {
        if (o == null) {
            for (int i = 0; i < size; i++)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = 0; i < size; i++)
                if ((elementData[i]))
                    return i;
        }
        return -1;
    }

Gets the index of the last occurrence of the element.

public int lastIndexOf(Object o) {
        if (o == null) {
            for (int i = size-1; i >= 0; i--)
                if (elementData[i]==null)
                    return i;
        } else {
            for (int i = size-1; i >= 0; i--)
                if ((elementData[i]))
                    return i;
        }
        return -1;
    }

Common pitfalls of deleting (adding) while traversing

Iterate through the list in a for loop

After deleting an element, the size of the list changes, and the index also changes, so this can cause some elements to be missed when traversing. For example, when the first element is deleted, and you continue to access the second element according to the index, the elements behind the deleted element are all moved one place forward, so the actual access is the third element. Therefore, this method can be used when deleting a specific element, but is not suitable for cyclic deletion of multiple elements.

for(int i=0;i<();i++){
    if((i).equals("del"))
        (i);
}

Solution:

// start traversing from the last element of the list

// start traversing from the last element of the list
for(int i=()-1;i>+0;i--){
    if((i).equals("del"))
        (i);
}

Enhanced for loops

Continuing the loop after deleting an element throws an exception because the element was concurrently modified while in use

for(String x:list){
    if(("del"))
        (x);
}

Solution: But only one "del" element can be deleted.

//Solve: Use break to jump out immediately after deletion, then no error will be triggered.
for(String x:list){
    if (("del")) {
         (x); break; if (("del")) {
         break.
    }
}

iterator traversal

This way you can loop and remove normally. However, it should be noted that using the remove method of the iterator, if you use the remove method of the list will also report the ConcurrentModificationException error mentioned above.

Iterator<String> it = ();
while(()){
    String x = ();
    if(("del")){
        ();
    }
}

FailFast mechanism

The ConcurrentModificationException exception mentioned above has this mechanism in place, by logging the modCount parameter. In the face of concurrent modifications, the iterator quickly fails completely, rather than risking arbitrarily indeterminate behavior at some indeterminate time in the future.

The fail-fast mechanism is an error mechanism in java collection (Collection). When multiple threads operate on the contents of the same collection, a fail-fast event may occur. For example: when a thread A through the iterator to traverse the process of a collection, if the contents of the collection was changed by other threads; then thread A traverses the collection, that is, expectedModCount ! = modCount, a ConcurrentModificationException is thrown, resulting in a fail-fast event.

if (modCount != expectedModCount)
    throw new ConcurrentModificationException();

The fail-fast mechanism does not guarantee that an exception will be thrown in the event of an unsynchronized modification, it just does its best to do so, so it is generally only used to detect bugs.

Solution for fail-fast:

  1. In the traversal process of all involved in changing modCount worthy of place all plus synchronized or directly use, so that can be solved (in fact, the Vector structure is implemented in this way). However, it is not recommended, because the synchronized lock caused by adding and deleting may block the traversal operation.
List<Integer> arrsyn = (arr);
  1. Use CopyOnWriteArrayList instead of ArrayList, which is the recommended solution, as CopyOnWriteArrayList is thread-safe for concurrency.

ArrayList and Vector and CopyOnWriteArrayList and LinkedList

Inheritance relationship structure diagram:

Difference between ArrayList and Vector and CopyOnWriteArrayList:

  • ArrayList is non-thread-safe, if you need to take thread-safety into account, then you can use Vector and CopyOnWriteArrayList;

  • The difference between Vector and CopyOnWriteArrayList is: Vector add, delete, change and check methods are added synchronized, to ensure synchronization, but each method execution time to obtain a lock, the performance will be greatly reduced, and CopyOnWriteArrayList just add locks on the add, delete, change, but do not add locks on the read, in the read performance is better than Vector, CopyOnWriteArrayList supports read more than write concurrency. In terms of read performance is better than Vector, CopyOnWriteArrayList support read more write less concurrent situations.

Difference between ArrayList and LinkedList:

  • ArrayList is based on a dynamic array implementation;

  • LinkedList is based on a linked list implementation. For random index access to the get and set methods, ArrayList is faster than LinkedList because ArrayList finds the element directly by the array subscript; LinkedList has to move the pointer to traverse each element until it is found.

  • For add(int index, E element), remove(int index) operation: LinkedList and ArrayList time complexity is the same, are O(n); although the time complexity is the same, but the actual execution time is not the same, as shown in the following code:

    List<Integer> a = ();
    List<Integer> b = ();
    
    Random r = new Random();
    (0);
    (0);
    
    long startTime = ();
    for (int i = 0; i <= 20000; i++) {
        int p = (());
        (p, 0);
    }
    (() - startTime);// 6
    
    startTime = ();
    for (int i = 0; i <= 20000; i++) {
        int p = (());
        (p, 0);
    }
    (() - startTime);// 205
    

    Although ArrayList needs to move data (forward, backward) when adding or deleting data at the index location, it is possible to manipulate the whole piece of memory for the data in a block in contiguous memory. While LinkedList needs to be checked one by one first to find the specific index location of the element, so the efficiency of the array is higher than that of the linked list in terms of addressing.

  • For add new elements: theoretically the speed of LinkedList (O(1)) is better than ArrayList (O(n)), because ArrayList in the addition and deletion of elements, may expand and copy the array; LinkedList only need to modify the pointer can be. However, in the actual test, in the case of a small amount of data, the two execution time is almost the same; increase the amount of data, you can see the difference, as shown in the following code:

    List<Integer> a = ();
    List<Integer> b = ();
    
    (0);
    (0);
    
    long startTime = ();
    for (int i = 0; i <= 2000000; i++) {
        int p = (());
        (0);
    }
    (() - startTime);// 34
    
    startTime = ();
    for (int i = 0; i <= 2000000; i++) {
        int p = (());
        (0);
    }
    (() - startTime);// 271
    

    This is becauseLinkedList has some performance issues

About the Author.

From the first-line programmer Seven's exploration and practice, continuous learning iteration in the~

This article is included in my personal blog:https://

Public number: seven97, welcome to follow~