ArrayList与LinkedList

前言

对不少人来说，List恐怕是最常用的集合了。java中，List作为接口，其下有两个常见的实现类：ArrayList和LinkedList。除此之外，还提供了AbstractList抽象类，用来帮助自定义List的实现类。本文先从AbstractList入手，再对比ArrayList和LinkedList，从而将这两个常用的集合做一个大体的介绍。

在具体的介绍之前，笔者希望先讨论一个概念：fail-fast。

fail-fast是java集合框架中，一个可以通过快速抛出ConcurrentModificationException、来通知某不支持并发的集合被多个线程、进行了修改的机制。既然命名为”fast”，往往意味着作为一个快速通道，并不涵盖所有的情况。事实也正是如此。在许多集合的实现中，只有发生了结构化修改（集合的结构产生了变动，如增加、减少了结点）时，才会触发fail-fast机制。因此，fail-fast的相关资料通常会有这样的表述：

注意，迭代器的快速失败行为不能得到保证，一般来说，存在非同步的并发修改时，不可能作出任何坚决的保证。快速失败迭代器尽最大努力抛出 ConcurrentModificationException。因此，编写依赖于此异常的程序的做法是错误的，正确做法是：迭代器的快速失败行为应该仅用于检测程序错误。

AbstractList

AbstractList作为许多List实现类的祖先类，它的fail-fast实现规则也被一些子类所继承（如ArrayList与LinkedList）。我们先来看一下AbstractList中迭代器触发fail-fast的机制：

private class Itr implements Iterator<E> {

    int expectedModCount = modCount;

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            AbstractList.this.remove(lastRet);
            if (lastRet < cursor)
                cursor--;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException e) {
            throw new ConcurrentModificationException();
        }
    }


    final void checkForComodification() {
        // 触发fail-fast
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}

可以看到，触发的关键在于modCount与Itr中的字段expectedModCount。由于AbstractList默认实现是不可修改的，因此讨论modCount什么时候被修改、fail-fast什么时候被触发，还得放到它的子类当中来具体分析。

除了fail-fast，AbstractList还涉及到了另一个概念：视图。

// 以原list为底，返回sublist
public List<E> subList(int fromIndex, int toIndex) {
    return (this instanceof RandomAccess ?
            new RandomAccessSubList<>(this, fromIndex, toIndex) :
            new SubList<>(this, fromIndex, toIndex));
}
// 被随机访问接口RandomAccess标识的SubList
class RandomAccessSubList<E> extends SubList<E> implements RandomAccess {
    RandomAccessSubList(AbstractList<E> list, int fromIndex, int toIndex) {
        super(list, fromIndex, toIndex);
    }

    public List<E> subList(int fromIndex, int toIndex) {
        return new RandomAccessSubList<>(this, fromIndex, toIndex);
    }
}

subList方法返回原list的一部分，称之为视图。需要注意的是，视图以原list为底操作，任何对视图的修改都会反应到原list上，原list的任何变动也会对相关视图造成影响。

List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 3, 4, 5));
List<Integer> sub1 = list.subList(0, 3);
List<Integer> sub2 = list.subList(2, 5);
System.out.println(sub1); // [1, 2, 3]
System.out.println(sub2); // [3, 4, 5]

sub1.set(2,-1);
System.out.println(sub1); // [1, 2, -1]
System.out.println(sub2); // [-1, 4, 5]
System.out.println(list); // [1, 2, -1, 4, 5]

ArrayListAndLinkedList

说到ArrayList和LinkedList的区别，各种资料非常多。我们暂时先不管各种说法，单纯看看它们各自在java中的类定义：

// ArrayList实现随机访问接口RandomAccess
public class ArrayList<E> extends AbstractList<E>
        implements List<E>, RandomAccess, Cloneable, java.io.Serializable
{
    //...
}

// LinkedList继承AbstractList的子类AbstractSequentialList——顺序列表
public class LinkedList<E>
    extends AbstractSequentialList<E>
    implements List<E>, Deque<E>, Cloneable, java.io.Serializable
{
    // ...
}

所以，两者最明显的区别就显而易见了：ArrayList支持随机访问（通过下标随机访问元素），而LinkedList不行，只能通过顺序访问。
那么问题来了，实际使用中，LinkedList显然是可以通过get(int index)获取元素的，那为什么说LinkedList只支持顺序访问呢？

为了回答这个问题，先来看看两者get(int index)方法的实现：

// ArrayList
public E get(int index) {
    Objects.checkIndex(index, size);
    return elementData(index);
}

E elementData(int index) {
    // 返回下标对应值
    return (E) elementData[index];
}


// LinkedList
public E get(int index) {
    checkElementIndex(index);
    return node(index).item;
}

Node<E> node(int index) {
    // assert isElementIndex(index);

    if (index < (size >> 1)) {
        Node<E> x = first;
        // 从头部开始遍历，直至index
        for (int i = 0; i < index; i++)
            x = x.next;
        return x;
    } else {
        Node<E> x = last;
        // 从尾部开始遍历，直至index
        for (int i = size - 1; i > index; i--)
            x = x.prev;
        return x;
    }
}

原来LinkedList的get(int index)是通过遍历并计算下标才实现的，get方法的随机访问只是一个假象，ArrayList才是真正的随机访问。

接着，我们可以来看看ArrayList和LinkedList为什么会有这个区别了。总的来说，因为两者实现的数据结构不同：ArrayList采用数组实现，而LinkedList采取双向链表实现。数组实现的优势在于支持通过下标进行随机访问，通常情况下的顺序遍历速度也快于链表，但在发生结构化变动、尤其是从中间删除元素之时，整个数组的变动调整将会很大。而链表就没有这个风险了，只需要改动几个结点的引用罢了。

// ArrayList
public E remove(int index) {
    Objects.checkIndex(index, size);
    final Object[] es = elementData;

    @SuppressWarnings("unchecked") E oldValue = (E) es[index];
    fastRemove(es, index);

    return oldValue;
}

private void fastRemove(Object[] es, int i) {
    modCount++;
    final int newSize;
    if ((newSize = size - 1) > i)
        // i后部分前移一位
        System.arraycopy(es, i + 1, es, i, newSize - i);
    es[size = newSize] = null;
}

// LinkedList
public E remove(int index) {
    checkElementIndex(index);
    return unlink(node(index));
}

/**
 * Unlinks non-null node x.
 */
E unlink(Node<E> x) {
    // assert x != null;
    final E element = x.item;
    final Node<E> next = x.next;
    final Node<E> prev = x.prev;

    if (prev == null) {
        first = next;
    } else {
        prev.next = next;
        x.prev = null;
    }

    if (next == null) {
        last = prev;
    } else {
        next.prev = prev;
        x.next = null;
    }

    x.item = null;
    size--;
    modCount++;
    return element;
}

有点读者可能注意到了，上面的代码出现了一个前文提到过的字段：modcount。实际上，在ArrayList与LinkedList之中，只有发生了结构化的修改（如add、remove、sort方法），modcount才会更改。

现在，我们来仔细看看，Itr中的remove方法：

public void remove() {
    if (lastRet < 0)
        throw new IllegalStateException();
    checkForComodification();

    try {
        // 改变modcount
        AbstractList.this.remove(lastRet);
        if (lastRet < cursor)
            cursor--;
        lastRet = -1;
        // expectedModCount更新
        expectedModCount = modCount;
    } catch (IndexOutOfBoundsException e) {
        throw new ConcurrentModificationException();
    }
}

通常，我们使用List而不是Array的主要原因，在于List几乎不需要在意容量大小。某种程度上，List可以看做自动扩容的数组，尤其是本身就以数组实现的ArrayList。在ArrayList中，自动扩容的方案为：当不满足所需的最小容量时，将会先扩展至原容量的1.5倍。仍然不足，才会扩展至需求的最小容量。也因此，ArrayList的空间占用，往往是大于LinkedList的。

private void grow(int minCapacity) {
    // overflow-conscious code
    int oldCapacity = elementData.length;
    int newCapacity = oldCapacity + (oldCapacity >> 1);
    if (newCapacity - minCapacity < 0)
        newCapacity = minCapacity;
    if (newCapacity - MAX_ARRAY_SIZE > 0)
        newCapacity = hugeCapacity(minCapacity);
    // minCapacity is usually close to size, so this is a win:
    elementData = Arrays.copyOf(elementData, newCapacity);
}

private static int hugeCapacity(int minCapacity) {
    if (minCapacity < 0) // overflow
        throw new OutOfMemoryError();
    return (minCapacity > MAX_ARRAY_SIZE) ?
        Integer.MAX_VALUE :
        MAX_ARRAY_SIZE;
}

ci饭，碎觉，输入，输出