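Good programs depend on well-designed data structures, and choosing the right one can get twice the result for half the effort. In Java, that choice mostly comes down to the standard collection classes.
From a data-structure point of view there are linear lists, queues, stacks, trees, graphs, and so on. From Java's point of view there are several layers: the standard library implements most of these structures directly or indirectly (there is no ready-made graph class, for example), and they are concentrated under the Collection and Map interfaces in the java.util package.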
Figure: a fairly complete diagram of the Java collections hierarchy.
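The two interfaces, together with the commonly used abstract classes and implementation classes derived from them, are outlined below.
The Collection interface has no direct implementation class; it is extended by further interfaces, most notably List and Set. A List is ordered, allows duplicate elements, and has a variable length. A Set does not allow duplicate elements and, in general, does not guarantee any order (HashSet is unordered, although LinkedHashSet and TreeSet do define an iteration order).
A Map is a collection of key-value pairs.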
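The differences between List, Set and Map can be seen in a small sketch. The class name CollectionKindsDemo and its helper methods are made up for illustration; only ArrayList, HashSet and HashMap come from java.util.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical demo class, not part of the JDK.
class CollectionKindsDemo {

    // A List keeps every element in insertion order, duplicates included.
    static List<String> toList(String... items) {
        List<String> list = new ArrayList<>();
        for (String item : items) {
            list.add(item);
        }
        return list;
    }

    // A Set silently drops duplicates; HashSet promises no iteration order.
    static Set<String> toSet(String... items) {
        Set<String> set = new HashSet<>();
        for (String item : items) {
            set.add(item);
        }
        return set;
    }

    // A Map holds one value per key; putting an existing key overwrites it.
    static Map<String, Integer> countWords(String... items) {
        Map<String, Integer> counts = new HashMap<>();
        for (String item : items) {
            Integer old = counts.get(item);
            counts.put(item, old == null ? 1 : old + 1);
        }
        return counts;
    }

    public static void main(String[] args) {
        String[] words = {"Java", "Go", "Java"};
        System.out.println(toList(words).size());           // 3
        System.out.println(toSet(words).size());            // 2
        System.out.println(countWords(words).get("Java"));  // 2
    }
}
```

The same three inputs produce three elements in the List, two in the Set, and a count of 2 under the key "Java" in the Map.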
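The Collection interface extends the Iterable interface. In the JDK version quoted here (before Java 8 added the default methods forEach and spliterator), Iterable declares a single method, iterator(), which returns an Iterator<T>.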
public interface Iterable<T> {
    /**
     * Returns an iterator over a set of elements of type T.
     *
     * @return an Iterator.
     */
    Iterator<T> iterator();
}
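Because the for-each loop works on any Iterable, implementing that one method is enough to make a class usable in it. The following is a minimal sketch; IntRange and collect are hypothetical names chosen for this example, not JDK classes.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical class: yields the integers from start (inclusive) to end (exclusive).
class IntRange implements Iterable<Integer> {
    private final int start;
    private final int end;

    IntRange(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public Iterator<Integer> iterator() {
        return new Iterator<Integer>() {
            private int current = start; // next value to hand out

            @Override
            public boolean hasNext() {
                return current < end;
            }

            @Override
            public Integer next() {
                if (current >= end) {
                    throw new NoSuchElementException();
                }
                return current++;
            }

            @Override
            public void remove() {
                // remove() is an optional operation; a range has nothing to remove.
                throw new UnsupportedOperationException();
            }
        };
    }

    // Drains any Iterable with a for-each loop, which calls iterator() behind the scenes.
    static List<Integer> collect(Iterable<Integer> source) {
        List<Integer> out = new ArrayList<>();
        for (Integer value : source) {
            out.add(value);
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(collect(new IntRange(1, 4))); // [1, 2, 3]
    }
}
```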
The source of the Iterator interface is as follows:
/**
* An iterator over a collection. {@code Iterator} takes the place of
* {@link Enumeration} in the Java Collections Framework. Iterators
* differ from enumerations in two ways:
*
* <ul>
* <li> Iterators allow the caller to remove elements from the
* underlying collection during the iteration with well-defined
* semantics.
* <li> Method names have been improved.
* </ul>
*
* <p>This interface is a member of the
* <a href="{@docRoot}/../technotes/guides/collections/index.html">
* Java Collections Framework</a>.
*
* @param <E> the type of elements returned by this iterator
*
* @author Josh Bloch
* @see Collection
* @see ListIterator
* @see Iterable
* @since 1.2
*/
public interface Iterator<E> {
/**
* Returns {@code true} if the iteration has more elements.
* (In other words, returns {@code true} if {@link #next} would
* return an element rather than throwing an exception.)
*
* @return {@code true} if the iteration has more elements
*/
boolean hasNext();
/**
* Returns the next element in the iteration.
*
* @return the next element in the iteration
* @throws NoSuchElementException if the iteration has no more elements
*/
E next();
/**
* Removes from the underlying collection the last element returned
* by this iterator (optional operation). This method can be called
* only once per call to {@link #next}. The behavior of an iterator
* is unspecified if the underlying collection is modified while the
* iteration is in progress in any way other than by calling this
* method.
*
* @throws UnsupportedOperationException if the {@code remove}
* operation is not supported by this iterator
*
* @throws IllegalStateException if the {@code next} method has not
* yet been called, or the {@code remove} method has already
* been called after the last call to the {@code next}
* method
*/
void remove();
}
As the @since tag shows, Iterator was introduced in JDK 1.2. Before 1.2, the Enumeration interface served as the iterator for collections.
/**
* An object that implements the Enumeration interface generates a
* series of elements, one at a time. Successive calls to the
* <code>nextElement</code> method return successive elements of the
* series.
* <p>
* For example, to print all elements of a <tt>Vector<E></tt> <i>v</i>:
* <pre>
* for (Enumeration<E> e = v.elements(); e.hasMoreElements();)
* System.out.println(e.nextElement());</pre>
* <p>
* Methods are provided to enumerate through the elements of a
* vector, the keys of a hashtable, and the values in a hashtable.
* Enumerations are also used to specify the input streams to a
* <code>SequenceInputStream</code>.
* <p>
* NOTE: The functionality of this interface is duplicated by the Iterator
* interface. In addition, Iterator adds an optional remove operation, and
* has shorter method names. New implementations should consider using
* Iterator in preference to Enumeration.
*
* @see java.util.Iterator
* @see java.io.SequenceInputStream
* @see java.util.Enumeration#nextElement()
* @see java.util.Hashtable
* @see java.util.Hashtable#elements()
* @see java.util.Hashtable#keys()
* @see java.util.Vector
* @see java.util.Vector#elements()
*
* @author Lee Boynton
* @since JDK1.0
*/
public interface Enumeration<E> {
/**
* Tests if this enumeration contains more elements.
*
* @return <code>true</code> if and only if this enumeration object
* contains at least one more element to provide;
* <code>false</code> otherwise.
*/
boolean hasMoreElements();
/**
* Returns the next element of this enumeration if this enumeration
* object has at least one more element to provide.
*
* @return the next element of this enumeration.
* @exception NoSuchElementException if no more elements exist.
*/
E nextElement();
}
Enumeration has existed since JDK 1.0 and was originally used to traverse legacy collection classes such as Vector and Hashtable.
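import java.util.Enumeration;
import java.util.Vector;

public class EnumerationTest {
    public static void main(String[] args) {
        // Use the diamond operator instead of a raw Vector.
        Vector<String> vector = new Vector<>();
        vector.add("Java");
        vector.add("Py");
        vector.add("Go");
        // elements() returns the legacy Enumeration view of the Vector.
        Enumeration<String> enumeration = vector.elements();
        while (enumeration.hasMoreElements()) {
            System.out.println(enumeration.nextElement());
        }
    }
}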
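For comparison, the same Vector can be traversed with an Iterator, which adds the optional remove operation that Enumeration lacks. This is a small sketch; the class IteratorRemoveDemo and its method without are hypothetical names.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.Vector;

// Hypothetical demo class, not part of the JDK.
class IteratorRemoveDemo {

    // Copies source into a Vector and removes every element equal to unwanted
    // while iterating over it.
    static Vector<String> without(List<String> source, String unwanted) {
        Vector<String> vector = new Vector<>(source);
        Iterator<String> it = vector.iterator();
        while (it.hasNext()) {
            if (it.next().equals(unwanted)) {
                it.remove(); // Enumeration has no equivalent of this call
            }
        }
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(without(Arrays.asList("Java", "Py", "Go"), "Py")); // [Java, Go]
    }
}
```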
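The fail-fast mechanism
Once an iterator has been created, the underlying collection should be structurally modified only through the iterator's own remove method (or add, in the case of a ListIterator). If the collection is modified in any other way, whether by another thread or by the same thread calling the collection's own methods, the iterator throws ConcurrentModificationException the next time it is used. This detection is best-effort and is not guaranteed in every case. For example: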
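import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;

public class FailFastTest {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        Iterator<String> iterator = list.iterator();
        while (iterator.hasNext()) {
            // The second call to next() throws ConcurrentModificationException.
            String s = iterator.next();
            if (s.equals("a")) {
                // Modifies the list directly, bypassing the iterator.
                list.remove(s);
            }
        }
        System.out.println(list);
    }
}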
Result: the code above modifies the list directly through list.remove(), and the run ends with a ConcurrentModificationException. Why? The source code gives the answer:
public Iterator<E> iterator() {
    return new Itr();
}

/**
 * An optimized version of AbstractList.Itr
 */
private class Itr implements Iterator<E> {
    int cursor;       // index of next element to return
    int lastRet = -1; // index of last element returned; -1 if no such
    int expectedModCount = modCount;

    public boolean hasNext() {
        return cursor != size;
    }

    @SuppressWarnings("unchecked")
    public E next() {
        checkForComodification();
        int i = cursor;
        if (i >= size)
            throw new NoSuchElementException();
        Object[] elementData = ArrayList.this.elementData;
        if (i >= elementData.length)
            throw new ConcurrentModificationException();
        cursor = i + 1;
        return (E) elementData[lastRet = i];
    }

    public void remove() {
        if (lastRet < 0)
            throw new IllegalStateException();
        checkForComodification();

        try {
            ArrayList.this.remove(lastRet);
            cursor = lastRet;
            lastRet = -1;
            expectedModCount = modCount;
        } catch (IndexOutOfBoundsException ex) {
            throw new ConcurrentModificationException();
        }
    }

    final void checkForComodification() {
        if (modCount != expectedModCount)
            throw new ConcurrentModificationException();
    }
}
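Taking ArrayList as the example:
The inner class Itr defines a field expectedModCount, which holds the modification count the iterator expects the list to have. When list.iterator() returns the iterator, this field is initialized to the current value of modCount, the actual modification count (a field that ArrayList inherits from AbstractList). Both next() and remove() in Itr call checkForComodification(), which compares modCount with expectedModCount and throws ConcurrentModificationException if they are not equal. Itr.remove() itself sets expectedModCount = modCount after deleting, so removal through the iterator keeps the two values in step.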
/**
* The number of times this list has been <i>structurally modified</i>.
* Structural modifications are those that change the size of the
* list, or otherwise perturb it in such a fashion that iterations in
* progress may yield incorrect results.
*
* <p>This field is used by the iterator and list iterator implementation
* returned by the {@code iterator} and {@code listIterator} methods.
* If the value of this field changes unexpectedly, the iterator (or list
* iterator) will throw a {@code ConcurrentModificationException} in
* response to the {@code next}, {@code remove}, {@code previous},
* {@code set} or {@code add} operations. This provides
* <i>fail-fast</i> behavior, rather than non-deterministic behavior in
* the face of concurrent modification during iteration.
*
* <p><b>Use of this field by subclasses is optional.</b> If a subclass
* wishes to provide fail-fast iterators (and list iterators), then it
* merely has to increment this field in its {@code add(int, E)} and
* {@code remove(int)} methods (and any other methods that it overrides
* that result in structural modifications to the list). A single call to
* {@code add(int, E)} or {@code remove(int)} must add no more than
* one to this field, or the iterators (and list iterators) will throw
* bogus {@code ConcurrentModificationExceptions}. If an implementation
* does not wish to provide fail-fast iterators, this field may be
* ignored.
*/
protected transient int modCount = 0;
Structural modifications of the collection, such as add and remove, increment modCount.
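The interplay of modCount and expectedModCount can be reproduced in a few lines. The following is a simplified sketch, not the JDK implementation: TinyBag is a hypothetical append-only container written only to show the check.

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical toy container, not JDK code.
class TinyBag implements Iterable<String> {
    private String[] items = new String[8];
    private int size;
    private int modCount; // incremented on every structural change

    void add(String item) {
        if (size == items.length) {
            items = Arrays.copyOf(items, size * 2);
        }
        items[size++] = item;
        modCount++;
    }

    @Override
    public Iterator<String> iterator() {
        return new Iterator<String>() {
            private int cursor;
            // Snapshot of modCount taken when the iterator is created.
            private final int expectedModCount = modCount;

            @Override
            public boolean hasNext() {
                return cursor != size;
            }

            @Override
            public String next() {
                // Fail fast if the bag changed behind the iterator's back.
                if (modCount != expectedModCount) {
                    throw new ConcurrentModificationException();
                }
                if (cursor >= size) {
                    throw new NoSuchElementException();
                }
                return items[cursor++];
            }

            @Override
            public void remove() {
                throw new UnsupportedOperationException();
            }
        };
    }

    // Returns true if an add() made during iteration is detected by next().
    static boolean detectsModification() {
        TinyBag bag = new TinyBag();
        bag.add("a");
        bag.add("b");
        Iterator<String> it = bag.iterator();
        it.next();
        bag.add("c"); // structural change outside the iterator
        try {
            it.next();
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(detectsModification()); // true
    }
}
```

Note that hasNext() performs no check in this sketch, just as in ArrayList's Itr; the modification is noticed only when next() runs.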
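In the failing code above, list.remove(s) is executed during the iteration and increments modCount. On the next pass through the loop, iterator.next() calls checkForComodification(), which finds that modCount != expectedModCount and throws the exception. Replacing list.remove(s) with removal through the iterator avoids the exception, as follows: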
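import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;

public class FailFastTest {
    public static void main(String[] args) {
        ArrayList<String> list = new ArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        Iterator<String> iterator = list.iterator();
        while (iterator.hasNext()) {
            String s = iterator.next();
            if (s.equals("a")) {
                // Removing through the iterator resyncs expectedModCount with modCount.
                iterator.remove();
            }
        }
        System.out.println(list); // [b, c, d, e]
    }
}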
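When the goal is simply to delete every element matching a condition, the explicit loop can be avoided altogether. The sketch below assumes Java 8 or later, where Collection gained the removeIf method; the class RemoveIfDemo and the method withoutA are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class; requires Java 8+ for removeIf and lambdas.
class RemoveIfDemo {

    // Returns a copy of source with every "a" removed.
    static List<String> withoutA(List<String> source) {
        List<String> list = new ArrayList<>(source);
        // removeIf performs the removal inside the collection itself,
        // so no external iterator is left with a stale expectedModCount.
        list.removeIf(s -> s.equals("a"));
        return list;
    }

    public static void main(String[] args) {
        System.out.println(withoutA(Arrays.asList("a", "b", "c", "d", "e"))); // [b, c, d, e]
    }
}
```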
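Fail-safe, the counterpart of fail-fast
As the name suggests, a fail-safe iterator does not throw ConcurrentModificationException. The general-purpose collection classes in java.util provide fail-fast iterators, whereas the iterators of the collections in java.util.concurrent either work on a snapshot (CopyOnWriteArrayList) or are weakly consistent (ConcurrentHashMap, for example), and so do not throw this exception.
Replacing the non-thread-safe ArrayList with a class from the java.util.concurrent package therefore also avoids the fail-fast exception.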
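import java.util.Arrays;
import java.util.Iterator;
import java.util.concurrent.CopyOnWriteArrayList;

public class FailFastTest {
    public static void main(String[] args) {
        // Parameterized type instead of the raw CopyOnWriteArrayList.
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<>(Arrays.asList("a", "b", "c", "d", "e"));
        // The iterator walks a snapshot of the array taken at this point.
        Iterator<String> iterator = list.iterator();
        while (iterator.hasNext()) {
            String s = iterator.next();
            if (s.equals("a")) {
                // Safe: the write goes to a fresh copy of the array.
                list.remove(s);
            }
        }
        System.out.println(list); // [b, c, d, e]
    }
}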
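The price of this behaviour is that the iterator of a CopyOnWriteArrayList sees only the snapshot taken when it was created, and its own remove() is not supported. The sketch below illustrates both points; SnapshotIteratorDemo and its methods are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical demo class, not part of the JDK.
class SnapshotIteratorDemo {

    // Iterates over list, removing "a" from the list itself on the way,
    // and returns the elements the iterator actually visited.
    static List<String> seenByIterator(CopyOnWriteArrayList<String> list) {
        List<String> seen = new ArrayList<>();
        Iterator<String> it = list.iterator(); // snapshot taken here
        while (it.hasNext()) {
            String s = it.next();
            seen.add(s);
            if (s.equals("a")) {
                list.remove(s); // changes the list, not the snapshot
            }
        }
        return seen;
    }

    // Returns false because the snapshot iterator rejects remove().
    static boolean iteratorRemoveSupported() {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<String>(Arrays.asList("a"));
        Iterator<String> it = list.iterator();
        it.next();
        try {
            it.remove();
            return true;
        } catch (UnsupportedOperationException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        CopyOnWriteArrayList<String> list =
                new CopyOnWriteArrayList<String>(Arrays.asList("a", "b", "c"));
        System.out.println(seenByIterator(list));      // [a, b, c]
        System.out.println(list);                      // [b, c]
        System.out.println(iteratorRemoveSupported()); // false
    }
}
```

Every write copies the backing array, so this class tends to suit lists that are read far more often than they are modified.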