Java 8 ConcurrentHashMap: Reading the Storage and Resize Source Code


syrdbt 2024-06-10 14:34:52


© Copyright belongs to the author. This is an original work by syrdbt on the 51CTO blog; contact the author for permission before reposting.

Contents

1. Overview
2. Getting Started Example
3. Fields
4. Core Methods

4.1 put
4.2 initTable
4.3 transfer
4.4 sizeCtl
4.5 sizeCtl bug

1. Overview

ConcurrentHashMap is a thread-safe and efficient counterpart to HashMap.

For HashMap itself, see my earlier article on it.

2. Getting Started Example

import java.util.concurrent.ConcurrentHashMap;

public class MyStudy {
    public static void main(String[] args) {
        ConcurrentHashMap<Integer, String> concurrentHashMap = new ConcurrentHashMap<>();
        // Insert a key-value pair with key 1 and value "1".
        // put is thread-safe, so it can be called concurrently without corrupting the map.
        concurrentHashMap.put(1, "1");
        System.out.println(concurrentHashMap.get(1));

        // putIfAbsent does not replace the value if the key already exists.
        concurrentHashMap.putIfAbsent(1, "2");
        System.out.println(concurrentHashMap.get(1));
    }
}

Output (screenshot omitted): both calls print 1, because putIfAbsent leaves the existing value in place.

HashMap is not thread-safe:

import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.TimeUnit;

public class MyStudy {
    public static void main(String[] args) throws InterruptedException {
        for (int count = 0; count < 10; count++) {
            Map<Integer, String> hashMap = new HashMap<Integer, String>();
            new Thread(() -> {
                for (int i = 0; i < 10000; i++) {
                    hashMap.put(i, i + "a");
                }
            }).start();
            new Thread(() -> {
                for (int i = 0; i < 10000; i++) {
                    hashMap.put(i, i + "b");
                }
            }).start();
            TimeUnit.SECONDS.sleep(3);
            System.out.println(hashMap.size());
        }
    }
}

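Output (screenshot omitted): both threads write the same 10000 keys, so every round should print 10000, but with two threads writing concurrently the printed sizes come out wrong.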

Now the same test with ConcurrentHashMap, to check that it is thread-safe:

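import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.TimeUnit;

public class MyStudy {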
    public static void main(String[] args) throws InterruptedException {
        for (int count = 0; count < 10; count++) {
            Map<Integer, String> map = new ConcurrentHashMap<Integer, String>();
            new Thread(() -> {
                for (int i = 0; i < 10000; i++) {
                    map.put(i, i + "a");
                }
            }).start();
            new Thread(() -> {
                for (int i = 0; i < 10000; i++) {
                    map.put(i, i + "b");
                }
            }).start();
            TimeUnit.SECONDS.sleep(3);
            System.out.println(map.size());
        }
    }
}

Output (screenshot omitted): the sizes are correct this time. Next, let's look at the source.
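As a side note, the tests above wait with a fixed 3-second sleep. A variant that waits with Thread.join is a little more deterministic. The sketch below is only an illustration; the class name JoinBasedCheck and the helper run are made up for this example and are not part of the original tests.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class JoinBasedCheck {
    // Hypothetical helper: two threads write the same 10000 keys; returns the final size.
    static int run(Map<Integer, String> map) {
        Thread a = new Thread(() -> {
            for (int i = 0; i < 10000; i++) map.put(i, i + "a");
        });
        Thread b = new Thread(() -> {
            for (int i = 0; i < 10000; i++) map.put(i, i + "b");
        });
        a.start();
        b.start();
        try {
            // join() returns only after the writer has finished,
            // and makes its writes visible to this thread.
            a.join();
            b.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return map.size();
    }

    public static void main(String[] args) {
        System.out.println(run(new ConcurrentHashMap<>()));
    }
}
```

With a ConcurrentHashMap the result is 10000 on every run, since both threads use the same key set 0..9999.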

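3. Fields

table holds the data. It is an array of Node; each slot (bin) holds the head of a linked list or a TreeBin that wraps a red-black tree. The index of an entry in table is (n - 1) & hash, where n is the table length.
nextTable is the new array that transfer moves the data into during a resize; it is non-null only while a resize is in progress. get is not blocked by a resize: if it lands on a bin that has already been moved, the ForwardingNode in that bin redirects the lookup to nextTable.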

transient volatile Node<K,V>[] table;
private transient volatile Node<K,V>[] nextTable;
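To make the index rule (n - 1) & hash concrete, here is a small sketch. The spread method mirrors the JDK 8 implementation (it XORs the high 16 bits into the low 16 and clears the sign bit); the class name BinIndexDemo and the helper indexFor are made up for the example.

```java
class BinIndexDemo {
    static final int HASH_BITS = 0x7fffffff; // clears the sign bit so the hash stays non-negative

    // Same mixing step as ConcurrentHashMap.spread in JDK 8.
    static int spread(int h) {
        return (h ^ (h >>> 16)) & HASH_BITS;
    }

    // Hypothetical helper: bin index for a table of length n (n must be a power of two).
    static int indexFor(int hash, int n) {
        return (n - 1) & hash;
    }

    public static void main(String[] args) {
        int n = 16;
        for (Object key : new Object[] {1, 17, "b", 65536}) {
            int hash = spread(key.hashCode());
            System.out.println(key + " -> bin " + indexFor(hash, n));
        }
    }
}
```

The key 65536 shows why spread matters: its low 4 bits are all zero, so without the mixing step it would land in bin 0, while with it the high bits contribute and it lands in bin 1.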

4. Core Methods

4.1 put

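Compute the hash: int hash = spread(key.hashCode());
If tab is null or empty, initialize it: initTable()
If the bin for this hash is empty, insert the node with a thread-safe CAS: casTabAt(tab, i, null, new Node<K,V>(hash, key, value, null))
If the bin is non-empty but is being moved by a resize, the current thread helps move data to the new array: tab = helpTransfer(tab, f);
If the bin is non-empty and is not being moved, lock the bin's head node with synchronized (f) and insert the entry under the lock
addCount(1L, binCount); increments the element count by one; if a resize is needed, the data is moved by void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab)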
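The two insertion paths above (CAS into an empty bin, synchronized on the head node otherwise) can be sketched in isolation. This is a deliberately reduced illustration, not the JDK code: MiniBinMap is a made-up class with a fixed 16-slot table, it uses AtomicReferenceArray instead of Unsafe, and it has no resize, tree bins, removal, or element counting.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;

class MiniBinMap<K, V> {
    static final class Node<K, V> {
        final int hash; final K key; volatile V val; volatile Node<K, V> next;
        Node(int hash, K key, V val) { this.hash = hash; this.key = key; this.val = val; }
    }

    // Assumption for the sketch: fixed-size table that never grows.
    private final AtomicReferenceArray<Node<K, V>> table = new AtomicReferenceArray<>(16);

    private static int spread(int h) { return (h ^ (h >>> 16)) & 0x7fffffff; }

    V put(K key, V value) {
        int hash = spread(key.hashCode());
        int i = (table.length() - 1) & hash;
        for (;;) {
            Node<K, V> f = table.get(i);
            if (f == null) {
                // Empty bin: publish the first node with a CAS, no lock needed.
                if (table.compareAndSet(i, null, new Node<>(hash, key, value)))
                    return null;
                // CAS lost: another thread filled the bin first, so loop and retry.
            } else {
                synchronized (f) {               // lock only this bin's head node
                    if (table.get(i) == f) {     // head unchanged since it was read
                        for (Node<K, V> e = f;; e = e.next) {
                            if (e.hash == hash && key.equals(e.key)) {
                                V old = e.val;
                                e.val = value;   // existing key: overwrite the value
                                return old;
                            }
                            if (e.next == null) {
                                e.next = new Node<>(hash, key, value); // append at the tail
                                return null;
                            }
                        }
                    }
                }
            }
        }
    }

    V get(K key) {
        int hash = spread(key.hashCode());
        for (Node<K, V> e = table.get((table.length() - 1) & hash); e != null; e = e.next)
            if (e.hash == hash && key.equals(e.key)) return e.val;
        return null;
    }

    public static void main(String[] args) {
        MiniBinMap<Integer, String> m = new MiniBinMap<>();
        m.put(1, "a");
        m.put(17, "c"); // same bin as key 1 in a 16-slot table
        System.out.println(m.get(1) + " " + m.get(17));
    }
}
```

The point of the design is that writers to different bins never contend: an empty bin costs one CAS, and a non-empty bin locks only its own head node.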

The put source:

public V put(K key, V value) {
    return putVal(key, value, false);
}                                

final V putVal(K key, V value, boolean onlyIfAbsent) {
    if (key == null || value == null) throw new NullPointerException();
    // Compute the hash
    int hash = spread(key.hashCode());
    int binCount = 0;
    for (Node<K,V>[] tab = table;;) {
        Node<K,V> f; int n, i, fh;
        if (tab == null || (n = tab.length) == 0)
            // Table not created yet: initialize it
            tab = initTable();
        else if ((f = tabAt(tab, i = (n - 1) & hash)) == null) {
            // The bin is empty: insert the new node with a CAS
            if (casTabAt(tab, i, null,
                         new Node<K,V>(hash, key, value, null)))
                break;                   // no lock when adding to empty bin
        }
        else if ((fh = f.hash) == MOVED)
            // The bin head is a ForwardingNode (resize in progress): help with the transfer
            tab = helpTransfer(tab, f);
        else {
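            // The bin is non-empty: lock its head node so no other thread can modify the bin.
            // Linked list: append the new node at the tail.
            // Red-black tree: insert through the tree's own insert method.
            V oldVal = null;
            // Lock the bin's head node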
            synchronized (f) {
                // Re-check that the head has not changed since it was read
                if (tabAt(tab, i) == f) {
                    // Linked list (head hash >= 0)
                    if (fh >= 0) {
                        binCount = 1;
                        for (Node<K,V> e = f;; ++binCount) {
                            K ek;
                            if (e.hash == hash &&
                                ((ek = e.key) == key ||
                                 (ek != null && key.equals(ek)))) {
                                // Key already present: remember the old value
                                oldVal = e.val;
                                if (!onlyIfAbsent)
                                    e.val = value;
                                break;
                            }
                            // Key not found: append at the tail of the list
                            Node<K,V> pred = e;
                            if ((e = e.next) == null) {
                                pred.next = new Node<K,V>(hash, key,
                                                          value, null);
                                break;
                            }
                        }
                    }
                    // Red-black tree
                    else if (f instanceof TreeBin) {
                        Node<K,V> p;
                        binCount = 2;
                        if ((p = ((TreeBin<K,V>)f).putTreeVal(hash, key,
                                                       value)) != null) {
                            // Key already present: remember the old value
                            oldVal = p.val;
                            if (!onlyIfAbsent)
                                p.val = value;
                        }
                    }
                }
            }
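            // binCount != 0 means the bin was processed under the lock
            if (binCount != 0) {
                // The list length reached TREEIFY_THRESHOLD
                if (binCount >= TREEIFY_THRESHOLD)
                    // Convert the list to a tree (or resize instead if the table is still small)
                    treeifyBin(tab, i);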
                if (oldVal != null)
                    return oldVal;
                break;
            }
        }
    }
    // Increment the element count by 1; if a resize is needed, move the data
    addCount(1L, binCount);
    return null;
}

4.2 initTable

initTable initializes the table array:

private final Node<K,V>[] initTable() {
    Node<K,V>[] tab; int sc;
    while ((tab = table) == null || tab.length == 0) {
        // sizeCtl < 0: another thread is already initializing; yield the CPU
        if ((sc = sizeCtl) < 0)
            Thread.yield(); // lost initialization race; just spin
        // CAS sizeCtl to -1 so that no other thread initializes concurrently
        else if (U.compareAndSwapInt(this, SIZECTL, sc, -1)) {
            try {
                // Double-check: table is still null or empty, so it has not been initialized yet
                if ((tab = table) == null || tab.length == 0) {
                    int n = (sc > 0) ? sc : DEFAULT_CAPACITY;
                    @SuppressWarnings("unchecked")
                    Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n];
                    table = tab = nt;
                    sc = n - (n >>> 2); // 0.75 * n, the next resize threshold
                }
            } finally {
                sizeCtl = sc;
            }
            break;
        }
    }
    return tab;
}
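The pattern in initTable (a CAS on a control field to elect one initializer, a re-check inside, and a finally that always restores the field) can be tried out on its own. The sketch below is an illustration only: LazyTable and threshold are made-up names, and AtomicInteger stands in for the Unsafe-based CAS on sizeCtl.

```java
import java.util.concurrent.atomic.AtomicInteger;

class LazyTable {
    private volatile Object[] table;
    // Stand-in for sizeCtl: 0 or positive = requested size / threshold, -1 = initializing.
    private final AtomicInteger sizeCtl = new AtomicInteger(0);

    Object[] initTable() {
        Object[] tab;
        while ((tab = table) == null) {
            int sc = sizeCtl.get();
            if (sc < 0) {
                Thread.yield();                      // lost the race; wait for the winner
            } else if (sizeCtl.compareAndSet(sc, -1)) {
                try {
                    if ((tab = table) == null) {     // re-check after winning the CAS
                        int n = (sc > 0) ? sc : 16;  // 16 plays the role of DEFAULT_CAPACITY
                        table = tab = new Object[n];
                        sc = n - (n >>> 2);          // 0.75 * n
                    }
                } finally {
                    sizeCtl.set(sc);                 // always leave the -1 state
                }
                break;
            }
        }
        return tab;
    }

    // Hypothetical accessor, only here so the example can show the stored threshold.
    int threshold() { return sizeCtl.get(); }

    public static void main(String[] args) {
        LazyTable t = new LazyTable();
        System.out.println(t.initTable().length + " " + t.threshold());
    }
}
```

The finally block matters: if the allocation throws, sizeCtl goes back to its previous value instead of staying at -1, so other threads do not spin forever.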

4.3 transfer

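Compute the stride (the number of bins a thread claims per step) from the CPU count NCPU, with MIN_TRANSFER_STRIDE as the lower limit.
If the new array is null, create it at twice the old length: (Node<K,V>[])new Node<?,?>[n << 1]
transferIndex is the transfer cursor. It starts at n, so bins are moved from the end of table backward. To claim a range, a thread computes bound = transferIndex - stride (or 0 if fewer than stride bins remain), records i = nextIndex - 1 as the first bin to move, and lowers transferIndex to bound.
While --i >= bound, the thread moves the bins in its range one at a time. Every thread goes through the same steps to obtain its own i and bound; because transferIndex is updated with a CAS, no range is handed out twice.
When the last thread finishes, it sets finishing = advance = true; i = n; // recheck before commit, and sweeps the whole table once more.
After the recheck, the map switches to the new array: table = nextTab; sizeCtl = (n << 1) - (n >>> 1); that is 1.5n, or 0.75 times the new length 2n.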
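The stride formula and the way ranges are claimed from transferIndex can be sketched as follows. The formula is the one at the top of transfer; the class StrideDemo, the helper claim, and passing the CPU count as a parameter are assumptions made for the demo, with AtomicInteger standing in for the Unsafe CAS.

```java
import java.util.concurrent.atomic.AtomicInteger;

class StrideDemo {
    static final int MIN_TRANSFER_STRIDE = 16;

    // Mirrors the stride computation in transfer(); ncpu is a parameter here for testing.
    static int stride(int n, int ncpu) {
        int s = (ncpu > 1) ? (n >>> 3) / ncpu : n;
        return Math.max(s, MIN_TRANSFER_STRIDE);
    }

    // Hypothetical helper: claims the next range below transferIndex.
    // Returns {bound, i} with i the highest bin of the range, or null when nothing is left.
    static int[] claim(AtomicInteger transferIndex, int stride) {
        for (;;) {
            int nextIndex = transferIndex.get();
            if (nextIndex <= 0) return null;
            int nextBound = (nextIndex > stride) ? nextIndex - stride : 0;
            if (transferIndex.compareAndSet(nextIndex, nextBound))
                return new int[] {nextBound, nextIndex - 1};
        }
    }

    public static void main(String[] args) {
        int n = 64, stride = stride(n, 4);
        AtomicInteger transferIndex = new AtomicInteger(n);
        for (int[] r; (r = claim(transferIndex, stride)) != null; )
            System.out.println("bins " + r[1] + " down to " + r[0]);
    }
}
```

For n = 64 and 4 CPUs the raw value (64 >>> 3) / 4 = 2 is below the minimum, so the stride is 16 and the table is handed out as four ranges, starting with bins 63 down to 48.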

private final void transfer(Node<K,V>[] tab, Node<K,V>[] nextTab) {
    // Length of the old array
    int n = tab.length, stride;

    // Compute the stride (bins claimed per step) from NCPU, with MIN_TRANSFER_STRIDE as the minimum
    if ((stride = (NCPU > 1) ? (n >>> 3) / NCPU : n) < MIN_TRANSFER_STRIDE)
        stride = MIN_TRANSFER_STRIDE; // subdivide range
    
    // If the new array is null, create it at twice the old length (n << 1)
    if (nextTab == null) { // initiating
        try {
            @SuppressWarnings("unchecked")
            Node<K,V>[] nt = (Node<K,V>[])new Node<?,?>[n << 1];
            nextTab = nt;
        } catch (Throwable ex) { // try to cope with OOME
            sizeCtl = Integer.MAX_VALUE;
            return;
        }
        nextTable = nextTab;

        // Bins are transferred starting from the end of table
        transferIndex = n;
    }
    
    // Length of the new array
    int nextn = nextTab.length;

    // Forwarding node that marks a bin as already moved
    ForwardingNode<K,V> fwd = new ForwardingNode<K,V>(nextTab);

    boolean advance = true;
    boolean finishing = false; // to ensure sweep before committing nextTab
    
    // Loop until done; i starts at the top of each claimed range and counts down toward 0
    for (int i = 0, bound = 0;;) {
        Node<K,V> f; int fh;
        
        while (advance) {
            int nextIndex, nextBound;
            // --i: step to the next bin inside the range this thread has claimed
            if (--i >= bound || finishing)
                advance = false;
            // No ranges left to claim
            else if ((nextIndex = transferIndex) <= 0) {
                i = -1;
                advance = false;
            }
            // Claim the next range with a CAS on transferIndex; bound is its lower end
            else if (U.compareAndSwapInt
                     (this, TRANSFERINDEX, nextIndex,
                      nextBound = (nextIndex > stride ?
                                   nextIndex - stride : 0))) {
                bound = nextBound;
                // i = nextIndex - 1 is the first (highest) bin of the claimed range
                i = nextIndex - 1;
                advance = false;
            }
        }
        
        // This thread has no more bins to transfer
        if (i < 0 || i >= n || i + n >= nextn) {
            int sc;
            if (finishing) {
                // Recheck done: switch over to nextTab
                nextTable = null;
                table = nextTab;
                sizeCtl = (n << 1) - (n >>> 1);
                return;
            }
            if (U.compareAndSwapInt(this, SIZECTL, sc = sizeCtl, sc - 1)) {
                if ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT)
                    return;
                finishing = advance = true;
                // Last thread out: sweep the whole table once more
                i = n; // recheck before commit
            }
        }
        // Empty bin: mark it with the forwarding node
        else if ((f = tabAt(tab, i)) == null)
            advance = casTabAt(tab, i, null, fwd);
        // Bin already transferred
        else if ((fh = f.hash) == MOVED)
            advance = true; // already processed
        // Non-empty bin: lock it and move its nodes
        else {
            synchronized (f) {
                if (tabAt(tab, i) == f) {
                    Node<K,V> ln, hn;
                    if (fh >= 0) {
                        int runBit = fh & n;
                        Node<K,V> lastRun = f;
                        for (Node<K,V> p = f.next; p != null; p = p.next) {
                            int b = p.hash & n;
                            if (b != runBit) {
                                runBit = b;
                                lastRun = p;
                            }
                        }
                        if (runBit == 0) {
                            ln = lastRun;
                            hn = null;
                        }
                        else {
                            hn = lastRun;
                            ln = null;
                        }
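                        // Nodes from lastRun to the tail all go to the same new bin and are reused as-is.
                        // Nodes before lastRun are copied and prepended to the low list (ln) or the high list (hn).
                        for (Node<K,V> p = f; p != lastRun; p = p.next) {
                            int ph = p.hash; K pk = p.key; V pv = p.val;
                            if ((ph & n) == 0)
                                ln = new Node<K,V>(ph, pk, pv, ln);
                            else
                                hn = new Node<K,V>(ph, pk, pv, hn);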
                        }
                        // Publish the two lists in the new array: ln at i, hn at i + n
                        setTabAt(nextTab, i, ln);
                        setTabAt(nextTab, i + n, hn);
                        // Put the ForwardingNode into the old array's bin.
                        // A put that finds a ForwardingNode here no longer touches this bin's data.
                        setTabAt(tab, i, fwd);
                        advance = true;
                    }
                    // Copying a red-black tree bin
                    else if (f instanceof TreeBin) {
                        // ...
                    }
                }
            }
        }
    } 
}
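The reason transfer only needs two target lists (ln at index i, hn at index i + n) is that doubling the table adds exactly one bit to the index mask. A small check of that rule, with the made-up class SplitDemo and helper newIndex:

```java
class SplitDemo {
    // Index after doubling a table of length n (n a power of two):
    // unchanged if (hash & n) == 0, otherwise old index + n.
    static int newIndex(int hash, int n) {
        int i = hash & (n - 1);
        return (hash & n) == 0 ? i : i + n;
    }

    public static void main(String[] args) {
        int n = 16;
        // All four hashes share bin 5 in a 16-slot table.
        for (int hash : new int[] {5, 21, 37, 53}) {
            System.out.println("hash " + hash + ": bin " + (hash & (n - 1))
                    + " -> bin " + newIndex(hash, n));
        }
    }
}
```

Hashes 5 and 37 have bit 16 clear and stay in bin 5 (the ln list); 21 and 53 have it set and move to bin 21 (the hn list). The result always equals hash & (2n - 1), which is the index a fresh lookup in the new table would compute.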

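4.4 sizeCtl

sizeCtl takes one of the following forms:

sizeCtl > 0: before the table is created, the initial capacity to use (for example this.sizeCtl = DEFAULT_CAPACITY;); after initialization, the element count that triggers the next resize (0.75 × table length).
sizeCtl = 0: the default initial value.
sizeCtl = -1: the table is being initialized; set in initTable() by U.compareAndSwapInt(this, SIZECTL, sc, -1).
sizeCtl < -1: the table is being resized. The high 16 bits hold the resize stamp (version), and the low 16 bits hold the number of threads taking part in the resize plus 1.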

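During a resize sizeCtl is more involved. Take 32 -> 64 as an example, with n = 32:

Step 1: int rs = resizeStamp(n); gives rs = 1000 0000 0001 1010 (Integer.numberOfLeadingZeros(32) is 26, binary 11010, with bit 15 set on top).
The low 16 bits hold the thread count plus 1. The thread that starts the resize runs U.compareAndSwapInt(this, SIZECTL, sc, (rs << RESIZE_STAMP_SHIFT) + 2), which turns sizeCtl into the negative number 1000 0000 0001 1010 0000 0000 0000 0010. It adds 2 rather than 1 because a final recheck pass follows the transfer; the extra 1 accounts for that pass.
The high 16 bits hold the resize stamp. It is checked in two places: ((sc - 2) != resizeStamp(n) << RESIZE_STAMP_SHIFT) and (sc >>> RESIZE_STAMP_SHIFT) != rs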
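The bit patterns in the 32 -> 64 example can be reproduced with a few lines. resizeStamp below uses the same formula as JDK 8 (RESIZE_STAMP_BITS is 16 there); the class name ResizeStampDemo is made up, and the program only computes the values, it does not touch a real map.

```java
class ResizeStampDemo {
    static final int RESIZE_STAMP_BITS = 16;
    static final int RESIZE_STAMP_SHIFT = 32 - RESIZE_STAMP_BITS;

    // Same formula as ConcurrentHashMap.resizeStamp in JDK 8.
    static int resizeStamp(int n) {
        return Integer.numberOfLeadingZeros(n) | (1 << (RESIZE_STAMP_BITS - 1));
    }

    public static void main(String[] args) {
        int rs = resizeStamp(32);
        int sc = (rs << RESIZE_STAMP_SHIFT) + 2;     // value the first resizing thread installs
        System.out.println(Integer.toBinaryString(rs));
        System.out.println(Integer.toBinaryString(sc));
        System.out.println(sc < 0);                               // bit 15 of rs becomes the sign bit
        System.out.println((sc >>> RESIZE_STAMP_SHIFT) == rs);    // stamp recovered from the high half
        System.out.println(sc & 0xffff);                          // low half: thread count + 1
    }
}
```

Setting bit 15 of the stamp is what guarantees sizeCtl is negative once the stamp is shifted into the high 16 bits, which is how the other code paths recognize a resize in progress.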