A Deep Dive into HashMap and ConcurrentHashMap in Java 7

Date: 2020-06-10

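This article takes a close look at the source code of HashMap and ConcurrentHashMap in Java 7, explains why HashMap is not thread-safe and how to address that, and finally verifies the analysis with test cases.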

1 Java 7 HashMap

The data structure of HashMap:

[Figure: the HashMap data structure]

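As the figure above shows, HashMap is backed by an array, and each slot of that array holds a linked list.

Looking at the HashMap source in the JDK, the constructor contains this line: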

    public HashMap(int initialCapacity, float loadFactor) {
        ...
        table = new Entry[capacity];
        ...
    }

In other words, it creates an Entry array of size capacity. Entry is defined as follows:

    static class Entry<K,V> implements Map.Entry<K,V> {
        final K key;
        V value;
        Entry<K,V> next;
        final int hash;
        ...
    }

Entry is a static class that holds a key and a value (the key-value pair), plus a next reference that points to the following Entry in the list.

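capacity: the current array capacity. It is always a power of two (2^n) and can grow; after a resize the array is twice its previous size. The default initial capacity is 16.

loadFactor: the load factor, 0.75 by default.

threshold: the resize threshold, equal to capacity * loadFactor.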
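To make the arithmetic concrete, the following small sketch recomputes the threshold for the default settings and for the next two doublings. It is not JDK code: the class name ThresholdDemo and the helper thresholdFor are made up for this illustration.

```java
// Illustration only: ThresholdDemo and thresholdFor are hypothetical names, not JDK APIs.
class ThresholdDemo {
    static final int DEFAULT_INITIAL_CAPACITY = 16;
    static final float DEFAULT_LOAD_FACTOR = 0.75f;

    // threshold = capacity * loadFactor, truncated to an int
    static int thresholdFor(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        int capacity = DEFAULT_INITIAL_CAPACITY;
        for (int i = 0; i < 3; i++) {
            System.out.println("capacity=" + capacity
                    + " threshold=" + thresholdFor(capacity, DEFAULT_LOAD_FACTOR));
            capacity *= 2; // each resize doubles the array
        }
    }
}
```

With the defaults, the thresholds come out as 12, 24 and 48 for capacities 16, 32 and 64.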

1.1 The put process

    public V put(K key, V value) {
        // On the first insertion, the array has to be initialized
        if (table == EMPTY_TABLE) {
            inflateTable(threshold);
        }
        // A null key is stored in table[0]
        if (key == null)
            return putForNullKey(value);
        // Hash of the key
        int hash = hash(key);
        // Locate the array index for that hash
        int i = indexFor(hash, table.length);
        // Walk the list at that index to see whether the key already exists;
        // if it does, overwrite the value and return the old one
        for (Entry<K,V> e = table[i]; e != null; e = e.next) {
            Object k;
            if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
                V oldValue = e.value;
                e.value = value;
                e.recordAccess(this);
                return oldValue;
            }
        }

        modCount++;
        // No existing key found: add a new entry to the list
        addEntry(hash, key, value, i);
        return null;
    }

Several of the methods used here deserve a closer look.

Array initialization

    private void inflateTable(int toSize) {
        // Make sure the array size is always a power of two
        int capacity = roundUpToPowerOf2(toSize);
        // Compute the resize threshold
        threshold = (int) Math.min(capacity * loadFactor, MAXIMUM_CAPACITY + 1);
        // Allocate the array
        table = new Entry[capacity];
        initHashSeedAsNeeded(capacity);
    }
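The excerpt above calls roundUpToPowerOf2 without showing its body. The sketch below is one way to round a requested size up to the next power of two using Integer.highestOneBit. The wrapper class PowerOfTwo is hypothetical, and the cap of 1 << 30 is stated here as an assumption about MAXIMUM_CAPACITY rather than taken from the article.

```java
// Sketch only: PowerOfTwo is a hypothetical wrapper class for illustration.
class PowerOfTwo {
    // Assumed upper bound on the table size (the article does not give the value).
    static final int MAXIMUM_CAPACITY = 1 << 30;

    // Returns the smallest power of two that is >= number, capped at MAXIMUM_CAPACITY.
    static int roundUpToPowerOf2(int number) {
        if (number >= MAXIMUM_CAPACITY) {
            return MAXIMUM_CAPACITY;
        }
        if (number <= 1) {
            return 1;
        }
        // Doubling (number - 1) puts its highest set bit exactly at the
        // smallest power of two that is >= number.
        return Integer.highestOneBit((number - 1) << 1);
    }

    public static void main(String[] args) {
        int[] requested = {1, 12, 16, 17, 1000};
        for (int n : requested) {
            System.out.println(n + " -> " + roundUpToPowerOf2(n));
        }
    }
}
```

For example, a requested size of 12 rounds up to 16, while 17 rounds up to 32.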

Locating the array index

    static int indexFor(int hash, int length) {
        // Equivalent to a modulo operation, but cheaper
        return hash & (length-1);
    }

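Because the length of HashMap's backing array is always 2^n, hash & (length-1) yields the same result as reducing hash modulo length to a value in [0, length), and a single bitwise AND is cheaper than a division-based modulo.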
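The equivalence can be checked directly. The sketch below compares the mask with a non-negative modulo for a few sample hashes, including a negative one (Java's % operator alone would return a negative remainder there). IndexForDemo and nonNegativeMod are hypothetical names used only for this illustration.

```java
// Illustration only: IndexForDemo and nonNegativeMod are hypothetical names.
class IndexForDemo {
    // Same expression as HashMap.indexFor in the excerpt above.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // Modulo that always returns a value in [0, length), even for negative hashes.
    static int nonNegativeMod(int hash, int length) {
        return ((hash % length) + length) % length;
    }

    public static void main(String[] args) {
        int length = 16; // must be a power of two for the mask to work
        int[] hashes = {0, 5, 16, 21, 12345, -7};
        for (int h : hashes) {
            System.out.println(h + ": mask=" + indexFor(h, length)
                    + " mod=" + nonNegativeMod(h, length));
        }
    }
}
```

The two columns agree for every hash as long as length is a power of two. For any other length the mask skips some indexes, which is why the capacity is kept at 2^n.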

Adding a node to the list

    void addEntry(int hash, K key, V value, int bucketIndex) {
        // Resize if the map has reached the threshold and the target slot is already occupied
        if ((size >= threshold) && (null != table[bucketIndex])) {
            // Double the array
            resize(2 * table.length);
            // Recompute the hash
            hash = (null != key) ? hash(key) : 0;
            // Compute the index in the resized array
            bucketIndex = indexFor(hash, table.length);
        }
        createEntry(hash, key, value, bucketIndex);
    }

    // New elements are always added at the head of the list
    void createEntry(int hash, K key, V value, int bucketIndex) {
        // Fetch the Entry currently at bucketIndex
        Entry<K,V> e = table[bucketIndex];
        // Put the new Entry at bucketIndex and make it point to the previous one
        table[bucketIndex] = new Entry<>(hash, key, value, e);
        size++;
    }
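The head insertion performed by createEntry can be isolated in a few lines. The sketch below models a single bucket only; HeadInsertDemo, Node, addFirst and keysInOrder are hypothetical names, and the hashing and resizing logic is deliberately left out.

```java
// Sketch of one bucket only: all names here are hypothetical, not JDK classes.
class HeadInsertDemo {
    static final class Node {
        final String key;
        final String value;
        final Node next;

        Node(String key, String value, Node next) {
            this.key = key;
            this.value = value;
            this.next = next;
        }
    }

    // Head of the linked list stored in this single bucket.
    private Node head;

    // Mirrors createEntry: the new node becomes the head and points at the old head.
    void addFirst(String key, String value) {
        head = new Node(key, value, head);
    }

    // Lists the keys from head to tail, joined by " -> ".
    String keysInOrder() {
        StringBuilder sb = new StringBuilder();
        for (Node n = head; n != null; n = n.next) {
            if (sb.length() > 0) {
                sb.append(" -> ");
            }
            sb.append(n.key);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        HeadInsertDemo bucket = new HeadInsertDemo();
        bucket.addFirst("a", "1");
        bucket.addFirst("b", "2");
        bucket.addFirst("c", "3");
        System.out.println(bucket.keysInOrder());
    }
}
```

Inserting a, b and c in that order leaves the bucket as c -> b -> a: the most recently added entry is always the first one a lookup visits.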

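When HashMap decides where to store a key-value pair, it ignores the value in the Entry entirely; the storage location of each Entry is computed from the key alone. The value can be thought of as an attachment to the key: once the location of the key is determined, the value is simply stored alongside it.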

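Resizing the array

As the number of elements in a HashMap grows, collisions become more likely and the per-bucket lists get longer, which slows down both reads and writes. To stay efficient, the map has to grow at some critical point, and that point is threshold. Growing is also the most expensive operation in HashMap: the position of every entry in the old array has to be recomputed and the entry moved into the new array. This is what resize does.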
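One consequence of doubling a power-of-two array, not spelled out in the text above, is that an entry at index i can only end up at i or at i + oldCapacity in the new array, because the new mask exposes exactly one more bit of the hash. The sketch below checks this for a few hashes that share a bucket at capacity 16. ResizeIndexDemo and staysOrShifts are hypothetical names, and the code only recomputes indexes; it does not move any entries.

```java
// Illustration only: ResizeIndexDemo and staysOrShifts are hypothetical names.
class ResizeIndexDemo {
    // Same expression as HashMap.indexFor.
    static int indexFor(int hash, int length) {
        return hash & (length - 1);
    }

    // True if, after doubling, the hash maps to its old index or to old index + oldCapacity.
    static boolean staysOrShifts(int hash, int oldCapacity) {
        int oldIndex = indexFor(hash, oldCapacity);
        int newIndex = indexFor(hash, oldCapacity * 2);
        return newIndex == oldIndex || newIndex == oldIndex + oldCapacity;
    }

    public static void main(String[] args) {
        int oldCapacity = 16;
        int[] hashes = {5, 21, 37, 53}; // all map to index 5 when the capacity is 16
        for (int h : hashes) {
            System.out.println("hash=" + h
                    + " old=" + indexFor(h, oldCapacity)
                    + " new=" + indexFor(h, oldCapacity * 2)
                    + " staysOrShifts=" + staysOrShifts(h, oldCapacity));
        }
    }
}
```

Here 5 and 37 stay at index 5, while 21 and 53 move to index 21, so one long list is split across two buckets.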

    void resize(int newCapacity) {
        Entry[] oldTable = table;
        int oldCapacity = oldTable.length;
        // If the old capacity is already at the maximum, set threshold to Integer.MAX_VALUE and return
        if (oldCapacity == MAXIMUM_CAPACITY) {
            threshold = Integer.MAX_VALUE;
            return;
        }

        // Otherwise, allocate a larger array
        Entry[] newTable = new Entry[newCapacity];
        // Rehash every Entry into the new array
        transfer(newTable, initHashSeedAsNeeded(newCapacity));
        table = newTable;
        // Recompute threshold
        threshold = (int)Math.min(newCapacity * loadFactor, MAXIMUM_CAPACITY + 1);
    }

1.2 The get process

    public V get(Object key) {
        // A null key is stored in table[0], so only the list at table[0] needs to be scanned
        if (key == null)
            return getForNullKey();
        // Non-null key: see getEntry below
        Entry<K,V> entry = getEntry(key);

        return null == entry ? null : entry.getValue();
    }

    final Entry<K,V> getEntry(Object key) {
        // An empty map cannot contain the key
        if (size == 0) {
            return null;
        }

        // Compute the hash from the key's hashCode
        int hash = (key == null) ? 0 : hash(key);
        // Locate the array index, then walk the list from its head until the key is found
        for (Entry<K,V> e = table[indexFor(hash, table.length)];
             e != null;
             e = e.next) {
            Object k;
            // If both the hash and the key match, return this entry
            if (e.hash == hash &&
                ((k = e.key) == key || (key != null && key.equals(k))))
                return e;
        }
        return null;
    }

2 Java 7 ConcurrentHashMap

Among the fields of ConcurrentHashMap is an array of segments, final Segment<K,V>[] segments;, where Segment is a nested class of ConcurrentHashMap.

Segment in turn holds an array of HashEntry, transient volatile HashEntry<K,V>[] table, and HashEntry is also a nested class of ConcurrentHashMap.
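The two-level layout (an array of segments, each owning its own table) is what allows different parts of the map to be locked independently. The sketch below is a heavily simplified illustration of that idea and not the JDK implementation: StripedMapSketch is a hypothetical class, each segment here wraps a plain HashMap instead of a HashEntry array, the hash spreading is made up, and get takes the lock for simplicity. The only detail borrowed from the JDK 7 source is that a segment extends ReentrantLock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

// Simplified illustration of per-segment locking; NOT the JDK 7 ConcurrentHashMap code.
class StripedMapSketch<K, V> {
    // Each segment is its own lock and guards its own table.
    private static final class Segment<K, V> extends ReentrantLock {
        private static final long serialVersionUID = 1L;
        final Map<K, V> table = new HashMap<K, V>();
    }

    private final Segment<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedMapSketch(int segmentCount) {
        // A power-of-two count lets segmentFor pick a segment with a bit mask.
        if (segmentCount <= 0 || Integer.bitCount(segmentCount) != 1) {
            throw new IllegalArgumentException("segmentCount must be a power of two");
        }
        segments = (Segment<K, V>[]) new Segment[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            segments[i] = new Segment<K, V>();
        }
    }

    // Picks the segment responsible for a key (hash spreading is an assumption of this sketch).
    private Segment<K, V> segmentFor(Object key) {
        int h = (key == null) ? 0 : key.hashCode();
        h ^= (h >>> 16);
        return segments[h & (segments.length - 1)];
    }

    // Locks only the segment that owns the key, so writes to other segments can proceed.
    V put(K key, V value) {
        Segment<K, V> s = segmentFor(key);
        s.lock();
        try {
            return s.table.put(key, value);
        } finally {
            s.unlock();
        }
    }

    // Locks the owning segment for the read as well, to keep the sketch simple.
    V get(Object key) {
        Segment<K, V> s = segmentFor(key);
        s.lock();
        try {
            return s.table.get(key);
        } finally {
            s.unlock();
        }
    }
}
```

Two threads that write keys belonging to different segments never contend for the same lock, which is the main advantage over guarding the whole map with a single lock.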

When reposting, please cite the source: https://www.heiqu.com/b40da9553d125ecc65b680b112e480e0.html
