转载

Java容器–2–HashMap

hash表

为了达到查找效率接近于O（1），提出了hash算法的概念。

hash算法，核心就是，关键字是K的字，存储到H(K)的位置。

即使存储方法，也是查找方法。

hash函数构造方法

确定性：H（key）直与key有关，同其他无关。

便于计算

满射，可以全部概率映射到hash表。使得hash表

均匀分布最重要。映射的均匀，越好。

H(key) = a*key+b
数值分析法：从key中抽取部分位数来获取。
平方取中法：那关键字的编码，编码平方，取中间数据
折叠法：适用于关键字位数很多
基数转换法， r1 和 r2 把大数从r1 进制换成r2 r1>r2
除数留余法：h(key) = k mod p

hash算法，hash冲突是什么？

一般的hash算法，都是压缩key的范围，所以一定会存在冲突的可能。

hash冲突就是 2个不同的key值，存在H（key）相同的可能性。

为什么有hash冲突，怎么解决？

闭散列方法：在hash表的内部，保存冲突的地址。
- 线性探测法
- 二次探测法
- 伪随机探测法
- 再哈希法
开散列法：在hsah表的外部，保存冲突地址。
- 链地址法
- 公共溢出法

链表为什么是8，转换成红黑树？

如下：根据珀松分布，大于8的时候，重复的概率极低。

laodFactor 为什么是0.75f

负载因子越大，空间利用越多，冲突概率越高，查询效率越低。
负载因子越小，空间利用越小，冲突概率越低，查询效率越高。
空间和时间的一个平衡。
hash冲突和空间利用的一个折中
简单翻译一下就是在理想情况下,使用随机哈希码,节点出现的频率在hash桶中遵循泊松分布，同时给出了桶中元素个数和概率的对照表。
从上面的表中可以看到当桶中元素到达8个的时候，概率已经变得非常小，也就是说用0.75作为加载因子，每个碰撞位置的链表长度超过８个是几乎不可能的。
好了，再深挖就要挖到统计学那边去了，就此打住，重申一下使用hash容器请尽量指定初始容量，且是2的幂次方。
initCap 为什么是16

java中的hash函数是除数留余法

X % 2^n = X & (2^n – 1)

为啥是2的n次方，就是可以把除法优化成位算法，更快。

16是一个经验值。

hashmap实现？

关于负载因子，默认容量等都已经说明了。

看下具体实现

hash函数的实现

static final int hash(Object key) {
        int h;
        return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }

获取key的hashcode，并且和h>>>16

h>>>16的作用是降低hash冲突。

static int indexFor(int h, int length) {
    return h & (length-1);
}

这个是java7里面的代码，就是通过hashcode，和hash桶的size，获取hash值在桶里面的位置。也就是某个key存储的位置。

hash桶

transient Node<K,V>[] table;
static class Node<K,V> implements Map.Entry<K,V> {
        final int hash;
        final K key;
        V value;
        Node<K,V> next;

table就是hash桶，大小是2的指数。

Node就是每个hash桶存储的Node，Node可以是链表的形式。

get

final Node<K,V> getNode(int hash, Object key) {
        Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
        if ((tab = table) != null && (n = tab.length) > 0 &&
            (first = tab[(n - 1) & hash]) != null) {
            if (first.hash == hash && // always check first node
                ((k = first.key) == key || (key != null && key.equals(k))))
                return first;
            if ((e = first.next) != null) {
                if (first instanceof TreeNode)
                    return ((TreeNode<K,V>)first).getTreeNode(hash, key);
                do {
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        return e;
                } while ((e = e.next) != null);
            }
        }
        return null;
    }

逻辑很清晰，

找到hash桶里面，key hash对应的位置。这里先不管，等到put的时候在分析。
check first对不对
first next 是tree还是链表。tree用tree的方式获取，node用node的方式比对。
没有找到，return null

put

/**
     * Implements Map.put and related methods.
     *
     * @param hash hash for key
     * @param key the key
     * @param value the value to put
     * @param onlyIfAbsent if true, don t change existing value
     * @param evict if false, the table is in creation mode.
     * @return previous value, or null if none
     */
    final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
                   boolean evict) {
        Node<K,V>[] tab; Node<K,V> p; int n, i;
        if ((tab = table) == null || (n = tab.length) == 0)
            n = (tab = resize()).length;
        if ((p = tab[i = (n - 1) & hash]) == null)
            tab[i] = newNode(hash, key, value, null);
        else {
            Node<K,V> e; K k;
            if (p.hash == hash &&
                ((k = p.key) == key || (key != null && key.equals(k))))
                e = p;
            else if (p instanceof TreeNode)
                e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
            else {
                for (int binCount = 0; ; ++binCount) {
                    if ((e = p.next) == null) {
                        p.next = newNode(hash, key, value, null);
                        if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
                            treeifyBin(tab, hash);
                        break;
                    }
                    if (e.hash == hash &&
                        ((k = e.key) == key || (key != null && key.equals(k))))
                        break;
                    p = e;
                }
            }
            if (e != null) { // existing mapping for key
                V oldValue = e.value;
                if (!onlyIfAbsent || oldValue == null)
                    e.value = value;
                afterNodeAccess(e);
                return oldValue;
            }
        }
        ++modCount;
        if (++size > threshold)
            resize();
        afterNodeInsertion(evict);
        return null;
    }

i = (n - 1) & hash 很熟悉的代码，这个就是上面java7里面的方式indexof，也就是通过hash值，转为hash桶里面的位置的方法。

还有一点就是，当链表长度超过8 的时候，为了查询效率，改为tree来代替。

原文 http://www.demanmath.com/index.php/2020/07/22/javarongqi-2-hashmap/

正文到此结束