Java: HashSet vs. HashMap

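I have a program that processes huge data sets. Because the program constantly looks up objects in its containers, the objects are best stored in hash-based containers.

My first idea was to use HashMap, since its methods for getting and removing elements fit my needs better.

However, I came to see that HashMap consumes a considerable amount of memory, which is a major concern. So I thought switching to HashSet would be better, because it uses only <E> rather than <K,V> per element. But when I looked at the implementation, I learned that it uses an underlying HashMap! That means it won't save any memory!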

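So here are my questions:

> Are all my assumptions true?

> Is HashMap memory-wasteful? More specifically, what is the overhead of each entry?

> Is HashSet just as wasteful as HashMap?

> Are there any other hash-based containers that would significantly reduce memory consumption?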

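Update

As requested in the comments, I will expand a bit on my program. The HashMap is meant to hold pairs of other objects, together with a numeric value (a float) calculated from them. Along the way the program extracts some of the pairs and enters new ones. Given a pair, it needs to either make sure it does not already hold that pair or remove it. The mapping can be done using the float value or the hashCode of the float object.

Also, when I say "huge data sets", I am talking about ~4 * 10 ^ 9 objects.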
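To make the access pattern in the update concrete, here is a minimal sketch of it. The names Pair, PairStore, addIfAbsent, holds and remove are made up for illustration and do not come from the original program; the sketch assumes the pair itself is used as the key and the float as the value.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical key type: an immutable pair of two objects.
final class Pair {
    final Object first;
    final Object second;

    Pair(Object first, Object second) {
        this.first = first;
        this.second = second;
    }

    // Two pairs are equal when both components are equal.
    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Pair)) return false;
        Pair p = (Pair) o;
        return Objects.equals(first, p.first) && Objects.equals(second, p.second);
    }

    // Must be consistent with equals so hash lookups find equal pairs.
    @Override
    public int hashCode() {
        return Objects.hash(first, second);
    }
}

// Hypothetical container: maps each pair to the float computed from it.
class PairStore {
    private final Map<Pair, Float> values = new HashMap<>();

    // Stores the pair only if it is not held yet; returns true if it was added.
    boolean addIfAbsent(Pair p, float value) {
        return values.putIfAbsent(p, value) == null;
    }

    // Returns true if the pair is currently held.
    boolean holds(Pair p) {
        return values.containsKey(p);
    }

    // Removes the pair; returns true if it was held.
    boolean remove(Pair p) {
        return values.remove(p) != null;
    }
}
```

Every entry in such a map carries the per-entry overhead that the question asks about, on top of the Pair and Float objects themselves.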

There are some very useful tips on collection performance at this site.

HashSet is built on top of a HashMap<T, Object>, where value is a singleton 'present' object. It means that the memory consumption of a HashSet is identical to HashMap: in order to store SIZE values, you need 32 * SIZE + 4 * CAPACITY bytes (plus size of your values). It is definitely not a memory-friendly collection.
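The first sentence of the quote can be illustrated with a simplified sketch. The class name SketchHashSet is made up for illustration, and this is not the actual JDK source; it only mirrors the design of delegating every operation to a HashMap and storing one shared placeholder as the value.

```java
import java.util.HashMap;

// Simplified illustration of a set backed by a map.
// The real java.util.HashSet offers many more methods.
class SketchHashSet<E> {
    // One shared dummy object used as the value for every key.
    private static final Object PRESENT = new Object();

    // The set's elements are the keys of this map.
    private final HashMap<E, Object> map = new HashMap<>();

    // put returns null when the key was not present before.
    boolean add(E e) {
        return map.put(e, PRESENT) == null;
    }

    boolean contains(Object o) {
        return map.containsKey(o);
    }

    // remove returns the old value, which is PRESENT if the key existed.
    boolean remove(Object o) {
        return map.remove(o) == PRESENT;
    }

    int size() {
        return map.size();
    }
}
```

Because each element still becomes a full map entry, the set pays the same per-entry cost as the map, which is why switching from HashMap to HashSet does not save memory.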

THashSet could be the easiest replacement collection for a HashSet: it implements Set and Iterable, which means you should just update a single letter in the initialization of your set.

THashSet uses a single object array for its values, so it uses 4 * CAPACITY bytes for storage. As you can see, compared to JDK HashSet, you will save 32 * SIZE bytes in case of the identical load factor, which is a huge improvement.
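As a rough worked example, the two formulas quoted above can be evaluated side by side. The class and method names here are made up for illustration, the element count and capacity are assumed example values, and both collections are assumed to have the same capacity, as the quote does. The numbers are only as accurate as the quoted formulas.

```java
class MemoryEstimate {
    // Quoted formula for JDK HashSet/HashMap: 32 * SIZE + 4 * CAPACITY bytes.
    static long jdkHashSetBytes(long size, long capacity) {
        return 32L * size + 4L * capacity;
    }

    // Quoted formula for THashSet: 4 * CAPACITY bytes.
    static long tHashSetBytes(long capacity) {
        return 4L * capacity;
    }

    public static void main(String[] args) {
        long size = 1_000_000L;      // assumed example element count
        long capacity = 2_097_152L;  // assumed table capacity (2^21) for both sets

        long jdk = jdkHashSetBytes(size, capacity);
        long trove = tHashSetBytes(capacity);

        System.out.println("JDK HashSet: " + jdk + " bytes");   // 40388608
        System.out.println("THashSet:    " + trove + " bytes"); // 8388608
        System.out.println("Saved:       " + (jdk - trove) + " bytes"); // 32000000
    }
}
```

Applied to the ~4 * 10 ^ 9 objects mentioned in the question, the 32 * SIZE term alone comes to 128 * 10 ^ 9 bytes, before counting the table or the stored values.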

Also, the following picture, which I took from here, can help us keep in mind how to choose the right collection.

Translated from: https://stackoverflow.com/questions/28261457/java-hashset-vs-hashmap

Original post: https://codeday.me/bug/20190112/528283.html
