In this chapter, we will focus on the implementation of a HashSet and a HashMap. 在本章中,我们将重点介绍HashSet和HashMap的实现。
Hashing is a technique used to uniquely identify a specific object from a group of similar objects. 哈希是一种技术,用于从一组相似的对象中唯一识别特定对象。
In Node.js, we can implement a simple HashSet using an array and a hash function. 在Node.js中,我们可以使用数组和哈希函数来实现一个简单的HashSet。
class HashSet {
constructor() {
this.bucket = [];
}
// Hash function
// 哈希函数
hash(value) {
return value % 1000;
}
// Add value to the set
// 向集合中添加值
add(value) {
const index = this.hash(value);
if (!this.bucket[index]) {
this.bucket[index] = [];
}
if (!this.bucket[index].includes(value)) {
this.bucket[index].push(value);
}
}
// Check if value is in the set
// 检查值是否在集合中
has(value) {
const index = this.hash(value);
return this.bucket[index] && this.bucket[index].includes(value);
}
// Remove value from the set
// 从集合中移除值
remove(value) {
const index = this.hash(value);
if (this.bucket[index]) {
const valueIndex = this.bucket[index].indexOf(value);
if (valueIndex !== -1) {
this.bucket[index].splice(valueIndex, 1);
}
}
}
}
// Example Usage
// 示例用法
const set = new HashSet();
set.add(1);
set.add(2);
set.add(2); // Duplicate element
console.log(set.has(1)); // true
console.log(set.has(3)); // false
set.remove(2);
console.log(set.has(2)); // false
In Python, we can implement a simple HashSet using a list and a hash function. 在Python中,我们可以使用列表和哈希函数来实现一个简单的HashSet。
class HashSet:
def __init__(self):
self.bucket = [[] for _ in range(1000)]
# Hash function
# 哈希函数
def hash(self, value):
return value % 1000
# Add value to the set
# 向集合中添加值
def add(self, value):
index = self.hash(value)
if value not in self.bucket[index]:
self.bucket[index].append(value)
# Check if value is in the set
# 检查值是否在集合中
def contains(self, value):
index = self.hash(value)
return value in self.bucket[index]
# Remove value from the set
# 从集合中移除值
def remove(self, value):
index = self.hash(value)
if value in self.bucket[index]:
self.bucket[index].remove(value)
# Example Usage
# 示例用法
set = HashSet()
set.add(1)
set.add(2)
set.add(2) # Duplicate element
print(set.contains(1)) # True
print(set.contains(3)) # False
set.remove(2)
print(set.contains(2)) # False
In Node.js, we can implement a simple HashMap using an array of objects and a hash function. 在Node.js中,我们可以使用对象数组和哈希函数来实现一个简单的HashMap。
class HashMap {
constructor() {
this.bucket = [];
}
// Hash function
// 哈希函数
hash(key) {
let hash = 0;
for (let char of key.toString()) {
hash = (hash << 5) - hash + char.charCodeAt(0);
hash |= 0;
}
return hash % 1000;
}
// Set key-value pair
// 设置键值对
set(key, value) {
const index = this.hash(key);
if (!this.bucket[index]) {
this.bucket[index] = [];
}
for (let pair of this.bucket[index]) {
if (pair[0] === key) {
pair[1] = value;
return;
}
}
this.bucket[index].push([key, value]);
}
// Get value by key
// 通过键获取值
get(key) {
const index = this.hash(key);
if (this.bucket[index]) {
for (let pair of this.bucket[index]) {
if (pair[0] === key) {
return pair[1];
}
}
}
return undefined;
}
// Remove key-value pair
// 移除键值对
delete(key) {
const index = this.hash(key);
if (this.bucket[index]) {
for (let i = 0; i < this.bucket[index].length; i++) {
if (this.bucket[index][i][0] === key) {
this.bucket[index].splice(i, 1);
return;
}
}
}
}
}
// Example Usage
// 示例用法
const map = new HashMap();
map.set('a', 1);
map.set('b', 2);
map.set('a', 3); // Update value for key 'a'
console.log(map.get('a')); // 3
console.log(map.get('c')); // undefined
map.delete('b');
console.log(map.get('b')); // undefined
In Python, we can implement a simple HashMap using a list of lists and a hash function. 在Python中,我们可以使用列表和哈希函数来实现一个简单的HashMap。
class HashMap:
def __init__(self):
self.bucket = [[] for _ in range(1000)]
# Hash function
# 哈希函数
def hash(self, key):
hash = 0
for char in str(key):
hash = (hash << 5) - hash + ord(char)
hash &= 0xFFFFFFFF
return hash % 1000
# Set key-value pair
# 设置键值对
def set(self, key, value):
index = self.hash(key)
for pair in self.bucket[index]:
if pair[0] == key:
pair[1] = value
return
self.bucket[index].append([key, value])
# Get value by key
# 通过键获取值
def get(self, key):
index = self.hash(key)
for pair in self.bucket[index]:
if pair[0] == key:
return pair[1]
return None
# Remove key-value pair
# 移除键值对
def remove(self, key):
index = self.hash(key)
for i, pair in enumerate(self.bucket[index]):
if pair[0] == key:
del self.bucket[index][i]
return
# Example Usage
# 示例用法
map = HashMap()
map.set('a', 1)
map.set('b', 2)
map.set('a', 3) # Update value for key 'a'
print(map.get('a')) # 3
print(map.get('c')) # None
map.remove('b')
print(map.get('b')) # None
Operation | HashSet | HashMap |
---|---|---|
Insert | O(1) | O(1) |
Delete | O(1) | O(1) |
Search | O(1) | O(1) |
操作 | HashSet | HashMap |
---|---|---|
插入 | O(1) | O(1) |
删除 | O(1) | O(1) |
搜索 | O(1) | O(1) |
-
Handling Collisions: In a hash map, collisions occur when two keys hash to the same index. Common strategies include chaining (using a linked list) and open addressing (finding another open slot).
-
处理冲突: 在哈希映射中,当两个键哈希到相同的索引时,会发生冲突。常见策略包括链地址法(使用链表)和开放寻址法(找到另一个空槽)。
-
Load Factor: Maintain an optimal load factor to ensure efficient performance. If the load factor exceeds a threshold, resizing the hash table may be necessary.
-
装载因子: 保持最佳装载因子以确保高效性能。如果装载因子超过阈值,可能需要调整哈希表的大小。
By understanding the implementation of HashSet and HashMap, you can efficiently handle problems involving uniqueness, counts, and frequency. 通过理解HashSet和HashMap的实现,你可以高效地处理涉及唯一性、计数和频率的问题。
The expression value % 1000
in the context of hashing functions is used to map a potentially large range of input values into a smaller range of indices suitable for array-based storage structures, such as the buckets in a hash table.
-
Bucket Limitation: The number 1000 represents the number of buckets in the hash table. This is often a chosen size for the hash table to store values efficiently. 数字1000表示哈希表中的桶数量。这通常是为高效存储值而选择的哈希表大小。
-
Range Reduction: Applying the modulus operator
%
reduces the range of hash values to between 0 and 999. This ensures that the hash values fit within the bounds of the array (or list) used to store the buckets. 应用取模运算符%
将哈希值的范围减少到 0 到 999 之间。这确保哈希值在用于存储桶的数组(或列表)的范围内。 -
Uniform Distribution: A good hash function aims to distribute values uniformly across the available buckets to minimize collisions. Using
% 1000
helps achieve this by spreading values over a fixed range. 一个好的哈希函数旨在将值均匀地分布在可用桶上,以尽量减少冲突。使用% 1000
有助于通过固定范围分布值来实现这一目标。
Suppose you have a large integer value, and you want to store it in a hash table with 1000 buckets. 假设你有一个大整数值,并且想将其存储在具有1000个桶的哈希表中。
class HashSet {
constructor() {
this.bucket = new Array(1000).fill(null).map(() => []);
}
// Hash function
// 哈希函数
hash(value) {
return value % 1000;
}
// Add value to the set
// 向集合中添加值
add(value) {
const index = this.hash(value);
if (!this.bucket[index].includes(value)) {
this.bucket[index].push(value);
}
}
// Check if value is in the set
// 检查值是否在集合中
has(value) {
const index = this.hash(value);
return this.bucket[index].includes(value);
}
// Remove value from the set
// 从集合中移除值
remove(value) {
const index = this.hash(value);
const bucket = this.bucket[index];
const valueIndex = bucket.indexOf(value);
if (valueIndex !== -1) {
bucket.splice(valueIndex, 1);
}
}
}
// Example Usage
// 示例用法
const set = new HashSet();
set.add(1001); // Will be stored in bucket index 1
set.add(2002); // Will be stored in bucket index 2
console.log(set.has(1001)); // true
console.log(set.has(3003)); // false
set.remove(2002);
console.log(set.has(2002)); // false
class HashSet:
def __init__(self):
self.bucket = [[] for _ in range(1000)]
# Hash function
# 哈希函数
def hash(self, value):
return value % 1000
# Add value to the set
# 向集合中添加值
def add(self, value):
index = self.hash(value)
if value not in self.bucket[index]:
self.bucket[index].append(value)
# Check if value is in the set
# 检查值是否在集合中
def contains(self, value):
index = self.hash(value)
return value in self.bucket[index]
# Remove value from the set
# 从集合中移除值
def remove(self, value):
index = self.hash(value)
if value in self.bucket[index]:
self.bucket[index].remove(value)
# Example Usage
# 示例用法
set = HashSet()
set.add(1001) # Will be stored in bucket index 1
set.add(2002) # Will be stored in bucket index 2
print(set.contains(1001)) # True
print(set.contains(3003)) # False
set.remove(2002)
print(set.contains(2002)) # False
-
Bucket Size: The choice of 1000 is arbitrary and can be adjusted based on the expected number of elements and desired load factor. 桶的大小: 1000的选择是任意的,可以根据预期的元素数量和所需的装载因子进行调整。
-
Load Factor: A higher number of buckets generally means fewer collisions, but it also means more memory usage. 装载因子: 更多的桶通常意味着更少的冲突,但也意味着更多的内存使用。
-
Collisions: Proper handling of collisions is crucial for maintaining the efficiency of hash-based structures. 冲突: 正确处理冲突对于保持基于哈希结构的效率至关重要。
By using the modulus operator to constrain the hash values within the bounds of the array, we ensure that all values can be appropriately indexed and stored. 通过使用取模运算符将哈希值限制在数组的范围内,我们确保所有值都可以适当地索引和存储。