Hadoop's default partitioner, HashPartitioner: how does it calculate the hash code of a key?

Posted by 3okqufwl on 2021-06-01 in Hadoop


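I am trying to understand partitioning in MapReduce. I know that Hadoop has a default partitioner called HashPartitioner, and that the partitioner decides which reducer a given key is sent to.

Conceptually, it works like this: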

hashcode(key) % NumberOfReducers, where `key` is the key in <key,value> pair.

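My question is:

How does HashPartitioner compute the hash code of a key? Does it simply call the key's hashCode(), or does it use some other logic to compute the hash code?
Can someone help me understand this?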

hadoop mapreduce hadoop2 reducers HashCode

Source: https://stackoverflow.com/questions/49180132/hadoops-default-partitioner-hashpartitioner-how-it-calculates-hash-code-of-a

1 Answer


pu82cl6c1#

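The default partitioner simply calls the key's hashCode() method and computes the partition from the result. This gives you the opportunity to implement your own hashCode() to adjust how keys are partitioned.
From the Javadoc: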

public int getPartition(K key,
               V value,
               int numReduceTasks)
Use Object.hashCode() to partition.
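
Since the partition is derived from hashCode(), a key class can steer partitioning by overriding that method. The following is a minimal sketch in plain Java, not Hadoop code: EmployeeKey and its fields are hypothetical, the partition helper assumes hash-modulo arithmetic with the sign bit masked off, and a real Hadoop key would also need to implement WritableComparable, which is omitted here.

```java
import java.util.Objects;

public class DeptKeyDemo {
    // Hypothetical composite key. hashCode() looks only at the department,
    // so every employee of one department hashes to the same value.
    static final class EmployeeKey {
        final String department;
        final String name;

        EmployeeKey(String department, String name) {
            this.department = department;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof EmployeeKey)) return false;
            EmployeeKey other = (EmployeeKey) o;
            return Objects.equals(department, other.department)
                    && Objects.equals(name, other.name);
        }

        @Override
        public int hashCode() {
            // Equal keys still produce equal hashes, so the contract holds.
            return department.hashCode();
        }
    }

    // Assumed hash-modulo arithmetic, written as a standalone helper.
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int a = partition(new EmployeeKey("IT", "alice"), 5);
        int b = partition(new EmployeeKey("IT", "bob"), 5);
        System.out.println(a == b); // true: same department, same partition
    }
}
```

With this hashCode(), the name field still distinguishes keys for equality and grouping, but it no longer influences which reducer receives them.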

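As for the actual code, it simply returns (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks; the & Integer.MAX_VALUE clears the sign bit, so a negative hash code cannot produce a negative partition number: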

public int getPartition(K key, V value,
                        int numReduceTasks) {
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
}
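
To see why the mask matters, here is a small standalone sketch that compares the expression above with a naive version that omits the mask. It uses java.lang.Integer as a stand-in key (its hashCode() is the value itself); the class and method names are made up for illustration and are not part of Hadoop.

```java
public class HashPartitionDemo {
    // Same arithmetic as the getPartition body shown above.
    static int partition(Object key, int numReduceTasks) {
        return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
    }

    // Naive variant without the mask: Java's % keeps the sign of the
    // dividend, so a negative hash code yields a negative result.
    static int naivePartition(Object key, int numReduceTasks) {
        return key.hashCode() % numReduceTasks;
    }

    public static void main(String[] args) {
        Integer key = -7; // Integer.hashCode() returns the int value
        System.out.println(naivePartition(key, 4)); // -3, not a valid partition
        System.out.println(partition(key, 4));      // 1
    }
}
```

The masked result always falls in the range 0 to numReduceTasks - 1, which is what a partition index has to satisfy.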

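Edit: adding details on custom partitioners
You can give a partitioner entirely different logic, which need not use hashCode() at all.
A custom partitioner is written by extending Partitioner:
public class CustomPartitioner extends Partitioner<Text, Text>

One such example, which partitions on the department name carried in the key: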

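import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Partitioner;

// Routes records by department. Text has no getDepartment() method, so this
// version assumes the key's text is the department name itself.
public class CustomPartitioner extends Partitioner<Text, Text> {
    @Override
    public int getPartition(Text key, Text value, int numReduceTasks) {
        String dept = key.toString();

        // With no reducers there is only one possible answer.
        if (numReduceTasks == 0) {
            return 0;
        }

        if (dept.equals("IT")) {
            return 0;
        } else if (dept.equals("Admin")) {
            // The modulo keeps the index valid when there are fewer reducers.
            return 1 % numReduceTasks;
        } else {
            return 2 % numReduceTasks;
        }
    }
}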
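
To get a feel for how the modulo guards in that example behave, the sketch below replays the same branching in plain Java, with a String standing in for the Text key so that it runs without Hadoop on the classpath. The class and method names are hypothetical.

```java
public class DeptRoutingDemo {
    // Same branching as the custom partitioner above, on a plain String.
    static int route(String dept, int numReduceTasks) {
        if (numReduceTasks == 0) {
            return 0;
        }
        if (dept.equals("IT")) {
            return 0;
        } else if (dept.equals("Admin")) {
            return 1 % numReduceTasks;
        } else {
            return 2 % numReduceTasks;
        }
    }

    public static void main(String[] args) {
        // Show where each department lands for 1, 2 and 3 reducers.
        for (int n : new int[] {1, 2, 3}) {
            System.out.println(n + " reducers: IT=" + route("IT", n)
                    + " Admin=" + route("Admin", n)
                    + " other=" + route("HR", n));
        }
    }
}
```

With three reducers each group gets its own partition; with two, the "other" group shares partition 0 with IT; with one, everything lands in partition 0. In a real job the partitioner class would be registered on the Job with setPartitionerClass, and the reducer count set with setNumReduceTasks.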

Posted 2021-06-01
