MapReduce Serialization (Part 2)

2023-05-12 11:23:05

3. Serializing Data with Writable

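In MapReduce, data is usually serialized with Writable types. In the Mapper, the user parses the input data into key-value pairs and emits them as Writable objects, either built-in types such as Text and IntWritable or custom ones. In the Reducer, the user turns the Writable values grouped under each key into output key-value pairs. Here is a simple word-count example: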


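import java.io.IOException;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// MyMapper and MyReducer are written as static nested classes of the job's driver class.
public static class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {

    // Reused for every record to avoid allocating new Writable objects per word.
    private Text word = new Text();
    private IntWritable count = new IntWritable(1);

    @Override
    public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        String line = value.toString();
        // Split on runs of whitespace so repeated spaces do not yield empty words.
        String[] words = line.trim().split("\\s+");
        for (String w : words) {
            if (w.isEmpty()) {
                continue; // a blank input line produces one empty token
            }
            word.set(w);
            context.write(word, count);
        }
    }
}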

public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {

    // Reused output value; set() overwrites it for each key.
    private IntWritable result = new IntWritable();

    @Override
    public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
        int sum = 0;
        // Accumulate every count emitted for this word.
        for (IntWritable val : values) {
            sum += val.get();
        }
        result.set(sum);
        context.write(key, result);
    }
}

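In this example, MyMapper wraps each word in a Text object, pairs it with an IntWritable set to 1, and writes the pair to the Context. MyReducer receives a Text key together with an Iterable<IntWritable> holding every count emitted for that key, sums the counts, and writes the total as the output key-value pair.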
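The example above uses only built-in types, so here is a minimal sketch of what a custom Writable might look like and how it round-trips through bytes. It rests on some assumptions: the `Writable` interface below is a local stand-in that mirrors the two methods of `org.apache.hadoop.io.Writable`, so the code runs without Hadoop on the classpath, and `WordCountWritable` and `WritableRoundTrip` are hypothetical names chosen for illustration. In a real job you would implement the Hadoop interface instead, and a type used as a key would also need to implement `WritableComparable`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Local stand-in mirroring the two methods of org.apache.hadoop.io.Writable,
// so this sketch compiles without the Hadoop libraries.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical value type: a word together with its count.
class WordCountWritable implements Writable {
    private String word = "";
    private int count;

    // No-argument constructor: an empty instance is created first,
    // then populated through readFields().
    WordCountWritable() { }

    WordCountWritable(String word, int count) {
        this.word = word;
        this.count = count;
    }

    String getWord() { return word; }
    int getCount() { return count; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(word);
        out.writeInt(count);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in the same order they were written.
        word = in.readUTF();
        count = in.readInt();
    }
}

public class WritableRoundTrip {
    // Serialize any Writable into a byte array.
    static byte[] serialize(Writable w) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            w.write(out);
        }
        return bytes.toByteArray();
    }

    // Rebuild a WordCountWritable from bytes produced by serialize().
    static WordCountWritable deserialize(byte[] data) throws IOException {
        WordCountWritable w = new WordCountWritable();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            w.readFields(in);
        }
        return w;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = serialize(new WordCountWritable("hadoop", 3));
        WordCountWritable copy = deserialize(data);
        System.out.println(copy.getWord() + "\t" + copy.getCount());
    }
}
```

The point of the sketch is the symmetry between `write` and `readFields`: the object carries no field names or type tags in its serialized form, so the reader has to consume the fields in exactly the order the writer produced them.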
