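3. Serializing Data with Writable
In MapReduce, data is usually serialized through Writable types. In the Mapper, the user parses the input into key-value pairs and wraps each key and value in a Writable object, either a built-in type such as Text and IntWritable or a custom one. In the Reducer, the user turns the incoming Writable objects into output key-value pairs. Here is a simple example: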
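import java.io.IOException;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;

// Enclosing class so the nested classes compile; the name WordCount is illustrative.
public class WordCount {
    public static class MyMapper extends Mapper<LongWritable, Text, Text, IntWritable> {
        // Reused across map() calls instead of allocating new objects per record.
        private final Text word = new Text();
        private final IntWritable count = new IntWritable(1);

        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String line = value.toString();
            String[] words = line.split("\\s+");
            for (String w : words) {
                if (w.isEmpty()) continue; // skip the empty token produced by leading whitespace
                word.set(w);
                context.write(word, count); // emit (word, 1)
            }
        }
    }

    public static class MyReducer extends Reducer<Text, IntWritable, Text, IntWritable> {
        private final IntWritable result = new IntWritable();

        @Override
        public void reduce(Text key, Iterable<IntWritable> values, Context context) throws IOException, InterruptedException {
            int sum = 0;
            for (IntWritable val : values) {
                sum += val.get(); // accumulate every count emitted for this key
            }
            result.set(sum);
            context.write(key, result); // emit (word, total)
        }
    }
}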
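In this example, MyMapper stores each word in a Text object, pairs it with an IntWritable holding 1, and writes the pair to the Context. MyReducer receives each Text key together with an Iterable<IntWritable> of the values emitted for that key, combines those values into a single IntWritable result, and writes the key and result as the output key-value pair.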
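The example above uses only the built-in Text and IntWritable types. For a custom type, the usual pattern is to implement the two methods of Hadoop's Writable interface, write(DataOutput) and readFields(DataInput), and to keep a no-argument constructor so an empty instance can be created and then filled from the byte stream. The sketch below illustrates that round trip under some assumptions: the Writable interface is a local stand-in with the same two method signatures as org.apache.hadoop.io.Writable, declared here only so the code runs without Hadoop on the classpath, and WordCountPair and WritableSketch are hypothetical names that do not come from the original example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Local stand-in (assumption): same two methods as org.apache.hadoop.io.Writable.
// In a real job, import the Hadoop interface instead of declaring this one.
interface Writable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical custom value type: a word together with its count.
class WordCountPair implements Writable {
    private String word = "";
    private int count;

    // No-arg constructor: lets a caller create an empty instance and fill it via readFields.
    WordCountPair() {}

    WordCountPair(String word, int count) {
        this.word = word;
        this.count = count;
    }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeUTF(word);   // 2-byte length prefix followed by the encoded characters
        out.writeInt(count);  // 4 bytes
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        // Fields must be read in exactly the order they were written.
        word = in.readUTF();
        count = in.readInt();
    }

    String getWord() { return word; }
    int getCount() { return count; }
}

class WritableSketch {
    // Serialize any Writable into a byte array.
    static byte[] serialize(Writable w) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            w.write(out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Overwrite the fields of an existing instance from serialized bytes.
    static void deserialize(Writable target, byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            target.readFields(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        WordCountPair original = new WordCountPair("hadoop", 3);
        byte[] bytes = serialize(original);
        WordCountPair copy = new WordCountPair();
        deserialize(copy, bytes);
        System.out.println(copy.getWord() + " " + copy.getCount()); // prints "hadoop 3"
    }
}
```

To use such a type as a value in a job, it would replace IntWritable in the Mapper and Reducer type parameters. A type used as a key would additionally need to be comparable (Hadoop's WritableComparable), which this sketch does not cover.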