Hadoop Framework: MapReduce Fundamentals and an Introductory Case (Part 2)


Hadoop serialization interfaces: Writable provides the serialization mechanism, while Comparable governs the sort order of keys.
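The Writable contract boils down to a pair of methods, write(DataOutput) and readFields(DataInput), where fields must be read back in exactly the order they were written. The pure-JDK sketch below mimics that round trip without any Hadoop dependency; the PairWritable class and roundTrip helper are stand-ins invented for illustration:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class WritableRoundTrip {

    // Stand-in with the same shape as Hadoop's Writable contract.
    static class PairWritable {
        long a;
        long b;

        void write(DataOutput out) throws IOException {
            out.writeLong(a);
            out.writeLong(b);
        }

        void readFields(DataInput in) throws IOException {
            // Fields must be read back in exactly the write order
            this.a = in.readLong();
            this.b = in.readLong();
        }
    }

    // Serialize a pair to bytes, deserialize into a fresh instance, return "a,b".
    static String roundTrip(long a, long b) {
        try {
            PairWritable src = new PairWritable();
            src.a = a;
            src.b = b;
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            src.write(new DataOutputStream(bytes));
            PairWritable dst = new PairWritable();
            dst.readFields(new DataInputStream(
                    new ByteArrayInputStream(bytes.toByteArray())));
            return dst.a + "," + dst.b;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(10, 20)); // prints 10,20
    }
}
```

Swapping the two readLong calls in readFields would silently exchange the fields, which is why the entity class below is careful to keep the read order identical to the write order.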
2. Case implementation. Case description: read a file and, for lines sharing the same line id, accumulate the numeric columns and output the computed result. This case runs locally, without uploading the Jar to the Hadoop server; the driver configuration is the same as in the previous part.
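Before the MapReduce code itself, the computation can be sanity-checked in plain Java. The sketch below assumes the input format implied by the case: each line is `lineId,num1,num2`, and lines sharing the same lineId have their two numbers summed per column (the sample lines are made up for illustration):

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LocalAddSimulation {

    // Sums num1 and num2 per line id, mirroring what the MapReduce job computes.
    public static Map<String, long[]> aggregate(String[] lines) {
        Map<String, long[]> sums = new LinkedHashMap<>();
        for (String line : lines) {
            String[] parts = line.split(",");              // lineId,num1,num2
            long[] acc = sums.computeIfAbsent(parts[0], k -> new long[2]);
            acc[0] += Long.parseLong(parts[1]);
            acc[1] += Long.parseLong(parts[2]);
        }
        return sums;
    }

    public static void main(String[] args) {
        String[] sample = {"1,10,20", "2,5,5", "1,1,2"};
        aggregate(sample).forEach((k, v) ->
                System.out.println(k + "\t" + v[0] + "," + v[1] + "," + (v[0] + v[1])));
    }
}
```

For the sample input, key "1" accumulates to 11 and 22 (total 33), and key "2" stays at 5 and 5 (total 10), which is exactly what the Mapper/Reducer pair below produces at scale.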
Entity object properties
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;
import org.apache.hadoop.io.Writable;

public class AddEntity implements Writable {

    private long addNum01;
    private long addNum02;
    private long resNum;

    // No-arg constructor, required by Hadoop for deserialization
    public AddEntity() {
        super();
    }

    public AddEntity(long addNum01, long addNum02) {
        super();
        this.addNum01 = addNum01;
        this.addNum02 = addNum02;
        this.resNum = addNum01 + addNum02;
    }

    // Serialization
    @Override
    public void write(DataOutput dataOutput) throws IOException {
        dataOutput.writeLong(addNum01);
        dataOutput.writeLong(addNum02);
        dataOutput.writeLong(resNum);
    }

    // Deserialization
    @Override
    public void readFields(DataInput dataInput) throws IOException {
        // Note: fields must be read in the same order they were written
        this.addNum01 = dataInput.readLong();
        this.addNum02 = dataInput.readLong();
        this.resNum = dataInput.readLong();
    }

    // Getters and setters omitted
}

Mapper mechanism
import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AddMapper extends Mapper<LongWritable, Text, Text, AddEntity> {

    Text myKey = new Text();

    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Read one line
        String line = value.toString();
        // Split the line on commas
        String[] lineArr = line.split(",");
        // Parse the fields
        String lineNum = lineArr[0];
        long addNum01 = Long.parseLong(lineArr[1]);
        long addNum02 = Long.parseLong(lineArr[2]);
        myKey.set(lineNum);
        AddEntity myValue = new AddEntity(addNum01, addNum02);
        // Emit the key/value pair
        context.write(myKey, myValue);
    }
}

Reducer mechanism
import java.io.IOException;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

public class AddReducer extends Reducer<Text, AddEntity, Text, AddEntity> {

    @Override
    protected void reduce(Text key, Iterable<AddEntity> values, Context context)
            throws IOException, InterruptedException {
        long addNum01Sum = 0;
        long addNum02Sum = 0;
        // Accumulate all values that share the same key
        for (AddEntity addEntity : values) {
            addNum01Sum += addEntity.getAddNum01();
            addNum02Sum += addEntity.getAddNum02();
        }
        // Emit the aggregated result
        AddEntity addRes = new AddEntity(addNum01Sum, addNum02Sum);
        context.write(key, addRes);
    }
}

Final result of the case:
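The driver is unchanged from the previous part; for completeness, a typical local-mode driver looks roughly like the sketch below. The class name and the input/output paths are hypothetical, and "local execution" here simply means running this main method from the IDE instead of submitting a Jar:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class AddDriver {

    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf);

        job.setJarByClass(AddDriver.class);
        job.setMapperClass(AddMapper.class);
        job.setReducerClass(AddReducer.class);

        // Mapper output types
        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(AddEntity.class);
        // Final output types
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(AddEntity.class);

        // Hypothetical local paths; adjust to your environment
        FileInputFormat.setInputPaths(job, new Path("file:///tmp/add-input"));
        FileOutputFormat.setOutputPath(job, new Path("file:///tmp/add-output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
```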

[Figure: screenshot of the case's final output]


